==================
SQLAlchemy Support
==================

Since most people prefer to use an ORM, the `crate` client library provides
support for SQLAlchemy out of the box.

The Dialect is installed and registered by default (e.g. if `crate` was
installed using pip) and can be used without further configuration.

Connection String
=================

In SQLAlchemy a connection is established using the `create_engine` function.
This function takes a connection string that varies from database to database.

In order to connect to a Crate cluster, the following connection strings are
valid::

    >>> sa.create_engine('crate://')
    Engine(crate://)

This connects to the default server (`127.0.0.1:4200`). To connect to a
different server, use the following syntax::

    >>> sa.create_engine('crate://otherserver:4200')
    Engine(crate://otherserver:4200)
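
As a rough illustration of how such a connection string breaks down into a
host and a port, the following sketch uses only the standard library. The
`server_from_url` helper and the fallback to port 4200 when no port is given
are assumptions made for this example; they are not part of the `crate`
package, and the dialect's own parsing may differ::

    from urllib.parse import urlsplit

    # Default server as named in the text above.
    DEFAULT_SERVER = '127.0.0.1:4200'


    def server_from_url(url):
        """Return 'host:port' for a crate:// URL (hypothetical helper)."""
        parts = urlsplit(url)
        if parts.scheme != 'crate':
            raise ValueError('expected a crate:// URL, got %r' % (url,))
        if not parts.hostname:
            # 'crate://' carries no host, so fall back to the default server.
            return DEFAULT_SERVER
        # Assumption: a missing port means the default port 4200.
        port = parts.port or 4200
        return '%s:%d' % (parts.hostname, port)

With this sketch, `crate://` resolves to the default server and
`crate://otherserver:4200` resolves to `otherserver:4200`.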

Since Crate is a clustered database that usually consists of multiple servers,
it is recommended to connect to all of them. This enables the DB-API layer to
use round-robin to distribute the load and skip a server if it becomes
unavailable.

The `connect_args` parameter has to be used to do so::

    >>> sa.create_engine('crate://', connect_args={
    ...     'servers': ['host1:4200', 'host2:4200']
    ... })
    Engine(crate://)
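
The idea of round-robin selection with skipping of unavailable servers can be
sketched in a few lines of plain Python. This is only an illustration of the
technique: the `RoundRobinServers` class and its method names are hypothetical
and do not reflect how the DB-API layer is actually implemented::

    class RoundRobinServers(object):
        """Cycle through servers in order, skipping unavailable ones."""

        def __init__(self, servers):
            self.servers = list(servers)
            self.inactive = set()
            self._next = 0  # index of the server to try next

        def mark_unavailable(self, server):
            self.inactive.add(server)

        def mark_available(self, server):
            self.inactive.discard(server)

        def next_server(self):
            # Try every server at most once per call, continuing from the
            # position after the previous pick.
            for _ in range(len(self.servers)):
                server = self.servers[self._next]
                self._next = (self._next + 1) % len(self.servers)
                if server not in self.inactive:
                    return server
            raise RuntimeError('no server available')

Successive calls to `next_server` alternate between `host1:4200` and
`host2:4200`; once one of them is marked unavailable, every call returns the
remaining one.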


Supported Operations and Limitations
====================================

Currently Crate only implements a subset of the `SQL` standard. Therefore many
operations that SQLAlchemy provides are currently not supported.

An overview of supported operations:

- Simple select statements, including filter operations, limit and offset,
  group by and order by, but without joins or subselects.
- Inserting new rows.
- Updating rows.

Unlike other databases that are usually used with SQLAlchemy, Crate isn't an
RDBMS but is instead document oriented with `eventual consistency
<http://en.wikipedia.org/wiki/Eventual_consistency>`_.

This means that there is no transaction support and the database should be
modeled without relationships.

In SQLAlchemy a `Session` object is used to query the database. This `Session`
object contains a `rollback` method that won't do anything at all in Crate.
Similarly, the `commit` method will only `flush` but not actually commit, as
there is no commit in Crate.

Please refer to the `SQLAlchemy documentation
<http://docs.sqlalchemy.org/en/rel_0_8/orm/tutorial.html#adding-new-objects>`_
for more information about the session management and the concept of `flush`.

Complex Types
=============

In a document oriented database it is a common pattern to store complex objects
within a single field. For such cases the `crate` package provides the `Object`
type. The `Object` type can be seen as a dictionary or map-like type.

Below is a schema definition using `SQLAlchemy's declarative approach
<http://docs.sqlalchemy.org/en/rel_0_8/orm/extensions/declarative.html>`_::

    >>> from crate.client.sqlalchemy.types import Object, ObjectArray
    >>> from uuid import uuid4

    >>> def gen_key():
    ...     return str(uuid4())

    >>> class Character(Base):
    ...     __tablename__ = 'characters'
    ...     id = sa.Column(sa.String, primary_key=True, default=gen_key)
    ...     name = sa.Column(sa.String)
    ...     details = sa.Column(Object)
    ...     more_details = sa.Column(ObjectArray)

Using the `Session
<http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html?highlight=session#>`_
two characters are added that have additional attributes inside the `details`
column that weren't defined in the schema::

    >>> arthur = Character(name='Arthur Dent')
    >>> arthur.details = {}
    >>> arthur.details['gender'] = 'male'
    >>> arthur.details['species'] = 'human'
    >>> session.add(arthur)

    >>> trillian = Character(name='Tricia McMillan')
    >>> trillian.details = {}
    >>> trillian.details['gender'] = 'female'
    >>> trillian.details['species'] = 'human'
    >>> trillian.details['female_only_attribute'] = 1
    >>> session.add(trillian)
    >>> session.commit()

After `INSERT` statements are sent to the database the newly inserted rows
aren't immediately available for search because the index is only updated
periodically::

    >>> from time import sleep
    >>> sleep(2)

.. note::

    Newly inserted rows can still be queried immediately if a lookup by the
    primary key is done.
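
Instead of sleeping for a fixed amount of time, a test or script can poll
until the rows become visible. The following is a minimal sketch of that
approach; `wait_until` and its `timeout` and `interval` parameters are
hypothetical names chosen for this example and are not provided by the
`crate` package::

    import time


    def wait_until(predicate, timeout=5.0, interval=0.1):
        """Call ``predicate`` until it is truthy or ``timeout`` has passed.

        Returns True if the predicate succeeded, False on timeout.
        """
        deadline = time.monotonic() + timeout
        while True:
            if predicate():
                return True
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)

    # Possible use with the session from this document (not executed here):
    #
    #   wait_until(lambda: session.query(
    #       sa.func.count(Character.id)).scalar() == 2)

Polling keeps the wait as short as the refresh actually takes, at the cost of
issuing a few extra queries.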

A regular select query will then fetch the whole documents::

    >>> query = session.query(Character).order_by(Character.name)
    >>> [(c.name, c.details['gender']) for c in query]
    [('Arthur Dent', 'male'), ('Tricia McMillan', 'female')]

But it is also possible to just select a part of the document, even inside the
`Object` type::

    >>> sorted(session.query(Character.details['gender']).all())
    [('female',), ('male',)]

In addition, filtering on the attributes inside the `details` column is also
possible::

    >>> query = session.query(Character.name)
    >>> query.filter(Character.details['gender'] == 'male').all()
    [('Arthur Dent',)]

Updating Complex Types
----------------------

The SQLAlchemy Crate dialect supports change tracking deep down the nested
levels of an `Object` type field. For example, the following query will only
update the `gender` key. The `species` key, which is on the same level, will be
left untouched::

    >>> char = session.query(Character).filter_by(name='Arthur Dent').one()
    >>> char.details['gender'] = 'manly man'
    >>> session.commit()
    >>> session.refresh(char)

    >>> char.details['gender']
    u'manly man'
    
    >>> char.details['species']
    u'human'
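
To give an intuition for what change tracking means here, the sketch below
records which keys of a dictionary were assigned and builds an UPDATE that
touches only those keys. It is a simplified illustration under several
assumptions: `TrackedDict` and `partial_update` are hypothetical names, only
top-level keys assigned with `d[key] = value` are tracked, and the statement
text is illustrative rather than the exact SQL the dialect emits::

    class TrackedDict(dict):
        """Dictionary that remembers which keys were assigned after creation."""

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # Initial contents are not counted as changes.
            self.changed_keys = set()

        def __setitem__(self, key, value):
            super().__setitem__(key, value)
            self.changed_keys.add(key)


    def partial_update(table, column, tracked):
        """Return (sql, params) assigning only the changed keys.

        The WHERE clause is omitted for brevity.
        """
        keys = sorted(tracked.changed_keys)
        assignments = ', '.join(
            "%s['%s'] = ?" % (column, key) for key in keys)
        params = tuple(tracked[key] for key in keys)
        return 'UPDATE %s SET %s' % (table, assignments), params

Assigning a new value to the `gender` key of such a dictionary and passing it
to `partial_update` yields a statement that sets only `details['gender']`,
which is why `species` keeps its value.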

Object Array
------------

In addition to the `Object` type, the Crate SQLAlchemy dialect also includes a
type called `ObjectArray`. This type maps to a Python list of dictionaries.

Note that, unlike the `Object` type, the `ObjectArray` type does not track
changes to individual elements. Therefore the generated UPDATE statement will
affect the whole list::

    >>> char.more_details = [{'foo': 1, 'bar': 10}, {'foo': 2}]
    >>> session.commit()

    >>> char.more_details.append({'foo': 3})
    >>> session.commit()

This will generate an UPDATE statement roughly like this::

    "UPDATE characters set more_details = ? ...", ([{'foo': 1, 'bar': 10}, {'foo': 2}, {'foo': 3}],)

    >>> refresh("characters")

If a query filters on a key inside an `ObjectArray` column, only one of the
dictionaries inside the list has to match in order for the row to be
returned::

    >>> query = session.query(Character.name)
    >>> query.filter(Character.more_details['foo'] == 1).all()
    [(u'Arthur Dent',)]
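
The matching rule can be restated in plain Python: a row qualifies as soon as
any dictionary in the list carries the requested value. The `matches_any`
function is a hypothetical helper written for this explanation and only
mirrors the semantics described above; it is not how the query is evaluated
by the database::

    def matches_any(objects, key, value):
        """True if at least one dictionary in ``objects`` has key == value."""
        return any(obj.get(key) == value for obj in objects)


    # The list stored for 'Arthur Dent' in the example above.
    more_details = [{'foo': 1, 'bar': 10}, {'foo': 2}, {'foo': 3}]

    found = matches_any(more_details, 'foo', 1)      # first element matches
    missing = matches_any(more_details, 'foo', 4)    # no element matches

Here `found` is true because the first dictionary has `foo` set to 1, while
`missing` is false because no dictionary has `foo` set to 4.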


Count and Group By
==================

SQLAlchemy supports different approaches to issue a query with a count
aggregate function. Take a look at the `Counting section in the tutorial
<http://docs.sqlalchemy.org/en/rel_0_8/orm/tutorial.html#counting>`_ for a full
overview.

Crate currently doesn't support all variants, as it can't handle sub-queries
yet.

This means that queries with count have to be written in one of the following
ways::

    >>> session.query(sa.func.count(Character.id)).scalar()
    2

    >>> session.query(sa.func.count('*')).select_from(Character).scalar()
    2

.. note::

    The column that is passed to the `count` function has to be the primary
    key. Other columns won't work.

Using the `group_by` clause is similar::

    >>> session.query(sa.func.count(Character.id), Character.name) \
    ...     .group_by(Character.name) \
    ...     .order_by(sa.desc(sa.func.count(Character.id))) \
    ...     .order_by(Character.name).all()
    [(1, u'Arthur Dent'), (1, u'Tricia McMillan')]
