Note: This section describes how to use SQLAlchemy to construct SQL queries and receive result sets. It does not cover the object relational mapping capabilities of SQLAlchemy; that is covered later on in None. However, both areas of functionality work similarly in how selection criterion is constructed, so if you are interested just in ORM, you should probably skim through basic None construction before moving on.
Once you have used the sqlalchemy.schema
module to construct your tables and/or reflect them from the database, performing SQL queries using those table meta data objects is done via the sqlalchemy.sql
package. This package defines a large set of classes, each of which represents a particular kind of lexical construct within a SQL query; all are descendants of the common base class sqlalchemy.sql.ClauseElement
. A full query is represented via a structure of ClauseElements. A set of reasonably intuitive creation functions is provided by the sqlalchemy.sql
package to create these structures; these functions are described in the rest of this section.
To execute a query, you create its structure, then call the resulting structure's execute()
method, which returns a cursor-like object (more on that later). The same clause structure can be used repeatedly. A ClauseElement is compiled into a string representation by an underlying SQLEngine object, which is located by searching through the clause's child items for a Table object, which provides a reference to its SQLEngine.
The examples below all include a dump of the generated SQL corresponding to the query object, as well as a dump of the statement's bind parameters. In all cases, bind parameters are named parameters using the colon format (i.e. ':name'). A named parameter scheme, either ':name' or '%(name)s', is used with all databases, including those that use positional schemes. For those, the named-parameter statement and its bind values are converted to the proper list-based format right before execution. Therefore a SQLAlchemy application that uses ClauseElements can standardize on named parameters for all databases.
For this section, we will assume the following tables:
from sqlalchemy import * db = create_engine('sqlite://filename=mydb', echo=True) # a table to store users users = Table('users', db, Column('user_id', Integer, primary_key = True), Column('user_name', String(40)), Column('password', String(80)) ) # a table that stores mailing addresses associated with a specific user addresses = Table('addresses', db, Column('address_id', Integer, primary_key = True), Column('user_id', Integer, ForeignKey("users.user_id")), Column('street', String(100)), Column('city', String(80)), Column('state', String(2)), Column('zip', String(10)) ) # a table that stores keywords keywords = Table('keywords', db, Column('keyword_id', Integer, primary_key = True), Column('name', VARCHAR(50)) ) # a table that associates keywords with users userkeywords = Table('userkeywords', db, Column('user_id', INT, ForeignKey("users")), Column('keyword_id', INT, ForeignKey("keywords")) )
A select is done by constructing a Select
object with the proper arguments, adding any extra arguments if desired, then calling its execute()
method.
from sqlalchemy import * # use the select() function defined in the sql package s = select([users]) # or, call the select() method off of a Table object s = users.select() # then, call execute on the Select object: sqlc = s.execute()
# the SQL text of any clause object can also be viewed via the str() call: >>> str(s) SELECT users.user_id, users.user_name, users.password FROM users
The object returned by the execute call is a sqlalchemy.engine.ResultProxy
object, which acts much like a DBAPI cursor
object in the context of a result set, except that the rows returned can address their columns by ordinal position, column name, or even column object:
# select rows, get resulting ResultProxy object sqlc = users.select().execute()
# get one row row = c.fetchone() # get the 'user_id' column via integer index: user_id = row[0] # or column name user_name = row['user_name'] # or column object password = row[users.c.password] # or column accessor password = row.password # ResultProxy object also supports fetchall() rows = c.fetchall() # or get the underlying DBAPI cursor object cursor = c.cursor
A common need when writing statements that reference multiple tables is to create labels for columns, thereby separating columns from different tables with the same name. The Select construct supports automatic generation of column labels via the use_labels=True
parameter:
sqlc = select([users, addresses], users.c.user_id==addresses.c.address_id, use_labels=True).execute()
The table name part of the label is affected if you use a construct such as a table alias:
person = users.alias('person') sqlc = select([person, addresses], person.c.user_id==addresses.c.address_id, use_labels=True).execute()
You can also specify custom labels on a per-column basis using the label()
function:
sqlc = select([users.c.user_id.label('id'), users.c.user_name.label('name')]).execute()
Calling select
off a table automatically generates a column clause which includes all the table's columns, in the order they are specified in the source Table object.
But in addition to selecting all the columns off a single table, any set of columns can be specified, as well as full tables, and any combination of the two:
The WHERE condition is the named keyword argument whereclause
, or the second positional argument to the select()
constructor and the first positional argument to the select()
method of Table
.
WHERE conditions are constructed using column objects, literal values, and functions defined in the sqlalchemy.sql
module. Column objects override the standard Python operators to provide clause compositional objects, which compile down to SQL operations:
sqlc = users.select(users.c.user_id == 7).execute()
Notice that the literal value "7" was broken out of the query and placed into a bind parameter. Databases such as Oracle must parse incoming SQL and create a "plan" when new queries are received, which is an expensive process. By using bind parameters, the same query with various literal values can have its plan compiled only once, and used repeatedly with less overhead.
More where clauses:
# another comparison operator sqlc = select([users], users.c.user_id>7).execute()
# OR keyword sqlc = users.select(or_(users.c.user_name=='jack', users.c.user_name=='ed')).execute()
# AND keyword sqlc = users.select(and_(users.c.user_name=='jack', users.c.password=='dog')).execute()
# NOT keyword sqlc = users.select(not_( or_(users.c.user_name=='jack', users.c.password=='dog') )).execute()
# IN clause sqlc = users.select(users.c.user_name.in_('jack', 'ed', 'fred')).execute()
# join users and addresses together sqlc = select([users, addresses], users.c.user_id==addresses.c.address_id).execute()
# join users and addresses together, but dont specify "addresses" in the # selection criterion. The WHERE criterion adds it to the FROM list # automatically. sqlc = select([users], and_( users.c.user_id==addresses.c.user_id, users.c.user_name=='fred' )).execute()
Select statements can also generate a WHERE clause based on the parameters you give it. If a given parameter, which matches the name of a column or its "label" (the combined tablename + "_" + column name), and does not already correspond to a bind parameter in the select object, it will be added as a comparison against that column. This is a shortcut to creating a full WHERE clause:
Supported column operators so far are all the numerical comparison operators, i.e. '==', '>', '>=', etc., as well as like(), startswith(), endswith(), between(), and in(). Boolean operators include not_(), and() and or(), which also can be used inline via '~', '&', and '|'. Math operators are '+', '-', '*', '/'. Any custom operator can be specified via the op() function shown below.
# "like" operator users.select(users.c.user_name.like('%ter')) # equality operator users.select(users.c.user_name == 'jane') # in opertator users.select(users.c.user_id.in_(1,2,3)) # and_, endswith, equality operators users.select(and_(addresses.c.street.endswith('green street'), addresses.c.zip=='11234')) # & operator subsituting for 'and_' users.select(addresses.c.street.endswith('green street') & (addresses.c.zip=='11234')) # + concatenation operator select([users.c.user_name + '_name']) # NOT operator users.select(~(addresses.c.street == 'Green Street')) # any custom operator select([users.c.user_name.op('||')('_category')])
For queries that don't contain any tables, the SQLEngine can be specified to any constructed statement via the engine
keyword parameter:
# select a literal select(["hi"], engine=myengine) # select a function select([func.now()], engine=db)
Functions can be specified using the func
keyword:
sqlselect([func.count(users.c.user_id)]).execute()
sqlusers.select(func.substr(users.c.user_name, 1) == 'J').execute()
Functions also are callable as standalone values:
# call the "now()" function time = func.now(engine=myengine).scalar() # call myfunc(1,2,3) myvalue = func.myfunc(1, 2, 3, engine=db).execute() # or call them off the engine db.func.now().scalar()
You can drop in a literal value anywhere there isnt a column to attach to via the literal
keyword:
sqlselect([literal('foo') + literal('bar'), users.c.user_name]).execute()
# literals have all the same comparison functions as columns sqlselect([literal('foo') == literal('bar')], engine=myengine).scalar()
Literals also take an optional type
parameter to give literals a type. This can sometimes be significant, for example when using the "+" operator with SQLite, the String type is detected and the operator is converted to "||":
sqlselect([literal('foo', type=String) + 'bar'], engine=e).execute()
The ORDER BY clause of a select statement can be specified as individual columns to order by within an array specified via the order_by
parameter, and optional usage of the asc() and desc() functions:
These are specified as keyword arguments:
sqlc = select([users.c.user_name], distinct=True).execute()
sqlc = users.select(limit=10, offset=20).execute()
The Oracle driver does not support LIMIT and OFFSET directly, but instead wraps the generated query into a subquery and uses the "rownum" variable to control the rows selected (this is somewhat experimental).
As some of the examples indicated above, a regular inner join can be implicitly stated, just like in a SQL expression, by just specifying the tables to be joined as well as their join conditions:
sqladdresses.select(addresses.c.user_id==users.c.user_id).execute()
There is also an explicit join constructor, which can be embedded into a select query via the from_obj
parameter of the select statement:
sqladdresses.select(from_obj=[ addresses.join(users, addresses.c.user_id==users.c.user_id) ]).execute()
The join constructor can also be used by itself:
sqljoin(users, addresses, users.c.user_id==addresses.c.user_id).select().execute()
The join criterion in a join() call is optional. If not specified, the condition will be derived from the foreign key relationships of the two tables. If no criterion can be constructed, an exception will be raised.
sqljoin(users, addresses).select().execute()
Notice that this is the first example where the FROM criterion of the select statement is explicitly specified. In most cases, the FROM criterion is automatically determined from the columns requested as well as the WHERE clause. The from_obj
keyword argument indicates a list of explicit FROM clauses to be used in the statement.
A join can be created on its own using the join
or outerjoin
functions, or can be created off of an existing Table or other selectable unit via the join
or outerjoin
methods:
Aliases are used primarily when you want to use the same table more than once as a FROM expression in a statement:
address_b = addresses.alias('addressb') sql# select users who have an address on Green street as well as Orange street users.select(and_( users.c.user_id==addresses.c.user_id, addresses.c.street.like('%Green%'), users.c.user_id==address_b.c.user_id, address_b.c.street.like('%Orange%') )).execute()
SQLAlchemy allows the creation of select statements from not just Table objects, but from a whole class of objects that implement the Selectable
interface. This includes Tables, Aliases, Joins and Selects. Therefore, if you have a Select, you can select from the Select:
">>>" s = users.select() ">>>" str(s) SELECT users.user_id, users.user_name, users.password FROM users ">>>" s = s.select() ">>>" str(s) SELECT user_id, user_name, password FROM (SELECT users.user_id, users.user_name, users.password FROM users)
Any Select, Join, or Alias object supports the same column accessors as a Table:
">>>" s = users.select() ">>>" [c.key for c in s.columns] ['user_id', 'user_name', 'password']
When you use use_labels=True
in a Select object, the label version of the column names become the keys of the accessible columns. In effect you can create your own "view objects":
s = select([users, addresses], users.c.user_id==addresses.c.user_id, use_labels=True) sqlselect([ s.c.users_user_name, s.c.addresses_street, s.c.addresses_zip ], s.c.addresses_city=='San Francisco').execute()
To specify a SELECT statement as one of the selectable units in a FROM clause, it usually should be given an alias.
sqls = users.select().alias('u') select([addresses, s]).execute()
Select objects can be used in a WHERE condition, in operators such as IN:
# select user ids for all users whos name starts with a "p" s = select([users.c.user_id], users.c.user_name.like('p%')) # now select all addresses for those users sqladdresses.select(addresses.c.user_id.in_(s)).execute()
The sql package supports embedding select statements into other select statements as the criterion in a WHERE condition, or as one of the "selectable" objects in the FROM list of the query. It does not at the moment directly support embedding a SELECT statement as one of the column criterion for a statement, although this can be achieved via direct text insertion, described later.
Subqueries can be used in the column clause of a select statement by specifying the scalar=True
flag:
sqlselect([table2.c.col1, table2.c.col2, select([table1.c.col1], table1.c.col2==7, scalar=True)])
When a select object is embedded inside of another select object, and both objects reference the same table, SQLAlchemy makes the assumption that the table should be correlated from the child query to the parent query. To disable this behavior, specify the flag correlate=False
to the Select statement.
# make an alias of a regular select. s = select([addresses.c.street], addresses.c.user_id==users.c.user_id).alias('s') >>> str(s) SELECT addresses.street FROM addresses, users WHERE addresses.user_id = users.user_id # now embed that select into another one. the "users" table is removed from # the embedded query's FROM list and is instead correlated to the parent query s2 = select([users, s.c.street]) >>> str(s2) SELECT users.user_id, users.user_name, users.password, s.street FROM users, (SELECT addresses.street FROM addresses WHERE addresses.user_id = users.user_id) s
An EXISTS clause can function as a higher-scaling version of an IN clause, and is usually used in a correlated fashion:
# find all users who have an address on Green street: sqlusers.select( exists( [addresses.c.address_id], and_( addresses.c.user_id==users.c.user_id, addresses.c.street.like('%Green%') ) ) )
Unions come in two flavors, UNION and UNION ALL, which are available via module level functions or methods off a Selectable:
sqlunion( addresses.select(addresses.c.street=='123 Green Street'), addresses.select(addresses.c.street=='44 Park Ave.'), addresses.select(addresses.c.street=='3 Mill Road'), order_by=[addresses.c.street] ).execute()
sqlusers.select( users.c.user_id==7 ).union_all( users.select( users.c.user_id==9 ), order_by=[users.c.user_id] # order_by is an argument to union_all() ).execute()
Throughout all these examples, SQLAlchemy is busy creating bind parameters wherever literal expressions occur. You can also specify your own bind parameters with your own names, and use the same statement repeatedly. As mentioned at the top of this section, named bind parameters are always used regardless of the type of DBAPI being used; for DBAPI's that expect positional arguments, bind parameters are converted to lists right before execution, and Pyformat strings in statements, i.e. '%(name)s', are converted to the appropriate positional style.
s = users.select(users.c.user_name==bindparam('username')) sqls.execute(username='fred')
sqls.execute(username='jane')
sqls.execute(username='mary')
executemany()
is also available, but that applies more to INSERT/UPDATE/DELETE, described later.
The generation of bind parameters is performed specific to the engine being used. The examples in this document all show "named" parameters like those used in sqlite and oracle. Depending on the parameter type specified by the DBAPI module, the correct bind parameter scheme will be used.
By throwing the compile()
method onto the end of any query object, the query can be "compiled" by the SQLEngine into a sqlalchemy.sql.Compiled
object just once, and the resulting compiled object reused, which eliminates repeated internal compilation of the SQL string:
s = users.select(users.c.user_name==bindparam('username')).compile() s.execute(username='fred') s.execute(username='jane') s.execute(username='mary')
The sql package tries to allow free textual placement in as many ways as possible. In the examples below, note that the from_obj parameter is used only when no other information exists within the select object with which to determine table metadata. Also note that in a query where there isnt even table metadata used, the SQLEngine to be used for the query has to be explicitly specified:
# strings as column clauses sqlselect(["user_id", "user_name"], from_obj=[users]).execute()
# strings for full column lists sqlselect( ["user_id, user_name, password, addresses.*"], from_obj=[users.alias('u'), addresses]).execute()
# functions, etc. sqlselect([users.c.user_id, "process_string(user_name)"]).execute()
# where clauses sqlusers.select(and_(users.c.user_id==7, "process_string(user_name)=27")).execute()
# subqueries sqlusers.select( "exists (select 1 from addresses where addresses.user_id=users.user_id)").execute()
# custom FROM objects sqlselect( ["*"], from_obj=["(select user_id, user_name from users)"], engine=db).execute()
# a full query {sql}text("select user_name from users", engine=db).execute() select user_name from users {} # or call text() off of the engine engine.text("select user_name from users").execute() # execute off the engine directly - you must use the engine's native bind parameter # style (i.e. named, pyformat, positional, etc.) {sql}db.execute( "select user_name from users where user_id=:user_id", {'user_id':7}).execute() select user_name from users where user_id=:user_id {'user_id':7}
Use the format ':paramname'
to define bind parameters inside of a text block. They will be converted to the appropriate format upon compilation:
t = engine.text("select foo from mytable where lala=:hoho") r = t.execute(hoho=7)
Bind parameters can also be explicit, which allows typing information to be added. Just specify them as a list with keys that match those inside the textual statement:
t = engine.text("select foo from mytable where lala=:hoho", bindparams=[bindparam('hoho', type=types.String)]) r = t.execute(hoho="im hoho")
Result-row type processing can be added via the typemap
argument, which is a dictionary of return columns mapped to types:
# specify DateTime type for the 'foo' column in the result set # sqlite, for example, uses result-row post-processing to construct dates t = engine.text("select foo from mytable where lala=:hoho", bindparams=[bindparam('hoho', type=types.String)], typemap={'foo':types.DateTime} ) r = t.execute(hoho="im hoho") # 'foo' is a datetime year = r.fetchone()['foo'].year
One of the primary motivations for a programmatic SQL library is to allow the piecemeal construction of a SQL statement based on program variables. All the above examples typically show Select objects being created all at once. The Select object also includes "builder" methods to allow building up an object. The below example is a "user search" function, where users can be selected based on primary key, user name, street address, keywords, or any combination:
def find_users(id=None, name=None, street=None, keywords=None): statement = users.select() if id is not None: statement.append_whereclause(users.c.user_id==id) if name is not None: statement.append_whereclause(users.c.user_name==name) if street is not None: # append_whereclause joins "WHERE" conditions together with AND statement.append_whereclause(users.c.user_id==addresses.c.user_id) statement.append_whereclause(addresses.c.street==street) if keywords is not None: statement.append_from( users.join(userkeywords, users.c.user_id==userkeywords.c.user_id).join( keywords, userkeywords.c.keyword_id==keywords.c.keyword_id)) statement.append_whereclause(keywords.c.name.in_(keywords)) # to avoid multiple repeats, set query to be DISTINCT: statement.distinct=True return statement.execute() sqlfind_users(id=7)
sqlfind_users(street='123 Green Street')
sqlfind_users(name='Jack', keywords=['jack','foo'])
An INSERT involves just one table. The Insert object is used via the insert() function, and the specified columns determine what columns show up in the generated SQL. If primary key columns are left out of the criterion, the SQL generator will try to populate them as specified by the particular database engine and sequences, i.e. relying upon an auto-incremented column or explicitly calling a sequence beforehand. Insert statements, as well as updates and deletes, can also execute multiple parameters in one pass via specifying an array of dictionaries as parameters.
The values to be populated for an INSERT or an UPDATE can be specified to the insert()/update() functions as the values
named argument, or the query will be compiled based on the values of the parameters sent to the execute() method.
# basic insert sqlusers.insert().execute(user_id=1, user_name='jack', password='asdfdaf')
# insert just user_name, NULL for others # will auto-populate primary key columns if they are configured # to do so sqlusers.insert().execute(user_name='ed')
# INSERT with a list: sqlusers.insert(values=(3, 'jane', 'sdfadfas')).execute()
# INSERT with user-defined bind parameters i = users.insert( values={'user_name':bindparam('name'), 'password':bindparam('pw')} ) sqli.execute(name='mary', pw='adas5fs')
# INSERT many - if no explicit 'values' parameter is sent, # the first parameter list in the list determines # the generated SQL of the insert (i.e. what columns are present) # executemany() is used at the DBAPI level sqlusers.insert().execute( {'user_id':7, 'user_name':'jack', 'password':'asdfasdf'}, {'user_id':8, 'user_name':'ed', 'password':'asdffcadf'}, {'user_id':9, 'user_name':'fred', 'password':'asttf'}, )
Updates work a lot like INSERTS, except there is an additional WHERE clause that can be specified.
# change 'jack' to 'ed' sqlusers.update(users.c.user_name=='jack').execute(user_name='ed')
# use bind parameters u = users.update(users.c.user_name==bindparam('name'), values={'user_name':bindparam('newname')}) sqlu.execute(name='jack', newname='ed')
# update a column to another column sqlusers.update(values={users.c.password:users.c.user_name}).execute()
# multi-update sqlusers.update(users.c.user_id==bindparam('id')).execute( {'id':7, 'user_name':'jack', 'password':'fh5jks'}, {'id':8, 'user_name':'ed', 'password':'fsr234ks'}, {'id':9, 'user_name':'mary', 'password':'7h5jse'}, )
A correlated update lets you update a table using selection from another table, or the same table:
s = select([addresses.c.city], addresses.c.user_id==users.c.user_id) sqlusers.update( and_(users.c.user_id>10, users.c.user_id<20), values={users.c.user_name:s} ).execute()
A delete is formulated like an update, except theres no values:
users.delete(users.c.user_id==7).execute() users.delete(users.c.user_name.like(bindparam('name'))).execute( {'name':'%Jack%'}, {'name':'%Ed%'}, {'name':'%Jane%'}, ) users.delete(exists())