A GQL interface to the datastore.
GQL is a SQL-like language that provides more object-like semantics
in a form that is familiar to SQL users. The language supported by
GQL is expected to change over time, but starts off with fairly simple
semantics.
- reserved words are case insensitive
- names are case sensitive
The syntax for SELECT is fairly straightforward:
SELECT [[DISTINCT] <property> [, <property> ...] | * | __key__ ]
[FROM <entity>]
[WHERE <condition> [AND <condition> ...]]
[ORDER BY <property> [ASC | DESC] [, <property> [ASC | DESC] ...]]
[LIMIT [<offset>,]<count>]
[OFFSET <offset>]
[HINT (ORDER_FIRST | FILTER_FIRST | ANCESTOR_FIRST)]
[;]
<condition> := <property> {< | <= | > | >= | = | != | IN} <value>
<condition> := <property> {< | <= | > | >= | = | != | IN} CAST(<value>)
<condition> := <property> IN (<value>, ...)
<condition> := ANCESTOR IS <entity or key>
Currently the parser is LL(1) because of the simplicity of the grammar
(as it is largely predictive with one token lookahead).
The class uses basic regular expression tokenization to pull out reserved
tokens; a recursive descent parser then acts as a builder for the
pre-compiled query. This pre-compiled query is bound to arguments before
it is executed.
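The tokenize-then-descend approach described above can be illustrated with a
minimal sketch. This is not the actual implementation: the token pattern, the
RESERVED set, and the tokenize() and parse() helpers are all hypothetical, and
only the 'SELECT * FROM <entity> [WHERE ...]' subset of the grammar is handled.

```python
import re

# Hypothetical token pattern (an assumption, not the module's real regex).
TOKEN_RE = re.compile(r"""
    '[^']*'                 # single-quoted string literal
  | <=|>=|!=|<|>|=          # comparison operators
  | :\w+                    # positional (:1) or named (:author) parameter
  | \d+(?:\.\d+)?           # integer or float literal
  | \w+                     # reserved word or name
  | [(),*;]                 # punctuation
""", re.VERBOSE)

# Subset of reserved words, enough for this sketch.
RESERVED = {"SELECT", "FROM", "WHERE", "AND", "ORDER", "BY",
            "LIMIT", "OFFSET", "IN", "ASC", "DESC"}


def tokenize(query):
    """Split a query into tokens; reserved words are case insensitive."""
    tokens = TOKEN_RE.findall(query)
    return [t.upper() if t.upper() in RESERVED else t for t in tokens]


def parse(query):
    """Parse 'SELECT * FROM <entity> [WHERE <cond> [AND <cond> ...]]'.

    Every decision is made by peeking at exactly one token, which is
    what makes the parser predictive. Trailing tokens are ignored.
    """
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(expected=None):
        nonlocal pos
        token = peek()
        if token is None or (expected is not None and token != expected):
            raise ValueError("expected %r, got %r" % (expected, token))
        pos += 1
        return token

    expect("SELECT")
    expect("*")
    expect("FROM")
    result = {"entity": expect(), "filters": []}
    if peek() == "WHERE":
        expect("WHERE")
        while True:
            prop, op, value = expect(), expect(), expect()
            result["filters"].append((prop, op, value))
            if peek() != "AND":
                break
            expect("AND")
    return result
```

Here parse() plays the builder role: it accumulates the entity name and the
filter tuples, leaving parameter references such as :1 unresolved until
binding time.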
Initially, three parameter passing mechanisms are supported when calling
Execute():
- Positional parameters
Execute('SELECT * FROM Story WHERE Author = :1 AND Date > :2')
- Named parameters
Execute('SELECT * FROM Story WHERE Author = :author AND Date > :date')
- Literals (numbers, strings, booleans, and NULL)
Execute('SELECT * FROM Story WHERE Author = \'James\'')
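As a rough illustration of how positional and named references might be
resolved against the caller's arguments, consider the sketch below.
bind_parameters() is a hypothetical helper, not part of this module, and it
assumes that positional references are 1-based (as the :1 and :2 examples
suggest). It scans the raw query text, so it would also pick up colons inside
string literals; a real binder would work on tokens instead.

```python
import re

# Matches :1, :2, ... as well as :author, :date, ...
PARAM_RE = re.compile(r":(\w+)")


def bind_parameters(query, *args, **kwargs):
    """Return {reference: value} for each :N or :name found in query."""
    bound = {}
    for ref in PARAM_RE.findall(query):
        if ref.isdigit():
            index = int(ref)
            if not 1 <= index <= len(args):
                raise ValueError("missing positional argument :%d" % index)
            bound[ref] = args[index - 1]  # assumed 1-based numbering
        else:
            if ref not in kwargs:
                raise ValueError("missing named argument :%s" % ref)
            bound[ref] = kwargs[ref]
    return bound
```

For example, bind_parameters(query, 'James', some_date) resolves :1 and :2,
while bind_parameters(query, author='James', date=some_date) resolves :author
and :date.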
Users can also convert values to other datastore types (e.g. db.Email,
db.GeoPt). The language provides a conversion function that allows the
caller to express conversions of both literals and parameters. The current
conversion operators are:
- GEOPT(float, float)
- USER(str)
- KEY(kind, id/name[, kind, id/name...])
- DATETIME(year, month, day, hour, minute, second)
- DATETIME('YYYY-MM-DD HH:MM:SS')
- DATE(year, month, day)
- DATE('YYYY-MM-DD')
- TIME(hour, minute, second)
- TIME('HH:MM:SS')
We will properly serialize and quote all values.
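The two accepted forms of the date and time operators listed above (separate
numeric components, or a single formatted string) can be sketched with the
standard library. The to_datetime(), to_date() and to_time() helpers are
hypothetical, and returning plain datetime objects is an assumption; the
values actually produced for the datastore may differ.

```python
import datetime


def to_datetime(*args):
    """DATETIME(y, m, d, H, M, S) or DATETIME('YYYY-MM-DD HH:MM:SS')."""
    if len(args) == 1 and isinstance(args[0], str):
        return datetime.datetime.strptime(args[0], "%Y-%m-%d %H:%M:%S")
    return datetime.datetime(*args)


def to_date(*args):
    """DATE(year, month, day) or DATE('YYYY-MM-DD')."""
    if len(args) == 1 and isinstance(args[0], str):
        return datetime.datetime.strptime(args[0], "%Y-%m-%d").date()
    return datetime.date(*args)


def to_time(*args):
    """TIME(hour, minute, second) or TIME('HH:MM:SS')."""
    if len(args) == 1 and isinstance(args[0], str):
        return datetime.datetime.strptime(args[0], "%H:%M:%S").time()
    return datetime.time(*args)
```

Both forms of each operator yield the same value, so
to_date(2009, 1, 2) and to_date('2009-01-02') compare equal.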
Note that there are some caveats to the queries that can be expressed in
this syntax. The parser attempts to make these clear wherever possible;
the caveats include:
- There is no OR operation. In most cases, you should prefer to use IN to
express the idea of wanting data matching one of a set of values.
- You cannot use inequality operators on more than one property in a
single query.
- You can only have one != operator per query (related to the previous
rule).
- The IN and != operators must be used carefully because they can
dramatically raise the amount of work done by the datastore. As such,
there is a limit on the number of elements you can use in IN statements.
This limit is set fairly low. Currently, a maximum of 30 datastore queries
is allowed in a given GQL query. != doubles the number of datastore
queries, and IN multiplies it by the number of elements in the clause
(so two IN clauses, one with 5 elements and the other with 6, cause 30
queries to occur).
- Literals can take the form of basic types or type-cast literals.
Literals within lists, however, can currently only take the form of
simple types (strings, integers, floats).
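The fan-out arithmetic from the IN and != caveat above can be checked with a
small sketch. datastore_query_count() and MAX_DATASTORE_QUERIES are
hypothetical names used only for illustration; the multiplication rules and
the limit of 30 come from the caveat itself.

```python
# Limit stated in the caveat above; the constant name is an assumption.
MAX_DATASTORE_QUERIES = 30


def datastore_query_count(in_clause_sizes=(), has_not_equal=False):
    """Estimate how many datastore queries one GQL query expands into.

    Each IN clause multiplies the count by its number of elements, and
    the single permitted != operator doubles it.
    """
    count = 1
    for size in in_clause_sizes:
        count *= size
    if has_not_equal:
        count *= 2
    return count


def within_limit(in_clause_sizes=(), has_not_equal=False):
    """True if the expanded query count does not exceed the limit."""
    count = datastore_query_count(in_clause_sizes, has_not_equal)
    return count <= MAX_DATASTORE_QUERIES
```

With IN clauses of 5 and 6 elements the count is 5 * 6 = 30, which is exactly
at the limit; adding a != filter to the same query doubles it to 60 and
exceeds the limit.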
SELECT * will return an iterable set of entities; SELECT __key__ will return
an iterable set of Keys.