Data Queries
There are two fundamental types of AQL queries:
- queries which access data (read documents)
- queries which modify data (create, update, replace, delete documents)
Data Access Queries
Retrieving data from the database with AQL does always include a RETURN operation. It can be used to return a static value, such as a string:
RETURN "Hello ArangoDB!"
The query result is always an array of elements, even if a single element was
returned and contains a single element in that case: ["Hello ArangoDB!"]
The function DOCUMENT()
can be called to retrieve a single document via
its document handle, for instance:
RETURN DOCUMENT("users/phil")
RETURN
is usually accompanied by a FOR loop to iterate over the
documents of a collection. The following query executes the loop body for all
documents of a collection called users. Each document is returned unchanged
in this example:
FOR doc IN users
RETURN doc
Instead of returning the raw doc
, one can easily create a projection:
FOR doc IN users
RETURN { user: doc, newAttribute: true }
For every user document, an object with two attributes is returned. The value of the attribute user is set to the content of the user document, and newAttribute is a static attribute with the boolean value true.
Operations like FILTER, SORT and LIMIT can be added to the loop body
to narrow and order the result. Instead of above shown call to DOCUMENT()
,
one can also retrieve the document that describes user phil like so:
FOR doc IN users
FILTER doc._key == "phil"
RETURN doc
The document key is used in this example, but any other attribute could equally be used for filtering. Since the document key is guaranteed to be unique, no more than a single document will match this filter. For other attributes this may not be the case. To return a subset of active users (determined by an attribute called status), sorted by name in ascending order, you can do:
FOR doc IN users
FILTER doc.status == "active"
SORT doc.name
LIMIT 10
Note that operations do not have to occur in a fixed order and that their order
can influence the result significantly. Limiting the number of documents
before a filter is usually not what you want, because it easily misses a lot
of documents that would fulfill the filter criterion, but are ignored because
of a premature LIMIT
clause. Because of the aforementioned reasons, LIMIT
is usually put at the very end, after FILTER
, SORT
and other operations.
See the High Level Operations chapter for more details.
Data Modification Queries
AQL supports the following data-modification operations:
- INSERT: insert new documents into a collection
- UPDATE: partially update existing documents in a collection
- REPLACE: completely replace existing documents in a collection
- REMOVE: remove existing documents from a collection
- UPSERT: conditionally insert or update documents in a collection
Below you find some simple example queries that use these operations. The operations are detailed in the chapter High Level Operations.
Modifying a single document
Let’s start with the basics: INSERT
, UPDATE
and REMOVE
operations on single documents.
Here is an example that insert a document in an existing collection users:
INSERT {
firstName: "Anna",
name: "Pavlova",
profession: "artist"
} IN users
You may provide a key for the new document; if not provided, ArangoDB will create one for you.
INSERT {
_key: "GilbertoGil",
firstName: "Gilberto",
name: "Gil",
city: "Fortalezza"
} IN users
As ArangoDB is schema-free, attributes of the documents may vary:
INSERT {
_key: "PhilCarpenter",
firstName: "Phil",
name: "Carpenter",
middleName: "G.",
status: "inactive"
} IN users
INSERT {
_key: "NatachaDeclerck",
firstName: "Natacha",
name: "Declerck",
location: "Antwerp"
} IN users
Update is quite simple. The following AQL statement will add or change the attributes status and location
UPDATE "PhilCarpenter" WITH {
status: "active",
location: "Beijing"
} IN users
Replace is an alternative to update where all attributes of the document are replaced.
REPLACE {
_key: "NatachaDeclerck",
firstName: "Natacha",
name: "Leclerc",
status: "active",
level: "premium"
} IN users
Removing a document if you know its key is simple as well :
REMOVE "GilbertoGil" IN users
or
REMOVE { _key: "GilbertoGil" } IN users
Modifying multiple documents
Data-modification operations are normally combined with FOR
loops to
iterate over a given list of documents. They can optionally be combined with
FILTER
statements and the like.
Let’s start with an example that modifies existing documents in a collection users that match some condition:
FOR u IN users
FILTER u.status == "not active"
UPDATE u WITH { status: "inactive" } IN users
Now, let’s copy the contents of the collection users into the collection backup:
FOR u IN users
INSERT u IN backup
Subsequently, let’s find some documents in collection users and remove them from collection backup. The link between the documents in both collections is established via the documents’ keys:
FOR u IN users
FILTER u.status == "deleted"
REMOVE u IN backup
The following example will remove all documents from both users and backup:
LET r1 = (FOR u IN users REMOVE u IN users)
LET r2 = (FOR u IN backup REMOVE u IN backup)
RETURN true
Returning documents
Data-modification queries can optionally return documents. In order to reference
the inserted, removed or modified documents in a RETURN
statement, data-modification
statements introduce the OLD
and/or NEW
pseudo-values:
FOR i IN 1..100
INSERT { value: i } IN test
RETURN NEW
FOR u IN users
FILTER u.status == "deleted"
REMOVE u IN users
RETURN OLD
FOR u IN users
FILTER u.status == "not active"
UPDATE u WITH { status: "inactive" } IN users
RETURN NEW
NEW
refers to the inserted or modified document revision, and OLD
refers
to the document revision before update or removal. INSERT
statements can
only refer to the NEW
pseudo-value, and REMOVE
operations only to OLD
.
UPDATE
, REPLACE
and UPSERT
can refer to either.
In all cases the full documents will be returned with all their attributes,
including the potentially auto-generated attributes such as _id
, _key
, or _rev
and the attributes not specified in the update expression of a partial update.
Projections
It is possible to return a projection of the documents in OLD
or NEW
instead of
returning the entire documents. This can be used to reduce the amount of data returned
by queries.
For example, the following query will return only the keys of the inserted documents:
FOR i IN 1..100
INSERT { value: i } IN test
RETURN NEW._key
Using OLD and NEW in the same query
For UPDATE
, REPLACE
and UPSERT
statements, both OLD
and NEW
can be used
to return the previous revision of a document together with the updated revision:
FOR u IN users
FILTER u.status == "not active"
UPDATE u WITH { status: "inactive" } IN users
RETURN { old: OLD, new: NEW }
Calculations with OLD or NEW
It is also possible to run additional calculations with LET
statements between the
data-modification part and the final RETURN
of an AQL query. For example, the following
query performs an upsert operation and returns whether an existing document was
updated, or a new document was inserted. It does so by checking the OLD
variable
after the UPSERT
and using a LET
statement to store a temporary string for
the operation type:
UPSERT { name: "test" }
INSERT { name: "test" }
UPDATE { } IN users
LET opType = IS_NULL(OLD) ? "insert" : "update"
RETURN { _key: NEW._key, type: opType }
Restrictions
The name of the modified collection (users and backup in the above cases) must be known to the AQL executor at query-compile time and cannot change at runtime. Using a bind parameter to specify the collection name is allowed.
It is not possible to use multiple data-modification operations for the same collection in the same query, or follow up a data-modification operation for a specific collection with a read operation for the same collection. Neither is it possible to follow up any data-modification operation with a traversal query (which may read from arbitrary collections not necessarily known at the start of the traversal).
That means you may not place several REMOVE
or UPDATE
statements for the same
collection into the same query. It is however possible to modify different collections
by using multiple data-modification operations for different collections in the
same query.
In case you have a query with several places that need to remove documents from the
same collection, it is recommended to collect these documents or their keys in an array
and have the documents from that array removed using a single REMOVE
operation.
Data-modification operations can optionally be followed by LET
operations to
perform further calculations and a RETURN
operation to return data.
Transactional Execution
On a single server, data-modification operations are executed transactionally. If a data-modification operation fails, any changes made by it will be rolled back automatically as if they never happened.
If the RocksDB engine is used and intermediate commits are enabled, a query may execute intermediate transaction commits in case the running transaction (AQL query) hits the specified size thresholds. In this case, the query’s operations carried out so far will be committed and not rolled back in case of a later abort/rollback. That behavior can be controlled by adjusting the intermediate commit settings for the RocksDB engine.
In a cluster, AQL data-modification queries are currently not executed transactionally. Additionally, update, replace, upsert and remove AQL queries currently require the _key attribute to be specified for all documents that should be modified or removed, even if a shared key attribute other than _key was chosen for the collection. This restriction may be overcome in a future release of ArangoDB.