ArangoDB v3.4 reached End of Life (EOL) and is no longer supported.

This documentation is outdated. Please see the most recent version here: Latest Docs

Creating test data with AQL

Problem

I want to create some test documents.

Solution

If you haven’t yet created a collection to hold the documents, create one now using the ArangoShell:

db._create("myCollection");

This has created a collection named myCollection.

One of the easiest ways to fill a collection with test data is to use an AQL query that iterates over a range.

Run the following AQL query from the AQL editor in the web interface to insert 1,000 documents into the just created collection:

FOR i IN 1..1000
  INSERT { name: CONCAT("test", i) } IN myCollection

The number of documents to create can be modified easily be adjusting the range boundary values.

To create more complex test data, adjust the AQL query!

Let’s say we also want a status attribute, and fill it with integer values between 1 to (including) 5, with equal distribution. A good way to achieve this is to use the modulo operator (%):

FOR i IN 1..1000
  INSERT {
    name: CONCAT("test", i),
    status: 1 + (i % 5)
  } IN myCollection

To create pseudo-random values, use the RAND() function. It creates pseudo-random numbers between 0 and 1. Use some factor to scale the random numbers, and FLOOR() to convert the scaled number back to an integer.

For example, the following query populates the value attribute with numbers between 100 and 150 (including):

FOR i IN 1..1000
  INSERT {
    name: CONCAT("test", i),
    value: 100 + FLOOR(RAND() * (150 - 100 + 1))
  } IN myCollection

After the test data has been created, it is often helpful to verify it. The RAND() function is also a good candidate for retrieving a random sample of the documents in the collection. This query will retrieve 10 random documents:

FOR doc IN myCollection
  SORT RAND()
  LIMIT 10
  RETURN doc

The COLLECT clause is an easy mechanism to run an aggregate analysis on some attribute. Let’s say we wanted to verify the data distribution inside the status attribute. In this case we could run:

FOR doc IN myCollection
  COLLECT value = doc.value WITH COUNT INTO count
  RETURN {
    value: value,
    count: count
  }

The above query will provide the number of documents per distinct value.

Author: Jan Steemann

Tags: #aql