You are viewing the documentation for Pilosa v1.0. View the latest documentation for Pilosa v1.1.0.

Query Language

Overview

This section will provide a detailed reference and examples for the Pilosa Query Language (PQL). All PQL queries operate on a single index and are passed to Pilosa through the /index/INDEX_NAME/query endpoint. You may pass multiple PQL queries in a single request by simply concatenating the queries together - a space is not needed. The results format is always:

{"results":[...]}

There will be one item in the results array for each PQL query in the request. The type of each item in the array will depend on the type of query - each query in the reference below lists its result type.

Conventions

Examples

Before running any of the example queries below, follow the instructions in the Getting Started section to set up an index and fields, and to populate them with some data.

The examples just show the PQL quer(ies) needed - to run the query Set(10, stargazer=1) against a server using curl, you would:

curl localhost:10101/index/repository/query \
     -X POST \
     -d 'Set(10, stargazer=1)'
{"results":[true]}

Arguments and Types

Write Operations

Set

Spec:

Set(<COLUMN>, <FIELD>=<ROW>, [TIMESTAMP])

Description:

Set assigns a value of 1 to a bit in the binary matrix, thus associating the given row (the <ROW> value) in the given field with the given column.

Result Type: boolean

A return value of true indicates that the bit was changed to 1.

A return value of false indicates that the bit was already set to 1 and nothing changed.

Examples:

Set the bit at row 1, column 10:

Set(10, stargazer=1)
{"results":[true]}

This sets a bit in the stargazer field, representing that the user with id=1 has starred the repository with id=10.

Set also supports providing a timestamp. To write the date that a user starred a repository:

Set(10, stargazer=1, 2016-01-01T00:00)
{"results":[true]}

Set multiple bits in a single request:

Set(1, stargazer=10) Set(2, stargazer=10) Set(1, stargazer=20) Set(2, stargazer=30)
{"results":[false,true,true,true]}

Set the field “pullrequests” to integer value 2 at column 10:

Set(10, pullrequests=2)
{"results":[true]}

SetRowAttrs

Spec:

SetRowAttrs(<FIELD>, <ROW>, 
            <ATTR_NAME=ATTR_VALUE>, 
            [ATTR_NAME=ATTR_VALUE ...])

Description:

SetRowAttrs associates arbitrary key/value pairs with a row in a field. Setting a value of null, without quotes, deletes an attribute.

Result Type: null

SetRowAttrs queries always return null upon success.

Examples:

Set attributes username and active on row 10:

SetRowAttrs(stargazer, 10, username="mrpi", active=true)
{"results":[null]}

Set username value and active status for user 10. These are arbitrary key/value pairs which have no meaning to Pilosa. You can see the attributes you’ve set on a row with a Row query like so Row(stargazer=10).

Delete attribute username on row 10:

SetRowAttrs(stargazer, 10, username=null)
{"results":[null]}

SetColumnAttrs

Spec:

SetColumnAttrs(<COLUMN>, 
               <ATTR_NAME=ATTR_VALUE>, 
               [ATTR_NAME=ATTR_VALUE ...])

Description:

SetColumnAttrs associates arbitrary key/value pairs with a column in an index.

Result Type: null

SetColumnAttrs queries always return null upon success. Setting a value of null, without quotes, deletes an attribute.

Examples:

Set attributes stars, url, and active on column 10:

SetColumnAttrs(10, stars=123, url="http://projects.pilosa.com/10", active=true)
{"results":[null]}

Set url value and active status for project 10. These are arbitrary key/value pairs which have no meaning to Pilosa.

ColumnAttrs can be requested by adding the URL parameter columnAttrs=true to a query. For example:

curl localhost:10101/index/repository/query?columnAttrs=true -XPOST -d 'Row(stargazer=1) Row(stargazer=2)'
{
  "results":[
    {"attrs":{},"cols":[10,20]},
    {"attrs":{},"cols":[10,30]}
  ],
  "columnAttrs":[
    {"id":10,"attrs":{"active":true,"stars":123,"url":"http://projects.pilosa.com/10"}},
    {"id":20,"attrs":{"active":false,"stars":456,"url":"http://projects.pilosa.com/30"}}
  ]
}

In this example, ColumnAttrs have been set on columns 10 and 20, but not column 30. The relevant attributes are all returned in a single columnAttrs list. See the query index section for more information.

Delete the url attribute on column 10:

SetColumnAttrs(10, url=null)
{"results":[null]}

Clear

Spec:

Clear(<COLUMN>, <FIELD>=<ROW>)

Description:

Clear assigns a value of 0 to a bit in the binary matrix, thus disassociating the given row in the given field from the given column.

Note that clearing a column on a time field will remove all data for that column.

Result Type: boolean

A return value of true indicates that the bit was toggled from 1 to 0.

A return value of false indicates that the bit was already set to 0 and nothing changed.

Examples:

Clear the bit at row 1 and column 10 in the stargazer field:

Clear(10, stargazer=1)
{"results":[true]}

This represents removing the relationship between the user with id=1 and the repository with id=10.

Read Operations

Row

Spec:

Row(<FIELD>=<ROW>)

Description:

Row retrieves the indices of all the columns in a row. It also retrieves any attributes set on that row.

Result Type: object with attrs and columns.

e.g. {"attrs":{"username":"mrpi","active":true},"columns":[10, 20]}

Examples:

Query all columns with a bit set in row 1 of the field stargazer (repositories that are starred by user 1):

Row(stargazer=1)
{"attrs":{"username":"mrpi","active":true},"columns":[10, 20]}

Union

Spec:

Union([ROW_CALL ...])

Description:

Union performs a logical OR on the results of all ROW_CALL queries passed to it.

Result Type: object with attrs and bits

attrs will always be empty

Examples:

Query columns with a bit set in either of two rows (repositories that are starred by either of two users):

Union(Row(stargazer=1), Row(stargazer=2))
{"attrs":{},"columns":[10, 20, 30]}

Intersect

Spec:

Intersect(<ROW_CALL>, [ROW_CALL ...])

Description:

Intersect performs a logical AND on the results of all ROW_CALL queries passed to it.

Result Type: object with attrs and columns

attrs will always be empty

Examples:

Query columns with a bit set in both of two rows (repositories that are starred by both of two users):

Intersect(Row(stargazer=1), Row(stargazer=2))
{"attrs":{},"columns":[10]}

Difference

Spec:

Difference(<ROW_CALL>, [ROW_CALL ...])

Description:

Difference returns all of the bits from the first ROW_CALL argument passed to it, without the bits from each subsequent ROW_CALL.

Result Type: object with attrs and columns

attrs will always be empty

Examples:

Query columns with a bit set in one row and not another (repositories that are starred by one user and not another):

Difference(Row(stargazer=1), Row(stargazer=2))
{"results":[{"attrs":{},"columns":[20]}]}

Query for the opposite difference:

Difference(Row(stargazer=2), Row(stargazer=1))
{"attrs":{},"columns":[30]}

Xor

Spec:

Xor(<ROW_CALL>, [ROW_CALL ...])

Description:

Xor performs a logical XOR on the results of each ROW_CALL query passed to it.

Result Type: object with attrs and columns

attrs will always be empty

Examples:

Query columns with a bit set in exactly one of two rows (repositories that are starred by only one of two users):

Xor(Row(stargazer=2), Row(stargazer=1))
{"results":[{"attrs":{},"columns":[10,20,30]}]}

Count

Spec:

Count(<ROW_CALL>)

Description:

Returns the number of set bits in the ROW_CALL passed in.

Result Type: int

Examples:

Query the number of bits set in a row (the number of repositories a user has starred):

Count(Row(stargazer=1))
{"results":[1]}

TopN

Spec:

TopN(<FIELD>, [ROW_CALL], [n=UINT],
     [attrName=<ATTR_NAME>, attrValues=<[]ATTR_VALUE>])

Description:

Return the id and count of the top n rows (by count of bits) in the field. The attrName and attrValues arguments work together to only return rows which have the attribute specified by attrName with one of the values specified in attrValues.

Result Type: array of key/count objects

Caveats:

See field creation for more information about the cache.

Examples:

Basic TopN query:

TopN(stargazer)
{"results":[[{"id":1240,"count":102},{"id":4734,"count":100},{"id":12709,"count":93},...]]}

Limit the number of results:

TopN(stargazer, n=2)
{"results":[[{"id":1240,"count":102},{"id":4734,"count":100}]]}

Filter based on an existing row:

TopN(stargazer, Row(language=1), n=2)
{"results":[[{"id":1240,"count":35},{"id":7508,"count":32}]]}

Filter based on attributes:

TopN(stargazer, n=2, attrName=active, attrValues=[true])
{"results":[[{"id":10,"count":1},{"id":13,"count":1}]]}

Range Queries

Spec:

Range(<FIELD>=<ROW>, <TIMESTAMP>, <TIMESTAMP>)

Description:

Similar to Row, but only returns bits which were set with timestamps between the given start (first) and end (second) timestamps.

Result Type: object with attrs and bits

Examples:

Query all columns with a bit set in row 1 of a field (repositories that a user has starred), within a date range:

Range(stargazer=1, 2010-01-01T00:00, 2017-03-02T03:00)
{{"attrs":{},"columns":[10]}

This example assumes timestamps have been set on some bits.

Range (BSI)

Spec:

Range([<COMPARISON_VALUE> <COMPARISON_OPERATOR>] <FIELD> <COMPARISON_OPERATOR> <COMPARISON_VALUE>)

Description:

The Range query is overloaded to work on integer values as well as timestamp values. Returns bits that are true for the comparison operator.

Result Type: object with attrs and columns

Examples:

In our source data, commitactivity was counted over the last year. The following greater-than Range query returns all columns with a field value greater than 100 (repositories having more than 100 commits):

Range(commitactivity > 100)
{{"attrs":{},"columns":[10]}

BSI range queries support the following operators:

Operator Name Value
> greater-than, GT integer
< less-than, LT integer
<= less-than-or-equal-to, LTE integer
>= greater-than-or-equal-to, GTE integer
== equal-to, EQ integer
!= not-equal-to, NEQ integer or null

<, and <= can be chained together to represent a bounded interval. For example:

Range(50 < commitactivity < 150)
{{"attrs":{},"columns":[10]}

As of Pilosa 1.0, the “between” syntax Range(frame=stats, commitactivity >< [50, 150]) is no longer supported.

Min

Spec:

Min([ROW_CALL], field=<FIELD>)

Description:

Returns the minimum value of all BSI integer values in this field. If the optional Row call is supplied, only columns with set bits are considered, otherwise all columns are considered.

Result Type: object with the min and count of columns containing the min value.

Examples:

Query the minimum value of a field (minimum size of all repositories):

Min(field="diskusage")
{"value":4,"count":2}

Max

Spec:

Max([ROW_CALL], field=<FIELD>)

Description:

Returns the maximum value of all BSI integer values in this field. If the optional Row call is supplied, only columns with set bits are considered, otherwise all columns are considered.

Result Type: object with the max and count of columns containing the max value.

Examples:

Query the maximum value of a field (maximum size of all repositories):

Max(field="diskusage")
{"value":88,"count":13}

Sum

Spec:

Sum([ROW_CALL], field=<FIELD>)

Description:

Returns the count and computed sum of all BSI integer values in the field. If the optional Row call is supplied, columns with set bits are summed, otherwise the sum is across all columns.

Result Type: object with the computed sum and count of the values in the integer field.

Examples:

Query the size of all repositories.

Sum(field="diskusage")
{"value":10,"count":3}

View markdown source on Github. Last updated 4 months ago.