PDK

The Pilosa Dev Kit contains executables, examples, and Go libraries to help you use Pilosa effectively.

Examples and Executables

Running pdk -h will print the most up-to-date list of the tools and examples that PDK provides. We'll cover a few of the more important ones here.

Kafka

pdk kafka reads either JSON- or Avro-encoded records from Kafka (using the Confluent Schema Registry in the case of Avro) and indexes them in Pilosa. Each record from Kafka is assigned a Pilosa column, and each value in a record is mapped to a row or field. Pilosa field names are built from the "path" of keys followed through the record to reach that value. For example:

{
  "name": "jill",
  "favorite_foods": ["corn chips", "chipotle dip"],
  "location": {
    "city": "Austin",
    "state": "Texas",
    "latitude": 3754,
    "longitude": 4526
  },
  "active": true,
  "age": 27
}

This JSON object would result in the following Pilosa schema:

Field           Type    Min  Max         Size
name            ranked                   100000
favorite_foods  ranked                   100000
default         ranked                   100000
age             int     0    2147483647
location        ranked                   1000
latitude        int     0    2147483647
longitude       int     0    2147483647
location-city   ranked                   100000
location-state  ranked                   100000

All set fields are created as ranked fields by default, with the cache sizes listed above. Integer fields are created with a minimum of zero and a fixed maximum of 2147483647. Field names are a dash-separated concatenation of the keys along the path to a value, as you can see with fields like location-city.

Most of the options to pdk kafka are self-explanatory (Kafka hosts, Pilosa hosts, Kafka topics, Kafka group, etc.), but a few give some control over how data is indexed and over ingestion performance.

Library

For now, the Godocs have the most up-to-date library documentation.
