Insanely Fast Queries on Really Big Data

Pilosa is an open source, distributed bitmap index that dramatically accelerates queries across multiple, massive datasets.

Install Pilosa
Accelerate your queries.

Over the past two decades, data volume has grown exponentially, yet query time has remained flat. Scientists and engineers should be able to ask questions of data in real time, no matter how large the job.

Pilosa abstracts the index from data storage and optimizes it for massive scale.

Rapid Accessibility

Pilosa makes your data instantly accessible through an intuitive query language. No longer is your data locked in a data lake or fragmented across multiple data sources.

Distributed

Pilosa is horizontally scalable and automatically shards your data. Queries are executed in parallel across all Pilosa instances.

Algorithms

Pilosa's structure allows for easy algorithmic plugins, paving the way for faster data science at scale without sampling.

High Cardinality

Pilosa is built for massive volume. It indexes up to 2^64 objects and is practically unbounded in the number of attributes.

Segmentation

Pilosa can quickly filter giant datasets using arbitrarily complex nested boolean operations.
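
To make the idea of nested boolean filtering concrete, here is a minimal Go sketch of how set operations compose over bitmaps. It is an illustration only, not Pilosa's query language or storage format: the `Bitmap` type, the helper functions, and the attribute names are all hypothetical, and plain `uint64` words stand in for whatever representation a real index would use.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Bitmap is a hypothetical uncompressed bitmap: bit i is set when
// object (column) i has the attribute this bitmap represents.
type Bitmap []uint64

// combine applies op word by word. Both bitmaps are assumed to have
// the same length.
func combine(a, b Bitmap, op func(x, y uint64) uint64) Bitmap {
	out := make(Bitmap, len(a))
	for i := range a {
		out[i] = op(a[i], b[i])
	}
	return out
}

// Union returns the objects present in a OR b.
func Union(a, b Bitmap) Bitmap {
	return combine(a, b, func(x, y uint64) uint64 { return x | y })
}

// Intersect returns the objects present in a AND b.
func Intersect(a, b Bitmap) Bitmap {
	return combine(a, b, func(x, y uint64) uint64 { return x & y })
}

// Difference returns the objects present in a AND NOT b.
func Difference(a, b Bitmap) Bitmap {
	return combine(a, b, func(x, y uint64) uint64 { return x &^ y })
}

// Count returns the number of objects in the bitmap.
func (b Bitmap) Count() int {
	n := 0
	for _, w := range b {
		n += bits.OnesCount64(w)
	}
	return n
}

func main() {
	// Hypothetical attributes over objects 0..7.
	attendedGame := Bitmap{0x0F} // objects 0, 1, 2, 3
	boughtMerch := Bitmap{0x3C}  // objects 2, 3, 4, 5
	optedOut := Bitmap{0x55}     // objects 0, 2, 4, 6

	// (attendedGame OR boughtMerch) AND NOT optedOut
	segment := Difference(Union(attendedGame, boughtMerch), optedOut)
	fmt.Println(segment.Count()) // objects 1, 3, 5 remain: prints 3
}
```

Because each operation returns another bitmap, expressions can be nested to any depth, and each step is a linear pass of word-level bitwise instructions.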

Use Cases

See how Pilosa solves existing problems by executing ad hoc queries in milliseconds within high-cardinality datasets and across multiple datastores.

Chemical Similarity and the Tanimoto Algorithm

Bioinformatics

Chemical similarity is essential to pharmaceutical development. Running Tanimoto algorithms over Pilosa clusters allows researchers to conduct exhaustive searches of existing structures to identify target chemicals, thereby accelerating drug development.
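
For readers unfamiliar with it, the Tanimoto coefficient of two sets is the size of their intersection divided by the size of their union. When molecules are encoded as bit fingerprints, that reduces to two population counts. The Go sketch below shows the calculation under that assumption; the `Fingerprint` type and the sample values are hypothetical and do not reflect how Pilosa stores or queries chemical data.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Fingerprint is a hypothetical chemical fingerprint stored as a bitmap:
// bit i is set when structural feature i is present in the molecule.
type Fingerprint []uint64

// Tanimoto returns |a AND b| / |a OR b| for two fingerprints, which are
// assumed to have the same length. If neither fingerprint has any bit
// set, the similarity is reported as 0 (a convention chosen for this sketch).
func Tanimoto(a, b Fingerprint) float64 {
	var inter, union int
	for i := range a {
		inter += bits.OnesCount64(a[i] & b[i])
		union += bits.OnesCount64(a[i] | b[i])
	}
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

func main() {
	query := Fingerprint{0xF}     // features 0, 1, 2, 3
	candidate := Fingerprint{0x6} // features 1, 2

	// 2 shared features out of 4 distinct features: prints 0.5
	fmt.Println(Tanimoto(query, candidate))
}
```

An exhaustive search would compute this score for the query against every stored fingerprint and keep those above a chosen threshold.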

Taming Transportation Data

Smart Cities

Transportation systems are vital to economic networks but produce massive amounts of data that are difficult to analyze. Using New York City taxi data, we harnessed Pilosa's ability to work across datastores while supporting granular attributes.

Understanding Fans at Umbel

Audience Segmentation

Umbel is where Pilosa’s journey began. See how the Umbel platform uses Pilosa to create highly specific customer segments, allowing clients to personalize their messaging and increase revenue with data-driven, targeted campaigns.

Monitoring Network Traffic

Network Security

Modern network attacks require an increasingly complex infrastructure of intrusion prevention, creating vast datastores that continue to grow. Layering Pilosa atop existing security solutions allows us to analyze high-volume network data and even predict network intrusions.

Say hello at GopherCon!

Pilosa is written in Go, so of course we will be at GopherCon.

More events

Get even more Pilosa.

Stay updated on our progress and upcoming events.

Our story

Pilosa was founded in 2017 with a commitment to building community-driven, open source software that unlocks the full power of data science. The software grew out of Umbel, a data management platform for sports & entertainment companies, where the engineering team was challenged to create a scalable way to run ad hoc queries to group and sort massive volumes of data with efficiency and reliability.

More about us