Accumulo

Apache Accumulo is a sorted, distributed key/value store and is at the core of Sqrrl Enterprise. It is a robust, scalable, real-time storage and retrieval system for large amounts of structured, semi-structured, and unstructured data. Accumulo's design is inspired by Google’s Bigtable paper.

Originally developed by the NSA beginning in 2008, Accumulo is now an open source software project hosted by the Apache Software Foundation and integrates natively with Apache Hadoop. It is a low-latency, non-relational database that stores its data in the Hadoop Distributed File System (HDFS).

Apache Accumulo has three unique technical advantages over other comparable non-relational or NoSQL database solutions:

  • Security: Fine-grained security controls allow organizations to control access to data at the cell level without degrading performance.
  • Performance: Accumulo is proven to operate and perform at massive scale (tens of petabytes of data) with low administrative overhead. Accumulo also features very fast reads and writes (tens of thousands of operations per second per node) to support interactive queries and high throughput.
  • Flexibility: Accumulo can easily handle multi-structured and sparse datasets without extensive data modeling.
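
To make the ideas of sorted keys and cell-level security concrete, here is a minimal in-memory sketch in Python. It is a toy model, not Accumulo's API: the names `Cell`, `is_visible`, and `scan` are hypothetical, the sample data is made up, and the label syntax is assumed to be a simple boolean expression built from labels, `&`, `|`, and parentheses.

```python
import re
from typing import NamedTuple


class Cell(NamedTuple):
    """One key/value pair in the toy model (hypothetical, not Accumulo's types)."""
    row: str
    column: str
    visibility: str  # label expression, e.g. "doctor|(hr&pii)"; "" means public
    value: str


def is_visible(expression: str, authorizations: set[str]) -> bool:
    """Evaluate a label expression against the reader's authorizations.

    Assumption: operators are applied left to right, so mixed '&' and '|'
    should be grouped explicitly with parentheses.
    """
    if not expression:
        return True  # an empty label leaves the cell visible to every reader
    tokens = re.findall(r"[A-Za-z0-9_]+|[&|()]", expression)
    pos = 0

    def parse_term() -> bool:
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_expr()
            pos += 1  # skip the closing ")"
            return result
        return token in authorizations

    def parse_expr() -> bool:
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            op = tokens[pos]
            pos += 1
            right = parse_term()  # always parse, so the position advances
            result = (result and right) if op == "&" else (result or right)
        return result

    return parse_expr()


def scan(table: list[Cell], authorizations: set[str]) -> list[Cell]:
    """Return only the cells the reader may see, in sorted (row, column) order."""
    visible = [c for c in table if is_visible(c.visibility, authorizations)]
    return sorted(visible, key=lambda c: (c.row, c.column))


# Made-up sample data: each cell carries its own label.
table = [
    Cell("user002", "profile:name", "", "Bob"),
    Cell("user001", "profile:ssn", "hr&pii", "ssn-placeholder"),
    Cell("user001", "profile:name", "", "Alice"),
    Cell("user001", "medical:notes", "doctor|(hr&pii)", "notes-placeholder"),
]

for cell in scan(table, {"doctor"}):
    print(cell.row, cell.column, cell.value)
```

In this sketch, two readers scanning the same table receive different results depending on their authorizations, and a cell whose label is not satisfied is simply absent from the results. A real deployment would apply this filtering inside the database rather than in client code.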

Sqrrl Enterprise builds on these advantages and offers a number of additional features that make Accumulo easier to use, more powerful in its search and query capabilities, and even more secure.

  • Ease-of-Use: Accumulo is a complex distributed database that can be complicated to set up and maintain. Sqrrl simplifies the use of Accumulo with installation tools, data loading tools, and world-class support from the creators of Accumulo.
  • Search and Query: Accumulo natively has a very low-level API that can be difficult to use for powering interactive apps. Sqrrl improves on this API by adding richer search and query capabilities, including full-text keyword search, document search (i.e., JSON document support), SQL-like queries, and graph search.
  • Security: Accumulo can store a security label inside each individual key/value pair and filter on these labels at interactive speeds. Sqrrl extends this cell-level security capability with a labeling engine that automates the application of security labels, a policy engine that supports Role-Based and Attribute-Based Access Controls, encryption at rest and in motion, and auditing capabilities.