groonga - An open-source fulltext search engine and column store.

1. The characteristics of groonga

1.1. The successor to Senna, an embeddable full text search engine

Groonga is developed as the successor project to Senna, a widely used embeddable full text search engine. Groonga inherits Senna's characteristics: speed, high precision and high flexibility. We started developing groonga to improve on those characteristics.

1.2. Groonga server supports multiple protocols such as HTTP

Senna is a component that adds full text search to an application. Groonga can also run as a server that provides a search service. The groonga server supports HTTP, the memcached binary protocol and gqtp (groonga query transfer protocol). Clients can search with those protocols over a TCP/IP connection. This makes groonga easy to use even on a rental server where you cannot install a library.

Groonga can also be used as a C library, like Senna.
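As an illustration of searching over HTTP, the following minimal sketch builds a request URL for the server's select command. It assumes a groonga server listening on localhost at port 10041; the table name "Site" and the column name "title" are hypothetical and only serve the example. The helper names build_select_url and search are also made up here and are not part of groonga.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(table, query, match_columns=None,
                     host="localhost", port=10041):
    """Build the URL of a full text search request for a groonga HTTP server.

    host, port, table and column names are assumptions for this sketch.
    """
    params = {"table": table, "query": query}
    if match_columns:
        params["match_columns"] = match_columns
    return f"http://{host}:{port}/d/select?{urlencode(params)}"


def search(url):
    """Send the request and decode the JSON response.

    This needs a running server, so it is defined here but not called.
    """
    with urlopen(url) as response:
        return json.load(response)


# "Site" and "title" are hypothetical names.
url = build_select_url("Site", "groonga", match_columns="title")
print(url)
```

Because the interface is plain HTTP, any client that can open a TCP/IP connection can issue the same request without linking against the C library.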

1.3. Fast data update

Senna, the predecessor of groonga, is a full text search engine without its own storage, so it was commonly used with MySQL or PostgreSQL. Tritonn is a custom MyISAM storage engine that uses Senna as its full text search engine. Ludia is an extension module for PostgreSQL that uses Senna as its full text search engine. But those approaches can't fully utilize the performance characteristics of Senna. Senna can update its index without a read lock.

For example, MyISAM acquires a table lock while updating records in many cases. In those cases, the data update by MyISAM becomes the bottleneck, no matter how fast Senna updates its full text search index.

Groonga implements its own storage that doesn't acquire a read lock, in order to realize a search service in which updated data becomes searchable immediately.

1.4. Storage that can be shared among multiple processes and threads

Groonga's storage files can be shared among multiple processes and threads. Sharing doesn't require an explicit lock.

The groonga storage engine, the successor to Tritonn, is implemented as a MySQL pluggable storage engine. Groonga storage files opened by the groonga storage engine can also be shared with a groonga server. For example, you can update your data via SQL and search it via HTTP.

1.5. Fast aggregate query processing such as drilldown

Groonga's storage uses a column-oriented database model, which stores the data of each column separately. A column-oriented database is suitable for fast aggregate query processing such as OLAP.
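To make the difference concrete, here is a toy sketch that stores the same records in a row-oriented and a column-oriented layout and sums one field. It only illustrates the general idea and says nothing about groonga's actual file format; the records and field names are made up for the example.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"title": "groonga intro", "category": "search", "views": 120},
    {"title": "senna history", "category": "search", "views": 45},
    {"title": "mysql tips", "category": "database", "views": 300},
]

# Column-oriented layout: each column is stored as its own array,
# and a record is identified by its position in those arrays.
columns = {
    "title": [row["title"] for row in rows],
    "category": [row["category"] for row in rows],
    "views": [row["views"] for row in rows],
}


def total_views_row_store(rows):
    # Every whole record is visited just to read one field.
    return sum(row["views"] for row in rows)


def total_views_column_store(columns):
    # Only the "views" array is read; the other columns are untouched.
    return sum(columns["views"])


print(total_views_row_store(rows), total_views_column_store(columns))
```

Both functions return the same total, but the column-oriented version reads a single contiguous array, which is why aggregates over a few columns tend to be cheap in this model.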

"Drilldown" is a process that groups full text search results by the values of each specified column and counts the number of records in each group. Groonga does this fast because it uses a column-oriented database.
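The grouping and counting step described above can be sketched as follows. This is a simplified model, not groonga's implementation: the matched record IDs are assumed to come from an earlier full text search, and the "category" column is hypothetical.

```python
from collections import Counter


def drilldown(matched_ids, column):
    """Group matched record IDs by their value in `column` and count each group."""
    return Counter(column[record_id] for record_id in matched_ids)


# One column stored as an array indexed by record ID (hypothetical data).
category = ["search", "search", "database", "search"]

# Pretend these record IDs were returned by a full text search.
matched_ids = [0, 2, 3]

counts = drilldown(matched_ids, category)
print(dict(counts))
```

Only the grouped column is consulted for each matched record, so the cost depends on the number of hits and not on how many other columns the table has.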

1.6. Improved implementation of Senna's inverted index

Groonga's inverted index is an improved version of Senna's inverted index. It is faster and more versatile.

Groonga can also process some complex queries fast by utilizing the inverted index. Those queries are difficult to process with SQL and an RDB. For example, tag search and drilldown can both be processed fast with the inverted index.
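As a rough illustration of why an inverted index suits tag search, the sketch below maps each tag to the set of record IDs that carry it and answers a multi-tag query by intersecting those sets. It is a toy model under simple assumptions (in-memory sets, made-up tags), not a description of groonga's index structure.

```python
from collections import defaultdict


def build_inverted_index(records):
    """Map each tag to the set of record IDs that contain it."""
    index = defaultdict(set)
    for record_id, tags in enumerate(records):
        for tag in tags:
            index[tag].add(record_id)
    return index


def search_all_tags(index, tags):
    """Return the IDs of records that carry every one of the given tags."""
    postings = [index.get(tag, set()) for tag in tags]
    if not postings:
        return set()
    return set.intersection(*postings)


# Each record is a list of tags (hypothetical data).
records = [["search", "c"], ["search", "mysql"], ["mysql", "c"]]
index = build_inverted_index(records)

print(search_all_tags(index, ["search", "c"]))
print(search_all_tags(index, ["mysql"]))
```

The lookup never scans the records themselves; it touches only the posting sets of the requested tags.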

1.8. Auto query cache mechanism

Groonga automatically caches the results of reference (read-only) queries.
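The general idea of a query cache can be sketched as below. This is only an assumption-laden toy: the class name QueryCache, the least-recently-used eviction and the clear-everything-on-update policy are choices made for the example, not groonga's documented behavior.

```python
from collections import OrderedDict


class QueryCache:
    """Toy cache that remembers the results of read-only queries."""

    def __init__(self, run_query, limit=100):
        self._run_query = run_query    # function that actually runs a query
        self._limit = limit            # maximum number of cached results
        self._entries = OrderedDict()  # query string -> cached result
        self.hits = 0

    def select(self, query):
        if query in self._entries:
            # Cache hit: mark the entry as recently used and reuse it.
            self._entries.move_to_end(query)
            self.hits += 1
            return self._entries[query]
        result = self._run_query(query)
        self._entries[query] = result
        if len(self._entries) > self._limit:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return result

    def invalidate(self):
        # Assumed policy: drop every cached result when the data changes.
        self._entries.clear()


calls = []


def slow_search(query):
    # Stand-in for an expensive search; records how often it really runs.
    calls.append(query)
    return f"results for {query}"


cache = QueryCache(slow_search, limit=2)
first = cache.select("groonga")
second = cache.select("groonga")  # served from the cache
print(len(calls), cache.hits)
```

The caller never asks for caching explicitly; repeating the same read-only query is enough to benefit from it.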