Katta - Lucene & more in the cloud.

Katta is a scalable, fault-tolerant, distributed data store for real-time access.
Katta splits large indices into shards and serves replicated copies of those shards from many servers, so it can handle high loads and very large data sets. Indices can be of different types; implementations are currently available for Lucene and Hadoop MapFiles.

  • Makes serving large or high-load indices easy
  • Serves very large Lucene or Hadoop MapFile indices as index shards on many servers
  • Replicates shards on different servers for performance and fault tolerance
  • Supports pluggable network topologies
  • Master fail-over
  • Fast, lightweight, easy to integrate
  • Plays well with Hadoop clusters
  • Apache License, Version 2.0
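
To make the sharding and replication idea above concrete, here is a minimal sketch of how shards might be spread over servers so that every shard lives on more than one node. It is an illustration only: the function names (assign_shards, nodes_for), the round-robin policy, and the shard and node names are assumptions made for this example and are not Katta's API or its actual distribution policy.

```python
from collections import defaultdict


def assign_shards(shards, nodes, replication):
    """Place each shard on `replication` distinct nodes, round-robin.

    Hypothetical helper for illustration; not part of Katta.
    """
    if replication > len(nodes):
        raise ValueError("replication level exceeds number of nodes")
    placement = defaultdict(list)  # node name -> shards it serves
    for i, shard in enumerate(shards):
        # Consecutive offsets modulo len(nodes) are distinct as long as
        # replication <= len(nodes), so no node holds a shard twice.
        for r in range(replication):
            node = nodes[(i + r) % len(nodes)]
            placement[node].append(shard)
    return dict(placement)


def nodes_for(placement, shard):
    """Return the sorted list of nodes that hold a copy of `shard`."""
    return sorted(n for n, held in placement.items() if shard in held)


placement = assign_shards(
    shards=["shard-0", "shard-1", "shard-2", "shard-3"],
    nodes=["node-a", "node-b", "node-c"],
    replication=2,
)
for node in sorted(placement):
    print(node, placement[node])
```

With a replication level of 2, any single node can go down and every shard is still served by another node, which is the property the replication bullet describes.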

For more information, read our About Katta page, try the Katta Quick Start, browse the Katta Documentation, or subscribe to the Katta Mailing List (archive).
There is also a #katta channel on irc.freenode.net. Please use #katta in your Twitter messages. :)

Ted Dunning:
“We are using a variant of Katta as the basis of some of our search at
Deepdyve. Very, very nice. …
Much of the simplicity and reliability of Katta is directly inherited from
the simple programming model and solid implementation of Zookeeper.”