Katta is a scalable, fault-tolerant, distributed data store for real-time access.
Katta serves large, replicated indices as shards, so it can handle high loads and very large data sets. The indices can be of different types; implementations are currently available for Lucene and Hadoop MapFiles.
* Makes serving large or high load indices easy
* Serves very large Lucene or Hadoop MapFile indices as index shards on many servers
* Replicates shards on different servers for performance and fault tolerance
* Supports pluggable network topologies
* Master fail-over
* Fast, lightweight, easy to integrate
* Plays well with Hadoop clusters
* Apache Version 2 License
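To make the sharding and replication idea concrete, here is a minimal sketch in Python. It is not Katta's actual placement algorithm or API: the function names (`assign_shards`, `reassign_after_failure`), the round-robin strategy, and the node and shard names are all assumptions chosen for illustration. It only shows the general technique of placing each shard on several distinct servers and re-replicating when one server fails.

```python
def assign_shards(shards, nodes, replication=2):
    """Place each shard on `replication` distinct nodes, round-robin.

    Hypothetical helper for illustration; not part of Katta.
    """
    if replication > len(nodes):
        raise ValueError("replication level exceeds the number of nodes")
    placement = {}
    for i, shard in enumerate(shards):
        # Start at a different node for each shard so load spreads out.
        placement[shard] = [
            nodes[(i + r) % len(nodes)] for r in range(replication)
        ]
    return placement


def reassign_after_failure(placement, failed, nodes):
    """Restore the replica count of shards that lived on a failed node.

    Hypothetical helper: each lost replica goes to the least-loaded
    surviving node that does not already hold that shard.
    """
    survivors = [n for n in nodes if n != failed]
    # Count how many shard replicas each surviving node currently holds.
    load = {n: 0 for n in survivors}
    for replicas in placement.values():
        for node in replicas:
            if node in load:
                load[node] += 1

    new_placement = {}
    for shard, replicas in placement.items():
        kept = [n for n in replicas if n != failed]
        if len(kept) < len(replicas):
            candidates = [n for n in survivors if n not in kept]
            if candidates:
                target = min(candidates, key=lambda n: load[n])
                kept.append(target)
                load[target] += 1
        new_placement[shard] = kept
    return new_placement


shards = ["shard-0", "shard-1", "shard-2", "shard-3"]
nodes = ["node-a", "node-b", "node-c"]

placement = assign_shards(shards, nodes, replication=2)
recovered = reassign_after_failure(placement, "node-b", nodes)
for shard in shards:
    print(shard, placement[shard], "->", recovered[shard])
```

Because every shard lives on more than one server, a query for that shard can be answered by any of its replicas, which is where both the load distribution and the fault tolerance in the list above come from.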
Introduction
On several occasions while developing database-driven web applications, I've been approached by clients who want Google-style search implemented at the last minute of the development cycle. Usually this leads to a canned script that crawls the website, or a hacked-up search function that uses the database but returns either too many results or none at all. On top of that, the queries it runs are too numerous or too slow.
Until now, most developers have been forced to use relational databases to power search, install extra component packages, or seek out other non-PHP solutions. The problem with relying on a relational database feature such as MySQL's full-text indexing is that scalability problems crop up as your search criteria become more complicated.
One of the features that sets the Zend Framework apart from other frameworks is the inclusion of a decent search module. Zend_Search_Lucene is a PHP port of the Apache Lucene project, a full-text search engine framework. Zend_Search_Lucene promises a simple way to add search functionality to an application without requiring additional PHP extensions or even a database.
Zend_Search_Lucene overcomes the usual limitations of relational databases with features such as fast indexing, ranked result sets, a powerful but simple query syntax, and the ability to index multiple fields. Better still, a Zend_Search_Lucene index can live happily alongside your relational database, providing fast search without requiring you to store all of your data twice. In this tutorial, I'll show you how to use Zend_Search_Lucene to index and search some RSS feeds.