Spinn3r is a web service that provides raw access to posts, articles, tweets, status updates, and similar content as it is published, in real or near-real time, so you can focus on building your application, mashup, or search engine. We find the sources, index their content, and take care of the heavy lifting involved in delivering large amounts of relevant data.
Peregrine is a map reduce framework designed for running iterative jobs across partitions of data. It is built to execute map reduce jobs quickly by supporting a number of optimizations and features not present in other map reduce frameworks.
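To make "iterative jobs across partitions" concrete, here is a minimal, single-process sketch in plain Python. It does not use or imitate Peregrine's actual API; the names `partition`, `map_reduce`, `rank_mapper`, `rank_reducer`, and the tiny `LINKS` graph are all hypothetical and chosen only for illustration. The point it shows is that each pass produces partitioned output in the same shape as its input, so the result of one map reduce pass can feed directly into the next.

```python
from collections import defaultdict

def partition(records, num_partitions):
    """Split (key, value) records into partitions by integer key."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in records:
        parts[key % num_partitions].append((key, value))
    return parts

def map_reduce(partitions, mapper, reducer):
    """Run one map/shuffle/reduce pass; returns partitions in the input's shape."""
    n = len(partitions)
    buckets = [defaultdict(list) for _ in range(n)]
    # Map + shuffle: route every emitted pair to the partition that owns its key.
    for part in partitions:
        for key, value in part:
            for out_key, out_value in mapper(key, value):
                buckets[out_key % n][out_key].append(out_value)
    # Reduce: collapse the values collected for each key within its partition.
    return [[(k, reducer(k, vs)) for k, vs in sorted(b.items())] for b in buckets]

# Hypothetical toy graph (node -> outgoing links), used only for this sketch.
LINKS = {0: [1, 2], 1: [2], 2: [0]}
DAMPING = 0.85

def rank_mapper(node, rank):
    yield node, 0.0  # keep the node present even if nothing links to it
    for target in LINKS[node]:
        yield target, rank / len(LINKS[node])

def rank_reducer(node, contributions):
    return (1 - DAMPING) / len(LINKS) + DAMPING * sum(contributions)

def run(iterations=20, num_partitions=2):
    parts = partition([(n, 1 / len(LINKS)) for n in LINKS], num_partitions)
    for _ in range(iterations):
        # The output of one pass is the input of the next.
        parts = map_reduce(parts, rank_mapper, rank_reducer)
    return {node: rank for part in parts for node, rank in part}

ranks = run()
print(ranks)
```

The job here is a PageRank-style ranking, picked only because it naturally needs many passes over the same data. A real framework would distribute the partitions across machines and avoid re-shuffling data unnecessarily between iterations; this sketch keeps everything in memory.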
At Qubole, we are developing the next-generation cloud data platform for analyzing and processing data sets. Having conceived and built the "big data" platform at Facebook, we are applying what we learned from that experience to provide similar infrastructure and tools in the cloud. Our service is aimed at data analysts, data scientists, and ETL engineers. We are looking for more companies to join our Early Access Program.
PigUnit is a simple xUnit framework that enables you to easily test your Pig scripts. With PigUnit you can perform unit testing, regression testing, and rapid prototyping. No cluster setup is required if you run Pig in local mode.