The main use cases for Spark are iterative machine learning algorithms and interactive analytics. From the ML side: most ML algorithms ru...
9781449327279 - Hadoop Operations - If you’ve been tasked with the job of maintaining large and complex Hadoop clusters, or are about to be, this book is a must. You’ll learn the particulars of Hadoop operations, from planning, installing, and configuring the system to providing ongoing maintenance.
Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
Event-Detection - The DBSCAN algorithm expressed in Map/Reduce logic, implemented with Hadoop and MongoDB, to analyze tweets and photos and create geolocated events.
T. Sandholm and K. Lai. SIGMETRICS '09: Proceedings of the Eleventh International Joint Conference on Measurement and Modeling of Computer Systems, pages 299–310. New York, NY, USA, ACM, (2009)
T. Elsayed, J. Lin, and D. Oard. Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers, pages 265–268. Stroudsburg, PA, USA, Association for Computational Linguistics, (2008)