HBase: Bigtable-like structured storage for Hadoop HDFS
Just as Google's Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop Core. Data is organized into tables, rows, and columns. An iterator-like interface is available for scanning through a row range, and a column value can also be retrieved for a specific key. Any particular column may have multiple versions for the same row key.
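The data model described above (rows holding columns, range scans over row keys, several versions per cell) can be illustrated with a small in-memory sketch. This is not the HBase client API: the class `ToyTable`, its methods, the `max_versions` parameter, and the explicit integer timestamps are all hypothetical names chosen for illustration, and the half-open scan range is an assumption of the sketch.

```python
import bisect
from collections import defaultdict


class ToyTable:
    """In-memory toy of a Bigtable-like data model (hypothetical, not HBase's API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # row key -> column -> list of (timestamp, value), newest first
        self._rows = defaultdict(lambda: defaultdict(list))
        # row keys kept in sorted order so range scans are cheap
        self._sorted_keys = []

    def put(self, row, column, value, timestamp):
        """Store a new version of one cell, keeping at most max_versions."""
        if row not in self._rows:
            bisect.insort(self._sorted_keys, row)
        versions = self._rows[row][column]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]

    def get(self, row, column, versions=1):
        """Return up to `versions` values for one cell, newest first."""
        if row not in self._rows:
            return []
        return [value for _, value in self._rows[row].get(column, [])[:versions]]

    def scan(self, start_row, stop_row):
        """Yield (row, {column: newest value}) for start_row <= row < stop_row."""
        lo = bisect.bisect_left(self._sorted_keys, start_row)
        hi = bisect.bisect_left(self._sorted_keys, stop_row)
        for row in self._sorted_keys[lo:hi]:
            cells = self._rows[row]
            yield row, {col: vers[0][1] for col, vers in cells.items()}


if __name__ == "__main__":
    table = ToyTable()
    table.put("row1", "info:title", "old title", timestamp=1)
    table.put("row1", "info:title", "new title", timestamp=2)
    table.put("row2", "info:title", "other", timestamp=1)
    table.put("row3", "info:title", "outside the range", timestamp=1)

    # Two versions of the same cell, newest first.
    print(table.get("row1", "info:title", versions=2))
    # Iterator-style scan over the row range [row1, row3).
    for row, columns in table.scan("row1", "row3"):
        print(row, columns)
```

Keeping row keys sorted is what makes the range scan a simple slice; a real distributed store partitions the sorted key space across servers, which this sketch does not attempt.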
Apache's Hadoop project aims to solve these problems by providing a framework for running large data processing applications on clusters of commodity hardware. Combined with Amazon EC2 for running the application, and Amazon S3 for storing the data, we can run large jobs very economically. This paper describes how to use Amazon Web Services and Hadoop to run an ad hoc analysis on a large collection of web access logs that otherwise would have cost a prohibitive amount in either time or money.
M. Becker, H. Mewes, A. Hotho, D. Dimitrov, F. Lemmerich, and M. Strohmaier. International Conference Companion on World Wide Web, pp. 17--18. Republic and Canton of Geneva, Switzerland, International World Wide Web Conferences Steering Committee, (2016)
J. Dean, and S. Ghemawat. Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6, pp. 137--149. Berkeley, CA, USA, USENIX Association, (2004)
H.-C. Yang, A. Dasdan, R. Hsiao, and D. Parker. SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, pp. 1029--1040. New York, NY, USA, ACM, (2007)
J. Dean, and S. Ghemawat. OSDI'04: Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation, pp. 10--10. Berkeley, CA, USA, USENIX Association, (2004)