Jul 15 2008

Apache Hadoop & HBase

Category: Technology ssmax @ 13:08:51

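Once this round of work is done, this is the next thing I need to look into... a distributed file system.

According to the description, it is modeled on Google's GFS and BigTable, and it has basically every feature those systems offer. Are we really supposed to write one of these ourselves???...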

What Is Hadoop?

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing, including:

  • Hadoop Core, our flagship sub-project, provides a distributed filesystem (HDFS) and support for the MapReduce distributed computing metaphor.
  • HBase builds on Hadoop Core to provide a scalable, distributed database.
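
To make the "MapReduce distributed computing metaphor" a bit more concrete, here is a toy, single-process word count in Python. It is only a sketch of the idea and does not use Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. In a real cluster the map and reduce steps would run on many machines, and the framework would handle the grouping step in between.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count(["a b a", "b c"])` counts each word across both inputs. Because each document is mapped independently and each key is reduced independently, both phases can be spread across machines.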

HBase is the Hadoop database. It's an open-source, distributed, column-oriented store modeled after the Google paper, Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop.

HBase’s goal is the hosting of very large tables (billions of rows X millions of columns) atop clusters of commodity hardware. Try it if your plans for a data store run to big.
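
As a rough illustration of what "column-oriented" and "Bigtable-like" mean here: the Bigtable paper describes its data model as a sparse, sorted map indexed by row key, column key, and timestamp. The Python sketch below mimics that shape in memory. `ToyTable` and its methods are hypothetical names invented for this example, not the HBase client API, and nothing in it is distributed or persistent.

```python
class ToyTable:
    """In-memory stand-in for a (row, column, timestamp) -> value map."""

    def __init__(self):
        # row key -> {column key -> {timestamp -> value}}
        self._rows = {}

    def put(self, row, column, value, timestamp):
        """Store one version of a cell; absent cells take no space (sparse)."""
        self._rows.setdefault(row, {}).setdefault(column, {})[timestamp] = value

    def get(self, row, column):
        """Return the newest version of a cell, or None if it does not exist."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start_row, stop_row):
        """Yield rows in sorted key order with start_row <= key < stop_row."""
        for row in sorted(self._rows):
            if start_row <= row < stop_row:
                latest = {c: v[max(v)] for c, v in self._rows[row].items()}
                yield row, latest
```

Rows may carry entirely different sets of columns, which is why "millions of columns" is feasible: only cells that were actually written are stored. Keeping rows sorted by key is what makes range scans cheap in the real systems; this toy version simply re-sorts on every scan.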
