Apache HBase Definition And Introduction
HBase is an open-source, distributed, versioned, non-relational database modeled after Google’s Bigtable and written in Java. It is developed as part of the Apache Software Foundation’s Apache Hadoop project and runs on top of HDFS or Alluxio, providing Bigtable-like capabilities for Hadoop.
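To make "versioned" and "non-relational" more concrete, here is a minimal in-memory sketch of a Bigtable-style data model, in which a value is addressed by row key, column, and timestamp. It is a toy illustration rather than the HBase API: the class name `VersionedTable`, its methods, and the `max_versions` parameter are all hypothetical names chosen for this example.

```python
from collections import defaultdict


class VersionedTable:
    """Toy model of a Bigtable-style table (hypothetical, not the HBase API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, timestamp):
        """Store a new version of a cell, keeping only the newest max_versions."""
        cells = self._rows[row][column]
        # A write with an existing timestamp replaces that version.
        cells[:] = [cell for cell in cells if cell[0] != timestamp]
        cells.append((timestamp, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)
        del cells[self.max_versions:]

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before timestamp, or None if absent."""
        cells = self._rows.get(row, {}).get(column, [])
        for ts, value in cells:
            if timestamp is None or ts <= timestamp:
                return value
        return None
```

For example, after writing "Ann" at timestamp 1 and "Anna" at timestamp 2 to the same cell, a plain `get` returns "Anna", while a `get` with `timestamp=1` still returns "Ann" as long as that version has not been evicted.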
Apache HBase is designed for massive scalability, so you can store very large amounts of data in a single platform and keep up with growing demand for serving that data. HBase features compression, in-memory operation, and Bloom filters on a per-column basis, as outlined in the original Bigtable paper.
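A Bloom filter is a compact structure that can say "definitely not present" or "possibly present" for a key, which lets a store skip reading data that cannot contain the requested row. The sketch below shows the general technique only; it is not HBase's implementation, and the class name, the default sizes, and the SHA-256 based hashing are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch (illustrative, not HBase's implementation)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        """Set the bits for this key."""
        for pos in self._positions(key):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        """False means the key was never added; True means it may have been."""
        return all(
            self._bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

Every added key always reports `True`, so there are no false negatives. A key that was never added usually reports `False`, but can occasionally report `True`; that false-positive rate is the price of the small memory footprint.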
HBase is designed to support high table-update rates and to scale out horizontally across distributed compute clusters. This focus on scale enables it to support very large database tables. Tables in HBase can serve as the input and output for MapReduce jobs that run in Hadoop, and can be accessed through the Java API as well as through REST, Avro, or Thrift gateway APIs.
HBase is well known for providing strong data consistency on reads and writes, which distinguishes it from many other NoSQL databases. Much like Hadoop, an important aspect of the HBase architecture is the use of master nodes to manage region servers, which distribute and process parts of data tables.
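The idea of region servers each handling part of a table can be sketched as a lookup from row key to server, assuming the table is split into contiguous, sorted row-key ranges. This is a simplified illustration, not how an HBase cluster tracks its regions internally; `RegionMap`, `locate`, and the server names are hypothetical.

```python
import bisect


class RegionMap:
    """Toy row-key-to-server lookup over sorted key ranges (hypothetical)."""

    def __init__(self, split_keys, servers):
        # N sorted split keys divide the key space into N + 1 regions.
        if list(split_keys) != sorted(split_keys):
            raise ValueError("split_keys must be sorted")
        if len(servers) != len(split_keys) + 1:
            raise ValueError("need exactly one server per region")
        self._split_keys = list(split_keys)
        self._servers = list(servers)

    def locate(self, row_key):
        """Return the server for the region whose range contains row_key.

        Each region covers [start_key, end_key): the start key is inclusive,
        so a row equal to a split key belongs to the region that starts there.
        """
        index = bisect.bisect_right(self._split_keys, row_key)
        return self._servers[index]
```

With split keys `["g", "m"]` and servers `["rs1", "rs2", "rs3"]`, the row `"apple"` maps to `rs1`, `"kiwi"` to `rs2`, and `"zebra"` to `rs3`. Adding a split key (and a server) divides one range in two, which is the intuition behind scaling a table out across more machines.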
HBase is not a direct replacement for a classic SQL database. The Apache Phoenix project provides a SQL layer for HBase and also a JDBC driver that can be integrated with various analytics and business intelligence applications.
HBase now serves many data-driven websites, although Facebook’s Messaging Platform recently migrated from HBase to MyRocks. Unlike traditional relational databases, HBase does not support SQL scripting; instead, the equivalent logic is written in Java, in a style similar to a MapReduce application.
HBase is part of a long list of Apache Hadoop add-ons, which includes tools such as Hive, Pig, and ZooKeeper. Like Hadoop, HBase is typically programmed using Java, not SQL. As an open-source project, its development is managed by the Apache Software Foundation, and it became a top-level Apache project in 2010.