Hive is an open-source data warehouse system that was initially developed by Facebook for querying and analyzing datasets and is now maintained under the Apache Software Foundation.

Hive is built on top of Hadoop and serves as its data warehouse framework for querying and analyzing data stored in HDFS.

Hive is useful for operations such as data encapsulation, ad-hoc queries, and analysis of huge datasets. Its design reflects its targeted use as a system for managing and querying structured data.
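As a rough illustration of an ad-hoc query over structured data, the sketch below runs a simple aggregate in Python. It uses the standard-library `sqlite3` module purely as a local stand-in, since no Hive cluster is assumed here; the `page_views` table, its columns, and the sample rows are hypothetical. The `SELECT ... GROUP BY` statement itself is written in a form that HQL also accepts.

```python
import sqlite3

# In-memory SQLite database used only as a local stand-in for a Hive table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_views (user_id INTEGER, country TEXT, duration_sec INTEGER)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [(1, "IN", 30), (2, "US", 45), (3, "IN", 60), (4, "DE", 15)],
)

# An ad-hoc aggregate: views and average duration per country.
AD_HOC_QUERY = """
    SELECT country, COUNT(*) AS views, AVG(duration_sec) AS avg_duration
    FROM page_views
    GROUP BY country
    ORDER BY country
"""

results = conn.execute(AD_HOC_QUERY).fetchall()
for country, views, avg_duration in results:
    print(country, views, avg_duration)
```

In Hive the same kind of statement would be compiled into jobs that scan files in HDFS, so it suits batch analysis rather than low-latency lookups.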

HBase Vs Hive

Feature             | HBase                         | Hive
Database model      | Wide-column store             | Relational DBMS
Data schema         | Schema-free                   | With schema
SQL support         | No                            | Yes, through HQL (Hive Query Language)
Partition methods   | Sharding                      | Sharding
Consistency level   | Immediate consistency         | Eventual consistency
Secondary indexes   | No                            | Yes
Replication methods | Selectable replication factor | Selectable replication factor
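The difference between a schema-free wide-column store and a table with a declared schema can be sketched in plain Python. This is only a toy model under stated assumptions: the row keys, the `family:qualifier` column names, `HIVE_COLUMNS`, and `read_with_schema` are all hypothetical, and filling missing fields with `None` is a simplification of how a schema is applied when data is read.

```python
# Hypothetical HBase-style table: row key -> {"family:qualifier": value}.
# Rows are free to carry different columns; absent cells are simply not stored.
hbase_style = {
    "user#1": {"info:name": "Asha", "info:email": "asha@example.com"},
    "user#2": {"info:name": "Ben", "stats:logins": 7},
}

# Hypothetical Hive-style table: the column list is declared once, up front.
HIVE_COLUMNS = ("user_id", "name", "logins")


def read_with_schema(fields):
    """Map raw fields onto the declared columns; missing trailing fields become None."""
    padded = list(fields) + [None] * (len(HIVE_COLUMNS) - len(fields))
    return dict(zip(HIVE_COLUMNS, padded))


hive_rows = [read_with_schema(fields) for fields in [(1, "Asha", 3), (2, "Ben")]]

# Each wide-column row reports only the columns it actually holds.
columns_per_row = {key: sorted(cells) for key, cells in hbase_style.items()}

print(columns_per_row)
print(hive_rows)
```

Every row in `hive_rows` ends up with the same three columns, while the two rows in `hbase_style` hold different column sets.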


When comparing HBase with traditional relational databases (RDBMS), three key areas have to be taken into consideration: data model, data storage, and data diversity.

HBase                                                    | RDBMS
Schema-less                                              | Fixed schema
Column-oriented data store                               | Row-oriented data store
Designed to store denormalized data                      | Designed to store normalized data
Wide, sparsely populated tables                          | Thin tables
Supports automatic partitioning                          | No built-in support for partitioning
Well suited for OLAP systems                             | Well suited for OLTP systems
Reads only the relevant data                             | Retrieves one row at a time, so it can read unnecessary data when only part of a row is required
Stores and processes structured and semi-structured data | Stores and processes structured data
Enables aggregation over many rows and columns           | Aggregation is an expensive operation
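The contrast above between reading only the relevant data and retrieving whole rows can be sketched with two in-memory layouts. This is a minimal illustration, not how either system stores data on disk: the sample records and the helper names `total_from_rows` and `total_from_columns` are hypothetical, and "fields read" is just a count of values touched.

```python
# Row-oriented layout: all fields of a record are stored together.
row_store = [
    {"id": 1, "name": "Asha", "city": "Pune", "amount": 120},
    {"id": 2, "name": "Ben", "city": "Oslo", "amount": 80},
    {"id": 3, "name": "Chen", "city": "Lima", "amount": 200},
]

# Column-oriented layout: all values of a column are stored together.
column_store = {
    "id": [1, 2, 3],
    "name": ["Asha", "Ben", "Chen"],
    "city": ["Pune", "Oslo", "Lima"],
    "amount": [120, 80, 200],
}


def total_from_rows(rows):
    """Sum 'amount' by fetching each full row; returns (total, fields_read)."""
    total = 0
    fields_read = 0
    for record in rows:
        fields_read += len(record)  # the whole row is fetched, not just 'amount'
        total += record["amount"]
    return total, fields_read


def total_from_columns(columns):
    """Sum 'amount' by reading only that column; returns (total, fields_read)."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


print(total_from_rows(row_store))
print(total_from_columns(column_store))
```

Both functions return the same total, but the row-oriented version touches every field of every record, while the column-oriented version touches only the three `amount` values.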