HBase
After HBase is successfully installed on top of Hadoop, it provides an interactive shell for running commands. With these commands we can create and manage tables and read and write the data stored in them.
We can interact with HBase in two ways: the interactive shell and the Java API.
In HBase, the interactive shell is used for table operations, table management, and data modeling. With the Java API, we can perform all types of table and data operations from Java code.
The only difference between the two is that the Java API uses Java code to connect to HBase, while shell mode uses shell commands.
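In HBase, shell commands fall into the following categories: general commands, table management commands, data manipulation commands, and cluster replication commands.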
To enter the HBase shell, first run the command below:
hbase shell
Once inside the HBase shell, we can execute all the shell commands described below. With these commands, we can perform all types of table operations in shell mode.
Let us look at each of these commands and its usage with an example.
Syntax: status
This command gives details about the system status, such as the number of servers in the cluster, the active server count, and the average load. You can also pass a parameter depending on how much detail you want. The parameter can be 'summary', 'simple', or 'detailed'; the default is 'summary'.
The commands below show how to pass each parameter to the status command.
hbase(main):001:0> status
hbase(main):002:0> status 'simple'
hbase(main):003:0> status 'summary'
hbase(main):004:0> status 'detailed'
When we run status, it reports the number of live servers, the number of dead servers, and the average load per server, for example: 1 live server, 1 dead server, and an average load of 7.0000.
Syntax: version
Syntax: table_help
This command explains what the table-reference commands are and how to use them.
Syntax: whoami
The "whoami" command returns the current HBase user information from the HBase cluster.
It reports the user name and the groups that user belongs to.
In HBase, a column family can be given a time-to-live (TTL), set in seconds. HBase automatically deletes rows once the expiration time is reached. This attribute applies to all versions of a row, including the current version.
The TTL time encoded in HBase for the row is specified in UTC. This attribute is used with the table management commands.
Cell-level TTLs differ from column family TTLs in two important ways: cell TTLs are expressed in milliseconds instead of seconds, and a cell TTL cannot extend the lifetime of a cell beyond the column family TTL.
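The expiry rule can be sketched as a toy model in plain Ruby. This is not HBase code and calls no HBase API; the function names are made up for illustration, and it assumes epoch-millisecond timestamps, a column family TTL in seconds, and an optional per-cell TTL in milliseconds that can shorten but never extend the family TTL.

```ruby
# Toy model of TTL expiry; plain Ruby, no HBase API involved.
# Assumptions: timestamps are epoch milliseconds, the column family TTL is in
# seconds, and an optional per-cell TTL is in milliseconds.

# Returns the effective lifetime of a cell in milliseconds.
def effective_ttl_ms(family_ttl_s, cell_ttl_ms = nil)
  family_ms = family_ttl_s * 1000
  # A cell TTL can shorten the lifetime but never extend it past the family TTL.
  cell_ttl_ms.nil? ? family_ms : [family_ms, cell_ttl_ms].min
end

# True once the cell is older than its effective lifetime.
def expired?(cell_ts_ms, now_ms, family_ttl_s, cell_ttl_ms = nil)
  now_ms - cell_ts_ms > effective_ttl_ms(family_ttl_s, cell_ttl_ms)
end

written_at = 1_000_000
puts expired?(written_at, written_at + 30_000, 60)           # false
puts expired?(written_at, written_at + 61_000, 60)           # true
puts expired?(written_at, written_at + 30_000, 60, 10_000)   # true
puts expired?(written_at, written_at + 61_000, 60, 600_000)  # true
```

The last call shows that a 600-second cell TTL does not keep the cell alive past the 60-second family TTL.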
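These commands allow programmers to create tables and table schemas with rows and column families.
The table management commands are create, list, describe, disable, disable_all, enable, show_filters, drop, drop_all, is_enabled, alter, and alter_status.
Let us look at how each command is used in HBase with an example.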
Syntax: create <tablename>, <columnfamilyname>
Example:
hbase(main):001:0> create 'education', 'guru99'
0 row(s) in 0.312 seconds
=> Hbase::Table - education
The above example creates a table named 'education' with one column family, 'guru99'. The create command takes the table name followed by a specification for each column family, either a plain name or a dictionary of attributes. We can also pass table-scope attributes to it.
To check whether the table 'education' has been created, we use the "list" command as shown below.
Syntax: list
Syntax: describe <tablename>
hbase(main):010:0> describe 'education'
This command describes the named table.
Syntax: disable <tablename>
hbase(main):011:0> disable 'education'
Here we are disabling the table 'education'.
Syntax: disable_all <"matching regex">
Syntax: enable <tablename>
hbase(main):012:0> enable 'education'
Syntax: show_filters
This command displays all the filters available in HBase, such as ColumnPrefixFilter, TimestampsFilter, PageFilter, and FamilyFilter.
Syntax: drop <tablename>
hbase(main):017:0> drop 'education'
A table must be disabled before it can be dropped, so disable 'education' before running this command.
Syntax: drop_all <"regex">
Syntax: is_enabled 'education'
This command verifies whether the named table is enabled. The "enable" and "is_enabled" commands are easy to confuse: "enable" activates a table, while "is_enabled" only checks whether the table is enabled and changes nothing.
Syntax: alter <tablename>, NAME => <column familyname>, VERSIONS => 5
This command alters the column family schema. The examples below show what it does.
Examples:
In these examples, we perform alter operations on tables and their column families: changing one or more column families, deleting a column family, and changing table-scope attributes.
To keep a maximum of 5 cell versions in the column family 'guru99_1' of the table 'education':
hbase> alter 'education', NAME => 'guru99_1', VERSIONS => 5
To operate on several column families in one command:
hbase> alter 'education', 'guru99_1', {NAME => 'guru99_2', IN_MEMORY => true}, {NAME => 'guru99_3', VERSIONS => 5}
To delete a column family, use one of these commands:
hbase> alter 'education', NAME => 'f1', METHOD => 'delete'
hbase> alter 'education', 'delete' => 'guru99_1'
Syntax: alter <'tablename'>, MAX_FILESIZE => '134217728'
Step 1) You can change table-scope attributes like MAX_FILESIZE, READONLY, MEMSTORE_FLUSHSIZE, DEFERRED_LOG_FLUSH, etc. These go at the end of the command; for example, the command above changes the maximum size of a region to 128 MB.
NOTE: MAX_FILESIZE is a table-scope attribute, and its value is given in bytes.
Step 2) You can also remove a table-scope attribute using the table_att_unset method, as in this command:
hbase> alter 'education', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'
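Because MAX_FILESIZE is given in bytes, a 128 MB limit has to be converted before it goes into the alter command. The short plain-Ruby sketch below does that arithmetic; the helper name mib_to_bytes is made up for illustration, and the script only prints the command text, it does not talk to HBase.

```ruby
# Converts a size in mebibytes to the byte count used for MAX_FILESIZE.
# mib_to_bytes is a hypothetical helper, not part of the HBase shell.
def mib_to_bytes(mib)
  mib * 1024 * 1024
end

bytes = mib_to_bytes(128)
puts bytes  # 134217728

# Build the shell command text; 'education' is the table used in this tutorial.
puts "alter 'education', MAX_FILESIZE => '#{bytes}'"
```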
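Syntax: alter_status 'education'
This command reports the status of an alter operation by showing how many regions of the table have received the updated schema.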
These commands work on the data in a table: putting data into a table, retrieving data from a table, deleting data, and so on.
The data manipulation commands are count, put, get, delete, deleteall, truncate, and scan.
Let us look at how these commands are used with an example.
Syntax: count <'tablename'>, CACHE => 1000
Example:
hbase> count 'guru99', CACHE => 1000
This example counts the rows of the "guru99" table, fetching 1000 rows at a time.
If the rows are large, we can set CACHE to a lower value.
By default, the count scan caches 10 rows at a time and the running count is shown every 1000 rows; INTERVAL changes how often the running count is shown.
hbase> count 'guru99', INTERVAL => 100000
hbase> count 'guru99', INTERVAL => 10, CACHE => 1000
Suppose the table "guru99" has a table reference, say g.
We can then run the count command on the table reference as shown below:
hbase> g.count INTERVAL => 100000
hbase> g.count INTERVAL => 10, CACHE => 1000
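How CACHE and INTERVAL interact can be illustrated with a toy model in plain Ruby. It is only a sketch and uses no HBase API: it assumes CACHE is the number of rows fetched per round trip and INTERVAL is how often, in rows, the running count is reported. The count_rows function and its return format are made up for illustration.

```ruby
# Toy model of a batched row count; plain Ruby, no HBase API involved.
# Assumption: `cache` rows are fetched per round trip and the running count
# is recorded every `interval` rows.
def count_rows(rows, cache:, interval:)
  count = 0
  round_trips = 0
  progress = []
  rows.each_slice(cache) do |batch|
    round_trips += 1
    batch.each do
      count += 1
      progress << count if (count % interval).zero?
    end
  end
  { count: count, round_trips: round_trips, progress: progress }
end

rows = (1..2500).map { |i| "r#{i}" }
result = count_rows(rows, cache: 1000, interval: 1000)
puts result[:count]        # 2500
puts result[:round_trips]  # 3
p result[:progress]        # [1000, 2000]
```

In this model a larger cache means fewer round trips for the same table, while the interval only affects how often progress is reported.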
Syntax: put <'tablename'>, <'rowname'>, <'column name'>, <'value'>
This command stores a cell value at the specified table, row, and column, and optionally at a specified timestamp.
Example:
hbase> put 'guru99', 'r1', 'c1', 'value', 10
This puts 'value' into row r1, column c1 of the table "guru99" with timestamp 10.
Suppose the table "guru99" has a table reference, say g. We can also run the command on the table reference; the table name is then omitted:
hbase> g.put 'r1', 'c1', 'value', 10
To check whether the value was inserted correctly, we use the "scan" command.
Code Snippet: For Practice
create 'guru99', {NAME=>'Edu', VERSIONS=>213423443}
put 'guru99', 'r1', 'Edu:c1', 'value', 10
put 'guru99', 'r1', 'Edu:c1', 'value', 15
put 'guru99', 'r1', 'Edu:c1', 'value', 30
In this code snippet, we create a table "guru99" with the column family "Edu" and then put a value into row r1, column Edu:c1 three times, at timestamps 10, 15, and 30.
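What the three puts in the practice snippet leave behind can be pictured with a toy model of a single versioned cell. This is plain Ruby, not the HBase API: the VersionedCell class is made up for illustration, and it assumes each put adds a (timestamp, value) version and that reads return versions newest-first.

```ruby
# Toy model of one versioned cell; plain Ruby, not the HBase API.
# Assumption: each put adds a (timestamp, value) version and reads return
# versions newest-first, capped by the family's maximum version count.
class VersionedCell
  def initialize(max_versions)
    @max_versions = max_versions
    @versions = {} # timestamp => value
  end

  def put(value, timestamp)
    @versions[timestamp] = value # a put at the same timestamp overwrites
  end

  # Returns up to `versions` [timestamp, value] pairs, newest first.
  def get(versions: 1)
    limit = [versions, @max_versions].min
    @versions.sort_by { |ts, _| -ts }.first(limit)
  end
end

cell = VersionedCell.new(5)
cell.put('value', 10)
cell.put('value', 15)
cell.put('value', 30)
p cell.get               # [[30, "value"]]
p cell.get(versions: 3)  # [[30, "value"], [15, "value"], [10, "value"]]
```

A read that asks for one version sees only the cell written at timestamp 30; the older versions are still stored and appear when more versions are requested.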
Syntax: get <'tablename'>, <'rowname'>, {<Additional parameters>}
Here <Additional parameters> include TIMERANGE, TIMESTAMP, VERSIONS, and FILTERS.
This command returns the contents of a row or cell in the table. The additional parameters narrow the result to a particular row or cell content.
Examples:
hbase> get 'guru99', 'r1', {COLUMN => 'c1'}
For table "guru99", the values of row r1 and column c1 will be displayed using this command.
hbase> get 'guru99', 'r1'
For table "guru99", all values of row r1 will be displayed using this command.
hbase> get 'guru99', 'r1', {TIMERANGE => [ts1, ts2]}
For table "guru99", the values of row r1 with timestamps in the range ts1 to ts2 will be displayed using this command.
hbase> get 'guru99', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
For table "guru99", the values of row r1 and columns c1, c2, and c3 will be displayed using this command.
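The TIMERANGE example above can be illustrated with a toy filter in plain Ruby. It is a sketch only and uses no HBase API; the in_time_range function is made up for illustration, and it assumes the range is half-open, so that ts1 is included and ts2 is excluded.

```ruby
# Toy filter mimicking TIMERANGE => [ts1, ts2]; plain Ruby, not the HBase API.
# Assumption: the range is half-open, so ts1 is included and ts2 is excluded.
def in_time_range(versions, ts1, ts2)
  versions.select { |ts, _| ts >= ts1 && ts < ts2 }
end

versions = { 10 => 'a', 15 => 'b', 30 => 'c' } # timestamp => value
p in_time_range(versions, 10, 30).keys  # [10, 15]
p in_time_range(versions, 10, 31).keys  # [10, 15, 30]
```

Under this assumption, a version written exactly at ts2 is left out, so the upper bound has to be one past the newest timestamp that should be returned.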
Syntax: delete <'tablename'>, <'row name'>, <'column name'>
Example:
hbase(main):020:0> delete 'guru99', 'r1', 'c1'
This deletes the cell at row r1, column c1 of the table "guru99".
Syntax: deleteall <'tablename'>, <'rowname'>
Example:
hbase> deleteall 'guru99', 'r1', 'c1'
This deletes all the cells in the given row. Optionally, we can specify a column name, as in the example, to restrict the delete to that column.
Syntax: truncate <tablename>
After an HBase table is truncated, the schema remains but the records are gone. This command performs 3 functions: it disables the table, drops the table, and recreates the table with the same schema.
Syntax: scan <'tablename'>, {Optional parameters}
This command scans the entire table and displays the table contents.
scan 'guru99'
The output lists each row of the table with its column names, timestamps, and values.
Examples:
The different usages of the scan command
Command | Usage |
scan '.META.', {COLUMNS => 'info:regioninfo'} | Displays the region information stored in the 'info:regioninfo' column of the .META. catalog table |
scan 'guru99', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'} | Displays the contents of the table guru99 for columns c1 and c2, starting at row 'xyz' and limited to 10 rows |
scan 'guru99', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]} | Displays the contents of guru99 for column c1, with only the values whose timestamps fall within the given time range |
scan 'guru99', {RAW => true, VERSIONS => 10} | RAW => true is an advanced option that returns all cells in the table guru99, including delete markers and deleted cells that have not yet been removed, with up to 10 versions per cell |
Code Example:
First create the table and put values into it:
create 'guru99', {NAME=>'e', VERSIONS=>2147483647}
put 'guru99', 'r1', 'e:c1', 'value', 10
put 'guru99', 'r1', 'e:c1', 'value', 12
put 'guru99', 'r1', 'e:c1', 'value', 14
delete 'guru99', 'r1', 'e:c1', 11
Then run the scan command:
scan 'guru99', {RAW=>true, VERSIONS=>1000}
The raw scan of guru99 lists the stored cells of row r1, column e:c1, including the delete marker written at timestamp 11 alongside the put values. The cells are listed newest timestamp first, not in the order in which they were inserted.
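The difference between a raw scan and a normal scan of this cell can be sketched as a toy model in plain Ruby. It uses no HBase API, and the struct and function names are made up for illustration. It assumes the delete marker at timestamp 11 hides every put with a timestamp at or below 11; HBase has several delete marker types, and which one the shell's delete command writes may depend on the HBase version, so treat the normal-scan result as illustrative only.

```ruby
# Toy model of a raw versus normal scan of one cell; not the HBase API.
# Assumption: a delete marker at timestamp T hides every put with
# timestamp <= T. Other marker types behave differently.
Put = Struct.new(:ts, :value)
DeleteMarker = Struct.new(:ts)

# Raw view: every stored entry, newest first, delete markers included.
def raw_scan(entries)
  entries.sort_by { |e| -e.ts }
end

# Normal view: only puts that no delete marker hides, newest first.
def normal_scan(entries)
  markers = entries.grep(DeleteMarker)
  visible = entries.grep(Put).reject do |put|
    markers.any? { |m| put.ts <= m.ts }
  end
  visible.sort_by { |put| -put.ts }
end

entries = [Put.new(10, 'value'), Put.new(12, 'value'), Put.new(14, 'value'),
           DeleteMarker.new(11)]
p raw_scan(entries).map(&:ts)     # [14, 12, 11, 10]
p normal_scan(entries).map(&:ts)  # [14, 12]
```

In this model the raw view still contains the put at timestamp 10 and the marker at 11, while the normal view returns only the versions at 14 and 12.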
Command | Functionality |
add_peer | Adds a peer cluster to replicate to: hbase> add_peer '3', "zk1,zk2,zk3:2182:/hbase-prod" |
remove_peer | Stops the defined replication stream and deletes all the metadata information about the peer: hbase> remove_peer '1' |
start_replication | Restarts all the replication features: hbase> start_replication |
stop_replication | Stops all the replication features: hbase> stop_replication |
Summary:
The HBase shell provides general, data manipulation, table management, and cluster replication commands. Using these commands, we can perform a wide range of operations on the tables in HBase.