Hive
Hive ETL: Loading JSON, XML, Text Data Examples
Hive as an ETL and data warehousing tool on top of Hadoop ecosystem provides functionalities like...
Table Operations such as Creation, Altering, and Dropping tables in Hive can be observed in this tutorial.
In the Below screenshot, we are creating a table with columns and altering the table name.
1. Creating table guru_sample with two column names such as "empid" and "empname"
2. Displaying tables present in guru99 database
3. Guru_sample displaying under tables
4. Altering table "guru_sample" as "guru_sampleNew"
5. Again when you execute "show" command, it will display the new name Guru_sampleNew
Dropping table guru_sampleNew:
Coming to Tables it's just like the way that we create in traditional Relational Databases. The functionalities such as filtering, joins can be performed on the tables.
Hive deals with two types of table structures like Internal and External tables depending on the loading and design of schema in Hive.
Internal tables
When to Choose Internal Table:
Sample code Snippet for Internal Table
1. To create the internal table
Hive>CREATE TABLE guruhive_internaltable (id INT,Name STRING); Row format delimited Fields terminated by '\t';
2. Load the data into internal table
Hive>LOAD DATA INPATH '/user/guru99hive/data.txt' INTO table guruhive_internaltable;
3. Display the content of the table
Hive>select * from guruhive_internaltable;
4. To drop the internal table
Hive>DROP TABLE guruhive_internaltable;
If you dropped the guruhive_internaltable, including its metadata and its data will be deleted from Hive.
From the following screenshot, we can observe the output
In above code and from screen shot we do following things,
External tables
When to Choose External Table:
Sample code Snippet for External Table
1. Create External table
Hive>CREATE EXTERNAL TABLE guruhive_external(id INT,Name STRING) Row format delimited Fields terminated by '\t' LOCATION '/user/guru99hive/guruhive_external;
2. If we are not specifying the location at the time of table creation, we can load the data manually
Hive>LOAD DATA INPATH '/user/guru99hive/data.txt' INTO TABLE guruhive_external;
3. Display the content of the table
Hive>select * from guruhive_external;
4. To drop the internal table
Hive>DROP TABLE guruhive_external;
From the following screen shot, we can observe the output
In above code, we do following things
Difference between Internal Vs External tables
Feature | Internal | External |
---|---|---|
Schema | Data on Schema | Schema on Data |
Storage Location | /usr/hive/warehouse | HDFS location |
Data availability | Within local file system | Within HDFS |
Hive as an ETL and data warehousing tool on top of Hadoop ecosystem provides functionalities like...
What is HiveQL(Hive Query Language)? Hive provides a CLI to write Hive queries using Hive Query...
Hive provides SQL type querying language for the ETL purpose on top of Hadoop file system. Hive Query...
In this tutorial, you will learn- What is Hive? Hive Architecture Different modes of Hive What is...
What is a View? Views are similar to tables, which are generated based on the requirements. We can...
Functions are built for a specific purpose to perform operations like Mathematical, arithmetic,...