Teradata Tutorial: What is Teradata SQL? Database Architecture

What is Teradata?

Teradata is an open-source Database Management System for developing large-scale data warehousing applications. This tool provides support for multiple data warehouse operations simultaneously using the concept of parallelism. Teradata is a massively open processing system that supports Unix/Linux/Windows server platforms.

Teradata software is developed by Teradata Corporation, which is an American IT firm. It is a vendor of analytics data platforms, applications, and other related services. The firm develops a product to consolidate data from various sources and make the data available for analysis.

History of Teradata

Teradata was a division of NCR Corporation. It incorporated in 1979 but parted away from NCR in October 2007. Michael Koehler became the first CEO of Teradata.

Milestones of Teradata Corporation:

  • 1979 – Teradata was incorporated
  • 1984 – Release of first database computer DBC/1012
  • 1986 – Fortune magazine declared Teradata as ‘Product of the Year’
  • 1999 – Largest database built using Teradata with 130 Terabytes
  • 2002 – Teradata V2R5 version release with compression and Partition Primary
  • 2006 – Launch of Teradata Master Data Management solution
  • 2008 – Teradata 13.0 released with Active Data Warehousing
  • 2011 – Acquires Teradata Aster and plunges into the Advanced Analytics Space
  • 2012 – Teradata 14.0 introduced
  • 2014 – Teradata 15.0 introduced
  • 2015- Teradata Buys Apps Marketing Platform Appoxee
  • 2016- Terada join hands with Big data
  • 2017- Teradata Acquires San Diego’s StackIQ

Why Teradata?

  • Teradata offers a full suite of service which focuses on Data Warehousing
  • The system is built on open architecture. So whenever any faster devices are made available, it can be incorporated into the already build architecture.
  • Teradata supports 50+ petabytes of data.
  • Single operation view for a large Teradata multi-node system using Service Workstation
  • Compatible with wide range of BI tool to fetch data.
  • It can act as a single point of control for the DBA to manage the Database.
  • High performance, diverse queries, in-database analytics and sophisticated workload management
  • Teradata allows you to get the same data on multiple deployment options

Next in this Teradata tutorial, we will learn about features of Teradata.

Features of Teradata SQL

Teradata offers following powerful features:

  • Linear Scalability: Offers linear scalability when dealing with large volumes of data by adding nodes to increase the performance of the system.
  • Unlimited Parallelism: Teradata is based on MPP (Massively Parallel Processing Architecture). So, it is designed to be parallel since the beginning. It can divide a large task into smaller tasks and run them in parallel
  • Mature Optimizer: Teradata Optimizer can handle up to 64 joins in a query.
  • Low TCO: Tera data has a low total cost of ownership. It is easy to setup, maintain, and administrate.
  • Load & Unload utilities: Teradata provides load & unload utilities to move data into/from Teradata System.
  • Connectivity: This MPP system can connect to channel-attached systems like a mainframe or network-attached systems.
  • SQL: Teradata supports SQL to interact with the data stored in tables. It provides its extension.
  • Robust Utilities: Teradata provides robust utilities to import/export data from/to Teradata systems like FastExport, FastLoad, MultiLoad, and TPT.
  • Automatic Distribution: Teradata can distribute the data to the disks automatically with no manual intervention.

Next in this Teradata SQL tutorial, we will learn about Teradata Architecture.

Teradata Architecture

Teradata architecture is a Massively Parallel Processing Architecture.

Three important components of Teradata are:

  • Parsing Engine
  • BYNET
  • Access Module Processors (AMPs)

Teradata Storage Architecture Database Architecture Diagram:

Teradata Architecture
Teradata Architecture Diagram

Teradata Storage Architecture

Parsing Engine:

The Parsing Engine parses the queries and prepares the execution plan. It manages sessions for users. It optimizes & sends a request to the users.

So, when the client executes queries for inserting records, Parsing Engine sends the records to the Message Passing layer. Message passing layer or BYNET is a software and hardware component. It offers networking capability. It also retrieves the records and sends the row to the target AMP.

AMP:

AMP stands for Access Module Processor. It stores records on these disks. AMP conduct following activities:

  • Manages a portion of the database
  • Manages a portion of each table
  • Perform all the task associated with generating result set such as sort, aggregation and join
  • Perform lock and Space management

Teradata Retrieval Architecture

When the client runs queries to retrieve records, the Parsing engine sends a request to BYNET. Then BYNET sends the retrieval request to appropriate AMPs.

AMPs search their disks in parallel and recognize the required records and send them to BYNET. BYNET sends the records to Parsing Engine, which in turn will be send to the client.

Next in this Teradata Database tutorial, we will learn about Teradata SQL commands.

Types of Teradata SQL commands

Teradata Database supports following basic SQL commands:

  1. Data Definition Language (DDL) commands
  2. Data Control Language (DCL) commands
  3. Data Manipulation Language (DML) commands

Data Definition Language commands

COMMAND Description
CREATE Creates a new database, table, user, etc.
DROP Removes a new database, table, user, etc.
ALTER Changes a Table, column, trigger, etc.
MODIFY Changes a Database or user definition
RENAME Changes name of tables, views, macros, etc.

Data Control Language commands

COMMAND Description
GRANT/REVOKE Used to control privileges of a user on an object
GRANT LOGON/REVOKE LOGON Used to control logon privileges to a host or host group
GIVE Used to give a database object to another database object

Teradata Database SQL Data Manipulation Language commands

COMMAND Description
DELETE Removes a row from table
ECHO Used to echo a string or command to the client
CHECKPOINT Defines a recovery point in the journal that can be used later to restore the table content
SELECT Used to return a specific row data in a table form
UPDATE Modifies data in one or more rows of a table

Applications of Teradata Database

Following are the popular Teradata Applications:

  • Customer Data Management: Helps to maintain long-lasting relationships with customers.
  • Master Data Management: Helps to develop an environment where master data can be used, synchronized, and stored.
  • Finance and Performance Management: Helps organization to improve the speed and quality of financial reporting. It reduces finance infrastructure costs, and proactively manage enterprise performance.
  • Supply Chain Management: Improve supply chain operations which help to improved customer service, reduced cycle times, and lower inventories.
  • Demand Chain Management: Helps to Increase customer service levels and sales. It also helps companies to predict the demand for their store item accurately.

Next in this Teradata for beginners tutorial, we will learn about the difference between Teradata and other RDBMS.

Difference between Teradata and other RDBMS

Parameter TERA DATA RDBMS
Architectures Follows Shared Nothing Architecture. Shared Everything and allows resource contention.
Processes MIPS [Millions of Instructions/sec] KIPS [Thousands of institutions/sec]
Indexes Better Distribution and Retrieval Only offers FASI Retrieval
Parallelism Supports Un-conditional parallelism. Parallelism is conditional and unpredictable
Bulk Load Teradata allows bulk load. Allows only limited bulk load.
Scalability Linear scalability with a slope of one Scalability with diminishing returns
Database buffer A single database buffer used by all UoP’s. (A unite of parallelism). A single data store accessed by all UoP’s Query Controller ships functions to UoP’s that own the data
Stores It stores TERA BYTES[Billions of rows] GIGA BYTES[Millions of rows]

MPP vs. SMP

MPP SMP
MPP – Massively Parallel Processing. It is Computer system which is attached to many independent arithmetic units or entire microprocessors, that run in parallel. Symmetric Multi-Processing. In an SMP processing system, the CPU’s share the same memory, and as a result code running in one system may affect the memory used by another.
Databases can expand by adding new CPUs. SMP databases generally use one CPU to perform database searches.
In an MPP environment, performance is improved because no resources must be shared among physical computers. The workload for a parallel job is distributed across the processors in the system.
Performance of a Massive parallel processing system is linear. However, it will increase in proportion to the number of nodes. SMP databases can run on multiple servers. However, will share another resource.

Summary

  • Teradata meaning: Teradata is an open-source Database Management System for developing large-scale data warehousing applications.
  • Teradata was a division of NCR Corporation. It was incorporated in 1979 but parted away from NCR in October 2007
  • Teradata offers a full suite of service which focuses on Data Warehousing
  • Teradata offers linear scalability when dealing with large volumes of data by adding nodes to increase the performance of the system.
  • Three important components of Teradata are 1) Parsing Engine 2) MPP 3) Access Module Processors (AMPs)
  • Teradata offers a complete range of product suite to meet Data warehousing and ETL needs of any organization
  • Teradata application mainly used for Supply Chain Management, Master Data Management, Demand Chain Management, etc.