What is Star Schema in Data Warehouse modeling?
What is a Star Schema?
Star Schema in data warehouse, is a schema in which the center of the star can have one fact table and a number of associated dimension tables. It is known as star schema as its structure resembles a star. The Star Schema data model is the simplest type of Data Warehouse schema. It is also known as Star Join Schema and is optimized for querying large data sets.
What is Multidimensional schema?
Multidimensional Schema is especially designed to model data warehouse systems. The schemas are designed to address the unique needs of very large databases designed for the analytical purpose (OLAP).
Types of Data Warehouse Schema:
Following are 3 chief types of multidimensional schemas each having its unique advantages.
- Star Schema
- Snowflake Schema
- Galaxy Schema
Example of Star Schema Data Modelling
In the following Star Schema example, the fact table is at the center which contains keys to every dimension table like Dealer_ID, Model ID, Date_ID, Product_ID, Branch_ID & other attributes like Units sold and revenue.
A Fact table in a star schema contains facts and is connected to dimensions. A fact table has two types of columns:
- A column that includes Facts
- Foreign Key to Dimensions Table
Generally, the primary key of a fact table is a composite key that is made up of all the foreign keys that make up the table.
Fact tables can contain detail-level facts or aggregated facts. Fact tables that include aggregated facts are often called summary tables. Fact tables usually contain facts that have been aggregated to some level.
A dimension is an architecture that categorizes data in a hierarchy. A dimension without hierarchies and levels is called a flat dimension or list. Each dimension table’s primary key is part of the composite primary key of the fact table. A dimension attribute is a descriptive, textual attribute that helps describe a dimensional value. Fact tables are usually larger than dimension tables.
Characteristics of Star Schema
- Every dimension in a star schema is represented with the only one-dimension table.
- The dimension table should contain the set of attributes.
- The dimension table is joined to the fact table using a foreign key
- The dimension table are not joined to each other
- Fact table would contain key and measure
- The Star schema is easy to understand and provides optimal disk usage.
- The dimension tables are not normalized. For instance, in the above figure, Country_ID does not have Country lookup table as an OLTP design would have.
- The schema is widely supported by BI Tools
Advantages of Star Schema
- Star schemas have a more straightforward join logic compared to other schemas for fetching data from highly normalized transactional schemas.
- As opposed to highly normalized transactional schemas, the star schema simplifies common business reporting logic, such as reporting and period-over-period.
- Star schemas are widely used by OLAP systems to design cubes efficiently. A star schema can be used as a source without designing a cube structure in most major OLAP systems.
- By enabling specific performance schemes that can be applied to queries, the query processor software in Star Schema can offer better execution plans.
Disadvantage of Star Schema
- Since the schema is highly de-normalized, data integrity is not enforced well.
- Not flexible in terms of analytical needs.
- Star schemas do not reinforce many-to-many relationships within business entities.