Snowflake Schema in Data Warehouse Model
What is a Snowflake Schema?
Snowflake Schema in data warehouse is a logical arrangement of tables in a multidimensional database such that the ER diagram resembles a snowflake shape. A Snowflake Schema is an extension of a Star Schema, and it adds additional dimensions. The dimension tables are normalized which splits data into additional tables.
Snowflake Schema Example
In the following Snowflake Schema example, Country is further normalized into an individual table.
Characteristics of Snowflake Schema
- The main benefit of the snowflake schema it uses smaller disk space.
- Easier to implement a dimension is added to the Schema
- Due to multiple tables query performance is reduced
- The primary challenge that you will face while using the snowflake Schema is that you need to perform more maintenance efforts because of the more lookup tables.
Advantage of Snowflake Schema
- Snowflake schema’s primary advantage is its ability to reduce disk storage requirements and join smaller lookup tables, improving query performance.
- Provides greater scalability in the interrelationship between components and dimension levels.
- There is no redundancy, so it is easier to maintain.
Disadvantage of Snowflake Schema
- A significant disadvantage of the snowflake schema is the increased maintenance required.
- Complex queries are challenging to understand.
- A larger number of tables means more joins, so a longer query execution time.
What is a Galaxy Schema?
A Galaxy Schema contains two fact table that share dimension tables between them. It is also called Fact Constellation Schema. The schema is viewed as a collection of stars hence the name Galaxy Schema.
As you can see in above example, there are two facts table
In Galaxy schema shares dimensions are called Conformed Dimensions.
Characteristics of Galaxy Schema
- The dimensions in this schema are separated into separate dimensions based on the various levels of hierarchy.
- For example, if geography has four levels of hierarchy like region, country, state, and city then Galaxy schema should have four dimensions.
- Moreover, it is possible to build this type of schema by splitting the one-star schema into more Star schemes.
- The dimensions are large in this schema which is needed to build based on the levels of hierarchy.
- This schema is helpful for aggregating fact tables for better understanding.
What is Star Cluster Schema?
Snowflake schema contains fully expanded hierarchies. However, this can add complexity to the Schema and requires extra joins. On the other hand, star schema contains fully collapsed hierarchies, which may lead to redundancy. So, the best solution may be a balance between these two schemas which is Star Cluster Schema design.
Overlapping dimensions can be found as forks in hierarchies. A fork happens when an entity acts as a parent in two different dimensional hierarchies. Fork entities then identified as classification with one-to-many relationships.