When to use star schema and when to go for snowflake ?
If you have a few dimensions and low cardinality (less unique values in dims ) , but you require fast query execution, a star schema is the right choice.
However, if you have several dimensions and high cardinality, a snowflake schema will be a better scheme
.NO |
Star Schema |
Snowflake Schema |
1. |
In star schema, The fact tables and the dimension tables
are contained. |
While in snowflake schema, The fact tables, dimension tables
as well as sub dimension tables are contained. |
2. |
Star schema is a top-down model. |
While it is a bottom-up model. |
3. |
Star schema uses more space. |
While it uses less space. |
4. |
It takes less time for the execution of queries. |
While it takes more time than star schema for the
execution of queries. |
5. |
In star schema, Normalization is not used. |
While in this, Both normalization and
denormalization are used. |
6. |
It’s design is very simple. |
While it’s design is complex. |
7. |
The query complexity of star schema is low. |
While the query complexity of snowflake schema is
higher than star schema. |
8. |
It’s understanding is very simple. |
While it’s understanding is difficult. |
9. |
It has less number of foreign keys. |
While it has more number of foreign keys. |
10. |
It has high data redundancy. |
While it has low data redundancy. |
Basis of Distinction | Star Schema | Snowflake Schema |
Definition and Meaning | Both fact tables and dimension tables are present in a star schema. | Dimension tables, sub-dimension tables, and fact tables are all included in a snowflake schema. |
Type of Model | The star schema is a top-down type of model. | The snowflake schema is a bottom-up type of model. |
Space | Star schema uses more space compared to Snowflake Schema. | Snowflake schema uses less space comparatively. |
Joint Relations | In a star schema, relationships between tables are represented by a single join, resulting in a simple data structure for fast query performance and easy data analysis. | The snowflake schema has a complex data structure with multiple levels of relationships between tables, represented by multiple joins. This can make the data structure more difficult to understand and result in slower query performance. |
Response Time for Queries | Star schemas have faster query execution times due to a single join of a fact table and its attributes in dimensional tables. | Snowflake schemas require complex joins between tables, which can slow down query processing and impact other OLAP products. |
Normalization | In a star schema, dimension tables are not organized in a normalized form. They are typically denormalized and contain multiple levels of information about a particular subject in a single table. | Dimension tables in snowflake schema are normalized. |
Design Complexity | Has a simpler design compared to snowflake schema. | More complex design compared to star schema. |
Query Complexity | Star schemas have simpler query design due to the fact the table is joined to only one level of dimensional tables. | Snowflake schemas, on another hand, have a more complex query design due to the need for multiple joins between the fact table and its dimensional tables. This leads to additional overhead in query writing. |
Understanding Complexity | It is simpler to understand compared to snowflake schema. | More complex to understand compared to star schema. |
Foreign Keys | Have a lesser number of foreign keys. | Comparatively has more foreign keys. |
Data Redundancy | The star schema stores redundant data in the dimension tables. | The snowflake design fully normalizes the dimension tables and prevents data redundancy, |
Advantages |
|
|
Disadvantages |
|
|
Advantages and Disadvantages of Star Schema
Star schema suits applications requiring simplicity and performance as it has fewer tables and joins.
The main demerit of the star schema is its denormalization and redundancy. This can lead to wide dimension tables that take up more storage space and rake up the risk of data inconsistency.
As the schema flattens the dimension hierarchies into a single table, their depth and breadth gets limited. This makes the star schema unsuitable for data warehouses having complex and dynamic dimensions that require accuracy and high data normalization.
Which schema is right for you?
The answer depends on your specific needs and requirements. If you’re looking for a simple, efficient cloud data warehouse solution, a star schema might be the best option. But if you need more flexibility to accommodate changing data requirements, a snowflake schema may be a better choice. No matter which schema you choose, ThoughtSpot can help you get the most out of your data. Most BI tools require a specific schema design to be used, ThoughtSpot has no such restrictions.
Star Schema | Snowflake Schema |
---|---|
Hierarchies for the dimensions are stored in the dimensional table. | Hierarchies are divided into separate tables. |
It contains a fact table surrounded by dimension tables. | One fact table surrounded by dimension table which are in turn surrounded by dimension table |
In a star schema, only single join creates the relationship between the fact table and any dimension tables. | A snowflake schema requires many joins to fetch the data. |
Simple DB Design. | Very Complex DB Design. |
Denormalized Data structure and query also run faster. | Normalized Data Structure. |
High level of Data redundancy | Very low-level data redundancy |
Single Dimension table contains aggregated data. | Data Split into different Dimension Tables. |
Cube processing is faster. | Cube processing might be slow because of the complex join. |
Offers higher performing queries using Star Join Query Optimization. Tables may be connected with multiple dimensions. | The Snowflake schema is represented by centralized fact table which unlikely connected with multiple dimensions. |
No comments:
Post a Comment