23 jan 26 -
Data Model (Step-by-Step)
Supermarket Sales Analytics – Requirement to Data Model (Step-by-Step)
STEP 1: Understand the Business Context
Business Overview
A supermarket chain operates:
-
Multiple stores across regions
-
Thousands of products
-
Sales staff earning commissions
-
Loyalty customers
Business Goals
-
Increase sales & profitability
-
Identify high-demand products
-
Improve regional performance
-
Track employee commission
-
Understand customer behavior
STEP 2: Requirement Gathering – How to Ask Questions
πΉ Core Requirement Questions (Very Important)
A. Business Objective
Ask:
-
What decisions will this report support?
-
Who will use this report (CEO, Manager, Ops)?
-
Is this for monitoring or deep analysis?
B. Sales-Related Questions
Ask:
-
Do you want daily, monthly, or yearly sales?
-
Should discounts be included in revenue?
-
Do you want gross margin or only sales?
C. Time & Comparison
Ask:
-
Do you need YoY / MoM comparison?
-
How many years of history?
D. Hierarchy & Drill-down
Ask:
-
Region → Store → City?
-
Category → Subcategory → Product?
E. Data Availability
Ask:
-
Is cost available for margin?
-
Is employee commission stored or calculated?
STEP 3: Identify Required Reports (5 Reports)
✅ Final Agreed Reports
-
Sales Performance Report
-
Customer Analysis Report
-
Regional Performance Report
-
Employee Commission Report
-
Product in Demand Report
STEP 4: Define KPIs for Each Report
π Report 1: Sales Performance
Business Questions
-
How much are we selling?
-
Are sales increasing?
KPIs
-
Total Sales Amount
-
Quantity Sold
-
Discount Amount
-
Gross Margin
-
Sales Growth %
π₯ Report 2: Customer Analysis
Business Questions
-
Who are our top customers?
-
Are customers returning?
KPIs
-
Total Spend per Customer
-
Purchase Frequency
-
Average Basket Value
-
Customer Lifetime Value (basic)
π Report 3: Regional Performance
Business Questions
-
Which regions perform best?
-
Where are we underperforming?
KPIs
-
Sales by Region
-
Sales Growth %
-
Contribution %
-
Store Count
π§πΌ Report 4: Employee Commission
Business Questions
-
How much commission is earned?
-
Who are top performers?
KPIs
-
Total Sales by Employee
-
Commission Amount
-
Commission %
-
Rank by Sales
π¦ Report 5: Product in Demand
Business Questions
-
What products sell the most?
-
Which products should be stocked more?
KPIs
-
Quantity Sold
-
Sales Amount
-
Sales Frequency
-
Stock Turnover (if inventory exists)
STEP 5: Define Grain, Granularity & Granules
πΉ Fact Table Grain (Very Important)
One row per Product per Store per Day per Employee
This supports all 5 reports.
πΉ Granularity (Analysis Levels)
| Dimension | Granularity Levels |
|---|---|
| Time | Day → Month → Year |
| Product | Product → SubCategory → Category |
| Geography | Store → City → Region |
| Customer | Customer |
| Employee | Employee |
πΉ Granules (Lowest Identifiers)
Granules are the keys that define the grain:
Each unique combination = 1 row
STEP 6: Choose the Data Model
✅ Recommended Model: Star Schema (with selective Snowflake)
Why?
-
BI friendly
-
Easy reporting
-
Snowflake only where hierarchy is complex (Product, Region)
STEP 7: Identify Fact Tables
πΉ Primary Fact Table
FACT_SALES
Measures:
-
Sales_Amount
-
Quantity
-
Discount
-
Cost
-
Commission_Amount
STEP 8: Identify Dimension Tables
πΉ Dimensions Used Across Reports
| Dimension | Purpose |
|---|---|
| DIM_DATE | Time analysis |
| DIM_PRODUCT | Product details |
| DIM_CATEGORY (Snowflake) | Product hierarchy |
| DIM_STORE | Store info |
| DIM_REGION (Snowflake) | Geography hierarchy |
| DIM_CUSTOMER | Customer analysis |
| DIM_EMPLOYEE | Commission tracking |
STEP 9: Data Model per Report (Mapping)
π Sales Performance
π₯ Customer Analysis
π Regional Performance
π§πΌ Employee Commission
π¦ Product in Demand
STEP 10: Data Modeling Steps (Execution)
-
Finalize grain
-
List KPIs & measures
-
Identify dimensions
-
Define hierarchies
-
Design star/snowflake schema
-
Assign surrogate keys
-
Validate against reports
-
Review with business
-
Freeze model
-
Build ETL
π§ Final Reality Check (Very Important)
Reports drive KPIs → KPIs drive grain → Grain drives data model
Never design tables first.
π€ Interview-Ready Closing Line
“I start with business questions, derive KPIs, define the grain, and then design a star or snowflake schema that supports all required reports with minimal redundancy.”
No comments:
Post a Comment