Friday, January 23, 2026

23 jan 26 - Data Modeling - ALL STEPS - With KPIS and req gathering

23 jan 26 -  

Data Model (Step-by-Step)


 Supermarket Sales Analytics – Requirement to Data Model (Step-by-Step)


STEP 1: Understand the Business Context

Business Overview

A supermarket chain operates:

  • Multiple stores across regions

  • Thousands of products

  • Sales staff earning commissions

  • Loyalty customers

Business Goals

  • Increase sales & profitability

  • Identify high-demand products

  • Improve regional performance

  • Track employee commission

  • Understand customer behavior


STEP 2: Requirement Gathering – How to Ask Questions

πŸ”Ή Core Requirement Questions (Very Important)

A. Business Objective

Ask:

  • What decisions will this report support?

  • Who will use this report (CEO, Manager, Ops)?

  • Is this for monitoring or deep analysis?


B. Sales-Related Questions

Ask:

  • Do you want daily, monthly, or yearly sales?

  • Should discounts be included in revenue?

  • Do you want gross margin or only sales?


C. Time & Comparison

Ask:

  • Do you need YoY / MoM comparison?

  • How many years of history?


D. Hierarchy & Drill-down

Ask:

  • Region → Store → City?

  • Category → Subcategory → Product?


E. Data Availability

Ask:

  • Is cost available for margin?

  • Is employee commission stored or calculated?


STEP 3: Identify Required Reports (5 Reports)

✅ Final Agreed Reports

  1. Sales Performance Report

  2. Customer Analysis Report

  3. Regional Performance Report

  4. Employee Commission Report

  5. Product in Demand Report


STEP 4: Define KPIs for Each Report


πŸ“Š Report 1: Sales Performance

Business Questions

  • How much are we selling?

  • Are sales increasing?

KPIs

  • Total Sales Amount

  • Quantity Sold

  • Discount Amount

  • Gross Margin

  • Sales Growth %


πŸ‘₯ Report 2: Customer Analysis

Business Questions

  • Who are our top customers?

  • Are customers returning?

KPIs

  • Total Spend per Customer

  • Purchase Frequency

  • Average Basket Value

  • Customer Lifetime Value (basic)


🌍 Report 3: Regional Performance

Business Questions

  • Which regions perform best?

  • Where are we underperforming?

KPIs

  • Sales by Region

  • Sales Growth %

  • Contribution %

  • Store Count


πŸ§‘‍πŸ’Ό Report 4: Employee Commission

Business Questions

  • How much commission is earned?

  • Who are top performers?

KPIs

  • Total Sales by Employee

  • Commission Amount

  • Commission %

  • Rank by Sales


πŸ“¦ Report 5: Product in Demand

Business Questions

  • What products sell the most?

  • Which products should be stocked more?

KPIs

  • Quantity Sold

  • Sales Amount

  • Sales Frequency

  • Stock Turnover (if inventory exists)


STEP 5: Define Grain, Granularity & Granules


πŸ”Ή Fact Table Grain (Very Important)

One row per Product per Store per Day per Employee

This supports all 5 reports.


πŸ”Ή Granularity (Analysis Levels)

DimensionGranularity Levels
TimeDay → Month → Year
ProductProduct → SubCategory → Category
GeographyStore → City → Region
CustomerCustomer
EmployeeEmployee

πŸ”Ή Granules (Lowest Identifiers)

Granules are the keys that define the grain:

Date_Key Product_Key Store_Key Employee_Key Customer_Key

Each unique combination = 1 row


STEP 6: Choose the Data Model

✅ Recommended Model: Star Schema (with selective Snowflake)

Why?

  • BI friendly

  • Easy reporting

  • Snowflake only where hierarchy is complex (Product, Region)


STEP 7: Identify Fact Tables

πŸ”Ή Primary Fact Table

FACT_SALES

Measures:

  • Sales_Amount

  • Quantity

  • Discount

  • Cost

  • Commission_Amount


STEP 8: Identify Dimension Tables

πŸ”Ή Dimensions Used Across Reports

DimensionPurpose
DIM_DATETime analysis
DIM_PRODUCTProduct details
DIM_CATEGORY (Snowflake)Product hierarchy
DIM_STOREStore info
DIM_REGION (Snowflake)Geography hierarchy
DIM_CUSTOMERCustomer analysis
DIM_EMPLOYEECommission tracking

STEP 9: Data Model per Report (Mapping)


πŸ“Š Sales Performance

FACT_SALES → DIM_DATE → DIM_PRODUCT → DIM_STORE

πŸ‘₯ Customer Analysis

FACT_SALES → DIM_CUSTOMER → DIM_DATE

🌍 Regional Performance

FACT_SALES → DIM_STORE → DIM_REGION → DIM_DATE

πŸ§‘‍πŸ’Ό Employee Commission

FACT_SALES → DIM_EMPLOYEE → DIM_DATE

πŸ“¦ Product in Demand

FACT_SALES → DIM_PRODUCT → DIM_CATEGORY → DIM_DATE

STEP 10: Data Modeling Steps (Execution)

  1. Finalize grain

  2. List KPIs & measures

  3. Identify dimensions

  4. Define hierarchies

  5. Design star/snowflake schema

  6. Assign surrogate keys

  7. Validate against reports

  8. Review with business

  9. Freeze model

  10. Build ETL


🧠 Final Reality Check (Very Important)

Reports drive KPIs → KPIs drive grain → Grain drives data model

Never design tables first.


🎀 Interview-Ready Closing Line

“I start with business questions, derive KPIs, define the grain, and then design a star or snowflake schema that supports all required reports with minimal redundancy.”

No comments:

Post a Comment

23feb26 - cab service project -data modeling project

  The Full Data Modeling Implementation Steps Phase Description & Methodology Primary Tools Phase 1: Discovery Ask the 10-Section Questi...