DBMS Topics
Data Warehouse
Last Updated : 21 May, 2026
A Data Warehouse DW is a large, centralized repository of integrated data from multiple sources, optimized for analytical querying and business intelligence BI — not for
What is a Data Warehouse?
A Data Warehouse (DW) is a large, centralized repository of integrated data from multiple sources, optimized for analytical querying and business intelligence (BI) — not for day-to-day transaction processing.
Defined by W.H. Inmon (1990): "A subject-oriented, integrated, non-volatile, and time-variant collection of data in support of management's decisions."
Four Key Characteristics (Inmon's Definition)
Data Warehouse vs OLTP Database
| Feature | OLTP Database | Data Warehouse |
|---|---|---|
| Purpose | Daily operations | Analysis & reporting |
| Data freshness | Real-time | Batch updates |
| Query type | Simple, fast | Complex, aggregations |
| Data volume | Gigabytes | Terabytes-Petabytes |
| Schema | Normalized (3NF) | Denormalized |
| Optimization | Write-heavy | Read-heavy |
| Users | Clerks, systems | Analysts, managers |
| History | Current data | Historical (years) |
| Transactions | Thousands/sec | Few long queries |
Data Warehouse Architecture
| │ Source Systems | │ |
| │ | Extract from source systems │ |
| │ | Transform (clean, standardize, integrate)│ |
| │ | Load into DW │ |
ETL Process
ETL stands for Extract, Transform, Load — the pipeline that feeds data into a data warehouse.
EXTRACT
→ Pull data from multiple heterogeneous sources
→ Sources: Oracle DB, MySQL, CSV files, REST APIs, SAP
→ Handle different formats, access methods
TRANSFORM
→ Clean: Remove duplicates, fix missing values, correct errors
→ Standardize: "M"/"Male"/"male" → "Male"
→ Integrate: Combine data from multiple sources
→ Aggregate: Pre-compute summaries (monthly totals)
→ Enrich: Add derived attributes (age from DOB)
LOAD
→ Insert transformed data into the data warehouse
→ Full load (initial) or incremental (delta) loads
→ Typically run nightly or weekly (batch process)
Data Warehouse Schemas
Star Schema
The most common DW schema — a central fact table surrounded by dimension tables.
Snowflake Schema
An extension of star schema where dimension tables are normalized.
OLAP (Online Analytical Processing)
OLAP enables analysts to view data from multiple dimensions.
OLAP Operations
| Data Cube | Sales data organized by Product × Region × Time |
| Day | Month → Quarter → Year |
| City | State → Country |
| Year | Quarter → Month → Day |
| Fix one dimension | "Show all data for Year=2024" |
| Fix multiple dimensions | "Year=2024 AND Region=North" |
OLAP Types
| Type | Description |
|---|---|
| MOLAP | Multidimensional OLAP — pre-computed cubes stored in special format |
| ROLAP | Relational OLAP — queries run against relational DW |
| HOLAP | Hybrid — summary in MOLAP cubes, detail in ROLAP |
Data Mart
A data mart is a subset of a data warehouse focused on a specific business area (Sales, Finance, HR):
| ├── Sales Data Mart | Sales team access |
| ├── Finance Data Mart | CFO and finance team |
| ├── HR Data Mart | HR department |
| └── Marketing Data Mart | Marketing team |
| Benefits | Faster queries (smaller data), department-specific views, |
Exam Focus
Revise definitions, diagrams, examples, and short-answer points for Data Warehouse.
Interview Use
Prepare one clear explanation, one practical example, and one common mistake for this DBMS topic.
Search Terms
dbms, database management system, database notes, sql, unit, data, warehouse, data warehouse
Related DBMS Topics