Data Warehousing
What is a Data Warehouse?
Defined in many different ways, but not rigorously.
A decision support database that is maintained separately from the
organization’s operational database.
Supports information processing by providing a solid platform of
consolidated, historical data for analysis.
“A data warehouse is a subject-oriented, integrated, time-variant, and
nonvolatile collection of data in support of management’s decision-
making process.”—W. H. Inmon
Data warehousing:
The process of constructing and using data warehouses
Data Warehouse—Subject-Oriented
Organized around major subjects, such as customer, product,
sales
Focuses on the modeling and analysis of data for decision
makers, not on daily operations or transaction processing
Provides a simple and concise view of particular subject
issues by excluding data that are not useful in the decision
support process
Data Warehouse—Integrated
Constructed by integrating multiple, heterogeneous data sources,
such as relational databases, flat files, and on-line transaction records
Data cleaning and data integration techniques are applied.
Ensures consistency in naming conventions, encoding
structures, attribute measures, etc. among different data
sources
When data is moved to the warehouse, it is converted to the
warehouse's common conventions.
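The conversion step can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the two source systems ("CRM" and "billing"), their field names, the active-flag encodings, and the choice of centimetres as the common unit. None of it comes from the slides.

```python
# Hypothetical mapping from each source's encoding to one warehouse encoding.
ACTIVE_CODES = {"y": True, "yes": True, "1": True,
                "n": False, "no": False, "0": False}

def from_crm(row):
    """Convert a row from a hypothetical CRM export (height in inches)."""
    return {
        "customer_id": int(row["CustID"]),                       # naming convention
        "is_active": ACTIVE_CODES[str(row["active"]).strip().lower()],  # encoding
        "height_cm": round(float(row["height_in"]) * 2.54, 1),   # attribute measure
    }

def from_billing(row):
    """Convert a row from a hypothetical billing system (height already in cm)."""
    return {
        "customer_id": int(row["customer_no"]),
        "is_active": ACTIVE_CODES[str(row["active_flag"]).strip().lower()],
        "height_cm": round(float(row["height_cm"]), 1),
    }

# Two heterogeneous inputs describing customers in different conventions.
crm_rows = [{"CustID": "17", "active": "Yes", "height_in": "70"}]
billing_rows = [{"customer_no": 42, "active_flag": 0, "height_cm": 165.0}]

# After conversion, every row shares the same names, encodings, and units.
warehouse_rows = ([from_crm(r) for r in crm_rows]
                  + [from_billing(r) for r in billing_rows])
```

Each source gets its own small converter, so adding another source means adding one function rather than changing the warehouse layout.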
Data Warehouse—Time Variant
The time horizon for the data warehouse is significantly longer
than that of operational systems
Operational database: current value data
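The contrast between current-value data and a longer time horizon can be sketched as follows. This is a hedged illustration, not a prescribed design: the in-memory structures, the function names, and the assumption that every warehouse row carries a date are all hypothetical.

```python
from datetime import date

operational = {}  # customer_id -> current address only
warehouse = []    # append-only history of (customer_id, address, valid_from)

def update_address(customer_id, address, as_of):
    """Record an address change in both stores."""
    operational[customer_id] = address               # overwrite: old value is lost
    warehouse.append((customer_id, address, as_of))  # keep every dated version

def address_as_of(customer_id, when):
    """Return the latest warehouse address recorded on or before `when`."""
    versions = [(valid_from, addr)
                for cid, addr, valid_from in warehouse
                if cid == customer_id and valid_from <= when]
    return max(versions)[1] if versions else None

update_address(7, "Oslo", date(2020, 1, 1))
update_address(7, "Bergen", date(2023, 6, 1))
```

Here `operational[7]` holds only the latest address, while `address_as_of(7, date(2021, 1, 1))` can still answer a question about the past from the warehouse history.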