What is Data Warehouse? Types
โก Smart Summary
Data Warehouse defines a centralized repository that consolidates information from many sources for analysis, reporting, and decision support. This tutorial explains the history, working model, types, stages, components, implementation steps, advantages, disadvantages, and prominent tools that make Data Warehousing the foundation of modern Business Intelligence.

What is Data Warehousing?
Data Warehousing (DW) is the process of collecting and managing data from varied sources to provide meaningful business insights. A Data Warehouse is typically used to connect and analyze business data from heterogeneous sources, and it is the core of any Business Intelligence (BI) system built for data analysis and reporting.
Data Warehousing is a blend of technologies and components that supports the strategic use of data. It is the electronic storage of large amounts of business information designed for query and analysis rather than transaction processing. The Data Warehouse transforms raw data into usable information and delivers it to users in time to make decisions that matter.
A decision-support database (the Data Warehouse) is maintained separately from an organization’s operational database. The Data Warehouse is not a product but an environment โ an architectural construct of an information system that provides users with current and historical decision-support information that is difficult to access or present in a traditional operational data store.
You may know that a 3NF-designed database for an inventory system typically has many tables related to each other. For example, a report on current inventory might require more than 12 joined conditions, which slows down query and report response times. A Data Warehouse provides a denormalized design that reduces response time and improves performance for reporting and analytics.
The Data Warehouse system is also known by the following names:
- Decision Support System (DSS)
- Executive Information System
- Management Information System
- Business Intelligence Solution
- Analytic Application
- Data Warehouse
History of the Data Warehouse
The Data Warehouse helps users understand and improve their organization’s performance. The need to warehouse data evolved as computer systems became more complex and processed ever-growing volumes of information. Data Warehousing is not a new concept โ it has a long evolutionary history.
Here are some of the key events in the evolution of the Data Warehouse:
- 1960 โ Dartmouth and General Mills, in a joint research project, develop the terms “dimensions” and “facts”.
- 1970 โ A.C. Nielsen and IRI introduce dimensional data marts for retail sales.
- 1983 โ Teradata Corporation introduces a database management system specifically designed for decision support.
- Late 1980s โ IBM researchers Paul Murphy and Barry Devlin develop the Business Data Warehouse concept.
- The modern Data Warehouse vision is credited to Bill Inmon, often called the “father of the Data Warehouse”. He authored foundational works on building, using, and maintaining the warehouse and the Corporate Information Factory.
How Does a Data Warehouse Work?
A Data Warehouse works as a central repository where information arrives from one or more data sources. Data flows into the warehouse from transactional systems and other relational databases.
The incoming data may be:
- Structured
- Semi-structured
- Unstructured
The data is processed, transformed, and ingested so that users can access the curated dataset through Business Intelligence tools, SQL clients, and spreadsheets. A Data Warehouse merges information from different sources into one comprehensive database.
By consolidating all of this information in one place, an organization can analyze its customers holistically and confirm it has considered every available data point. Data Warehousing makes Data Mining possible โ Data Mining searches for patterns in the data that lead to higher sales, lower costs, and better forecasts.
Types of Data Warehouse
The three main types of Data Warehouses (DWH) are:
1. Enterprise Data Warehouse (EDW):
An Enterprise Data Warehouse is a centralized warehouse that provides decision-support services across an entire organization. It offers a unified approach for organizing and representing data and gives the ability to classify data by subject and grant access according to those divisions.
2. Operational Data Store (ODS):
An Operational Data Store is a data store used when neither a Data Warehouse nor OLTP systems can serve the organization’s reporting needs. In an ODS, data is refreshed in real time, which makes it ideal for routine activities such as storing current employee records.
3. Data Mart:
A Data Mart is a subset of the Data Warehouse designed for a specific line of business, such as sales, finance, or marketing. In an independent Data Mart, data can be collected directly from source systems.
General Stages of a Data Warehouse
Initially, organizations adopted relatively simple uses of Data Warehousing. Over time, more sophisticated patterns emerged. The following are the general stages of Data Warehouse (DWH) usage:
Offline Operational Database:
Data is copied from an operational system to another server. Loading, processing, and reporting against the copy do not impact the operational system’s performance.
Offline Data Warehouse:
Data in the Data Warehouse is regularly updated from the operational database. The data is mapped and transformed to meet Data Warehouse objectives.
Real-Time Data Warehouse:
Data Warehouses are updated whenever a transaction takes place in the operational database. Airline and railway booking systems are classic examples.
Integrated Data Warehouse:
Data Warehouses are updated continuously when operational systems perform transactions. The Data Warehouse then generates transactions that are passed back to the operational system.
Components of a Data Warehouse
The four components of a Data Warehouse are:
Load Manager: Also called the front component, the Load Manager handles all operations associated with extraction and loading of data into the warehouse. These operations include the transformations that prepare data for entry into the Data Warehouse.
Warehouse Manager: The Warehouse Manager carries out operations associated with managing data inside the warehouse. It analyzes data to ensure consistency, creates indexes and views, generates denormalizations and aggregations, transforms and merges source data, and archives or backs up data.
Query Manager: Also known as the backend component, the Query Manager handles operations related to user queries. It routes queries to the appropriate tables and schedules their execution.
End-User Access Tools:
These tools fall into five groups: 1) Data Reporting, 2) Query Tools, 3) Application Development Tools, 4) EIS tools, and 5) OLAP tools and Data Mining tools.
Who Needs a Data Warehouse?
A Data Warehouse (DWH) is needed for all types of users, including:
- Decision-makers who rely on large volumes of data.
- Users who run customized, complex processes to combine information from multiple sources.
- People who want simple, low-friction technology to access data.
- Teams that want a systematic, repeatable approach to making decisions.
- Users who need fast performance on huge datasets for reports, dashboards, grids, or charts.
- Analysts looking to discover hidden patterns in data flows and groupings.
What Is a Data Warehouse Used For?
Below are the most common sectors where Data Warehouses are used:
Airline:
In the airline industry, Data Warehouses support crew assignment, analysis of route profitability, frequent-flyer program promotions, and similar operational decisions.
Banking:
Data Warehouses are widely used in banking to manage desk resources effectively. Several banks also use them for market research, product performance analysis, and operations planning.
Healthcare:
The healthcare sector uses Data Warehouses to strategize and predict outcomes, generate patient treatment reports, and share data with insurance partners and medical-aid services.
Public Sector:
In the public sector, Data Warehouses support intelligence gathering and help government agencies maintain and analyze tax records and health policy records for every individual.
Investment and Insurance:
In this sector, warehouses are used to analyze data patterns, customer trends, and market movements.
Retail Chains:
Retail chains use Data Warehouses for distribution and marketing, to track items and customer-buying patterns, plan promotions, and determine pricing policy.
Telecommunications:
Telcos use Data Warehouses for product promotions, sales decisions, and distribution decisions.
Hospitality Industry:
The hospitality industry uses Data Warehouses to design and estimate advertising and promotion campaigns that target clients based on feedback and travel patterns.
Steps to Implement a Data Warehouse
The best way to manage the business risk associated with a Data Warehouse implementation is to follow a three-prong strategy:
- Enterprise Strategy: Identify the current technical architecture and tools, plus the facts, dimensions, and attributes the warehouse must support. Data mapping and transformation are part of this stage.
- Phased Delivery: Implement the Data Warehouse in phases based on subject areas. Related business entities such as booking and billing should be delivered first and then integrated.
- Iterative Prototyping: Rather than a big-bang implementation, the Data Warehouse should be developed, tested, and refined iteratively.
Here are the key steps in Data Warehouse implementation along with their deliverables:
| Step | Tasks | Deliverables |
|---|---|---|
| 1 | Define project scope | Scope Definition |
| 2 | Determine business needs | Logical Data Model |
| 3 | Define Operational Data Store requirements | Operational Data Store Model |
| 4 | Acquire or develop extraction tools | Extract Tools and Software |
| 5 | Define Data Warehouse data requirements | Transition Data Model |
| 6 | Document missing data | To-Do Project List |
| 7 | Map Operational Data Store to Data Warehouse | DW Data Integration Map |
| 8 | Develop Data Warehouse database design | DW Database Design |
| 9 | Extract data from Operational Data Store | Integrated DW Data Extracts |
| 10 | Load the Data Warehouse | Initial Data Load |
| 11 | Maintain the Data Warehouse | Ongoing Data Access and Subsequent Loads |
Best Practices to Implement a Data Warehouse
- Define a plan to test consistency, accuracy, and integrity of the data.
- The Data Warehouse must be well integrated, well defined, and time-stamped.
- When designing the Data Warehouse, use the right tool, follow the lifecycle, address data conflicts early, and learn from mistakes.
- Never replace operational systems and operational reports with the warehouse.
- Do not spend excessive time on extracting, cleaning, and loading data โ automate where possible.
- Involve all stakeholders, including business teams, in the Data Warehouse implementation. Treat the warehouse as a joint project so it remains useful to end users.
- Prepare a training plan for the end users.
Advantages and Disadvantages of a Data Warehouse
Advantages of a Data Warehouse (DWH):
- Business users can quickly access critical data from many sources in one place.
- Provides consistent information across cross-functional activities and supports ad-hoc reporting and queries.
- Integrates multiple data sources, which reduces the load on production systems.
- Reduces total turnaround time for analysis and reporting.
- Restructuring and integration make reporting and analysis easier for the end user.
- Saves the user’s time by removing the need to retrieve data from multiple sources.
- Stores large volumes of historical data, which enables trend analysis and future predictions.
Disadvantages of a Data Warehouse:
- Not an ideal option for unstructured data.
- Creation and implementation is time-consuming.
- Data Warehouses can become outdated relatively quickly without active maintenance.
- Changes to data types, ranges, source schemas, indexes, or queries are difficult.
- The Data Warehouse may appear easy on the surface but is complex for the average user.
- Despite best efforts, project scope tends to expand during implementation.
- Different business units sometimes develop conflicting business rules.
- Organizations must allocate significant resources for training and rollout.
The Future of Data Warehousing
- Regulatory constraints may limit the ability to combine sources of disparate data, including unstructured data that is harder to store and govern.
- As the size of databases grows, what counts as a very large database keeps shifting upward, and building and operating warehouses at that scale is increasingly complex.
- Multimedia data cannot be manipulated as easily as text. Relational software handles textual information well, but rich media remains an active area of research.
Data Warehouse Tools
There are many Data Warehousing tools available in the market. Here are some of the most prominent:
1. MarkLogic:
MarkLogic is a Data Warehousing solution that simplifies and accelerates data integration with a rich set of enterprise features. It performs complex search operations and can query documents, relationships, and metadata.
2. Oracle:
Oracle is the industry-leading database. It offers a wide range of Data Warehouse solutions for both on-premises and cloud deployments and helps optimize customer experience by improving operational efficiency.
3. Amazon Redshift:
Amazon Redshift is a Data Warehouse service that is simple and cost-effective for analyzing data with standard SQL and existing BI tools. It runs complex queries against petabytes of structured data using query optimization techniques.
Here is a complete list of useful Data Warehouse Tools.

