Table of Contents
- It is a digital architecture that serves many purposes
- Data warehouse architecture-overview
- Characteristics of data warehouse
- Data warehouse architecture
- One-tier architecture
- Two-tier architecture
- Three-Tier Architecture
- Key takeaways
It is a digital architecture that serves many purposes
A data warehouse architecture defines the overall design of data communication, processing, and presentation for end-client computing within the enterprise. Each data warehouse is unique, but they all share certain essential components. Online transaction processing (OLTP) is intended for production applications such as payroll, accounts payable, product purchasing, and inventory control. These applications collect detailed data from day-to-day operations.
Data warehouse architecture-overview
Data warehouse applications are designed to support ad-hoc data requests from users through a process known as online analytical processing (OLAP). Examples of such applications include forecasting, profiling, summary reporting, and trend analysis.
Production databases are constantly updated, either manually or through OLTP applications. On the other hand, a warehouse database is updated from operational systems regularly, usually during off-hours. As OLTP data accumulates in production databases, it is regularly extracted, filtered, and loaded into a user-accessible warehouse server. As the warehouse is filled, tables must be restructured, data must be cleansed of errors and redundancies, and new fields and keys must be added to reflect the user’s needs for sorting, combining, and summarising data.
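The extract, cleanse, and load cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the `oltp_rows` records, the `sales_fact` table, and the `cleanse` and `load` helpers are all hypothetical names, and an in-memory SQLite database stands in for the warehouse server.

```python
import sqlite3

# Hypothetical rows extracted from an operational (OLTP) system.
# Field names and values are illustrative assumptions.
oltp_rows = [
    {"order_id": 1, "customer": " alice ", "amount": "120.50", "order_date": "2024-01-05"},
    {"order_id": 2, "customer": "Bob", "amount": "80.00", "order_date": "2024-01-05"},
    {"order_id": 2, "customer": "Bob", "amount": "80.00", "order_date": "2024-01-05"},  # duplicate
    {"order_id": 3, "customer": "Carol", "amount": "n/a", "order_date": "2024-01-06"},  # bad value
]


def cleanse(rows):
    """Drop redundant and malformed rows, and tidy the remaining fields."""
    seen = set()
    clean = []
    for row in rows:
        if row["order_id"] in seen:
            continue  # redundancy: this order was already kept
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # error: the amount is not numeric
        seen.add(row["order_id"])
        clean.append((row["order_id"], row["customer"].strip().title(), amount, row["order_date"]))
    return clean


def load(conn, rows):
    """Load cleansed rows into a warehouse table keyed by order_id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for the warehouse server
load(warehouse, cleanse(oltp_rows))

# A summary query of the kind warehouse users might run.
daily_totals = warehouse.execute(
    "SELECT order_date, SUM(amount) FROM sales_fact GROUP BY order_date ORDER BY order_date"
).fetchall()
```

In a real deployment this job would typically run on a schedule during off-hours, as the paragraph above notes.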
Characteristics of data warehouse
A data warehouse has four defining characteristics: it is subject-oriented, integrated, time-variant, and non-volatile.
A data warehouse is subject-oriented because it provides information about a specific theme rather than a company’s ongoing operations. These topics might include sales, marketing, distribution, and so on.
A data warehouse is never concerned with ongoing operations. Instead, it emphasizes data modeling and analysis for decision-making. It also provides a simple and concise view of the subject by excluding data that is not useful in supporting the decision-making process.
Integration in a data warehouse refers to establishing a common unit of measure for all similar data from disparate databases. The data must also be stored in the data warehouse in a consistent and universally acceptable format.
A data warehouse is created by combining data from various sources such as mainframes, relational databases, and flat files. This integration facilitates effective data analysis, provided that naming conventions, attribute measures, formatting, and encoding structures are kept consistent.
Consider the following scenario. Three different applications, labeled A, B, and C, each store gender, date, and balance data.
However, each application’s data is stored in a unique manner.
- A gender field in Application A stores logical values such as M or F.
- In Application B, the gender field is a numerical value.
- In Application C, the gender field is stored as a character value.
The same is true for date and balance.
However, after the transformation and cleaning processes, all of this data is stored in the data warehouse in a common format.
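The gender example above might be handled with a per-application lookup table during transformation. The sketch below is only illustrative: the text does not say which numeric or character codes Applications B and C use, so the values in `GENDER_MAPS`, the common `"male"`/`"female"` format, and the `normalise_gender` helper are assumptions.

```python
# Assumed source encodings. Only "M or F" for Application A comes from the text;
# the codes for B and C are hypothetical.
GENDER_MAPS = {
    "A": {"M": "male", "F": "female"},
    "B": {1: "male", 0: "female"},
    "C": {"male": "male", "female": "female"},
}


def normalise_gender(app, raw_value):
    """Map an application-specific gender code to one common warehouse format."""
    if isinstance(raw_value, str):
        raw_value = raw_value.strip()
        # Application A uses upper-case codes, Application C lower-case words.
        raw_value = raw_value.upper() if app == "A" else raw_value.lower()
    return GENDER_MAPS[app].get(raw_value, "unknown")


source_records = [("A", "f"), ("B", 1), ("C", "Female"), ("B", 7)]
warehouse_values = [normalise_gender(app, value) for app, value in source_records]
```

Unrecognised codes are mapped to `"unknown"` rather than dropped, so that data quality problems stay visible in the warehouse. The same pattern would apply to the date and balance fields.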
Compared to operational systems, the data warehouse has a much longer time horizon. The data collected in a data warehouse is associated with a specific time period and provides information from a historical standpoint. It includes a time element, either explicitly or implicitly.
The record key structure is one place where a data warehouse displays time variance. Every primary key in the data warehouse should contain an element of time, such as the day, week, or month, either implicitly or explicitly.
Another aspect of time variance is that data cannot be updated or changed once inserted into the warehouse.
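One way to picture a time element in the record key is a composite primary key that includes a date. The following sketch assumes a hypothetical `balance_snapshot` table with month-end snapshots and uses an in-memory SQLite database. Each new period adds a row instead of overwriting the previous one, so the history stays queryable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE balance_snapshot (
        account_id    INTEGER NOT NULL,
        snapshot_date TEXT    NOT NULL,  -- explicit time element in the key
        balance       REAL    NOT NULL,
        PRIMARY KEY (account_id, snapshot_date)
    )
    """
)

# Hypothetical month-end balances for a single account.
conn.executemany(
    "INSERT INTO balance_snapshot VALUES (?, ?, ?)",
    [(42, "2024-01-31", 1000.0), (42, "2024-02-29", 1250.0), (42, "2024-03-31", 900.0)],
)

# The full history of the account remains available for analysis.
history = conn.execute(
    "SELECT snapshot_date, balance FROM balance_snapshot "
    "WHERE account_id = ? ORDER BY snapshot_date",
    (42,),
).fetchall()
```

Because the date is part of the key, loading the same account for the same date twice raises an integrity error, which guards against accidental duplicate loads.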
A data warehouse is non-volatile: previous data is not erased when new data is entered into it.
Data is read-only and is refreshed periodically. This aids in analyzing historical data and understanding what occurred and when. It does not require transaction, recovery, or concurrency control mechanisms.
Delete, update, and insert operations performed in an operational application environment are not performed in a data warehouse environment.
Only two types of data operations are performed in data warehouses.
- Data loading
- Data access
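These two operations can be modeled as a small append-only structure. This is a toy sketch, not how a warehouse engine is built: the `AppendOnlyTable` class and its `load` and `access` methods are hypothetical names chosen to mirror the list above.

```python
class AppendOnlyTable:
    """A warehouse-style table exposing only data loading and data access."""

    def __init__(self):
        self._rows = []

    def load(self, batch):
        """Data loading: append a batch; existing rows are never modified."""
        self._rows.extend(dict(row) for row in batch)

    def access(self, predicate=lambda row: True):
        """Data access: return copies of matching rows so callers cannot alter storage."""
        return [dict(row) for row in self._rows if predicate(row)]


sales = AppendOnlyTable()
sales.load([{"month": "2024-01", "region": "North", "amount": 100.0}])
sales.load([{"month": "2024-02", "region": "North", "amount": 150.0}])

north_history = sales.access(lambda row: row["region"] == "North")
north_history[0]["amount"] = 0.0  # changes only the returned copy, not the stored row
```

The class deliberately offers no update or delete method, which reflects the non-volatile behavior described earlier.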
Data warehouse architecture
Data warehouse architecture can be difficult to understand because a data warehouse is an information system that contains historical and cumulative data from multiple sources. There are three approaches to building data warehouse layers: single-tier, two-tier, and three-tier.
The three architectures are described below.
One-tier architecture
The goal of a single-tier architecture is to store as little data as possible by eliminating data redundancy. In practice, this architecture is rarely used.
Two-tier architecture
A two-tier architecture physically separates the available sources from the data warehouse. This architecture is not scalable and cannot accommodate many end users. It also has connectivity issues due to network constraints.
Three-Tier Architecture
This is the most common data warehouse architecture.
It is divided into three tiers: top, middle, and bottom.
The top tier
The top tier is the front-end client layer. The top-tier tools and APIs connect to and retrieve data from the data warehouse. Query, reporting, managed query, analysis, and data mining tools are all possibilities.
The middle tier
In a data warehouse, the middle tier is an OLAP server implemented using either the relational OLAP (ROLAP) or multidimensional OLAP (MOLAP) model. This application tier presents an abstract view of the database to the user and serves as a go-between for the end user and the database.
The bottom tier
The database of the data warehouse server serves as the bottom tier. Typically, it is a relational database system. Back-end tools cleanse, transform, and load data into this layer.
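To make the division of responsibilities concrete, the three tiers can be sketched as three small functions. Everything here is an assumption made for illustration: the `sales` table, the `build_bottom_tier`, `total_by`, and `render_report` names, and the use of in-memory SQLite as the relational store. The middle function only loosely resembles a ROLAP server, in that it translates an abstract request into SQL against the relational bottom tier.

```python
import sqlite3


def build_bottom_tier():
    """Bottom tier: a relational database holding the loaded warehouse data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("North", 2023, 100.0), ("North", 2024, 150.0),
         ("South", 2023, 80.0), ("South", 2024, 120.0)],
    )
    return conn


ALLOWED_DIMENSIONS = {"region", "year"}


def total_by(conn, dimension):
    """Middle tier: turn an abstract 'total by dimension' request into SQL."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    rows = conn.execute(
        f"SELECT {dimension}, SUM(amount) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    ).fetchall()
    return dict(rows)


def render_report(totals):
    """Top tier: a front-end client that formats results for the end user."""
    return "\n".join(f"{key}: {value:.2f}" for key, value in totals.items())


conn = build_bottom_tier()
report = render_report(total_by(conn, "region"))
```

The top tier never sees SQL or table names. It only asks the middle tier for totals by a named dimension, which is the abstract view of the database that the middle tier is meant to provide.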
Key takeaways
- Data warehouse architecture defines the overall design of data communication, processing, and presentation for end-client computing within the enterprise.
- Data warehouse applications support ad-hoc data requests from users through a process known as online analytical processing (OLAP). Examples include forecasting, profiling, summary reporting, and trend analysis.
- A data warehouse is an information system that contains historical and cumulative data from multiple sources. There are three approaches to building data warehouse layers: single-tier, two-tier, and three-tier.
Q1. What are the major components of a data warehouse?
Ans- A typical data warehouse consists of four major components: a central database, ETL (extract, transform, and load) tools, metadata, and access tools. These components are designed to be fast, allowing you to get results quickly and analyze data on the fly.
Q2. What are the 3 types of data sets?
Ans- Data sets are commonly divided into three types: record data, graph-based data, and ordered data.
Q3. What is a data warehouse used for?
Ans- A data warehouse is specifically designed for data analytics, which entails reading large amounts of data to understand relationships and trends. An operational database, by contrast, captures and stores data such as transaction details.