What is ETL ?
1. ETL stands for Extract-Transform-Load.
Extract is the process of reading data from a source database/ transactional
system.
Transform is the process of converting the extracted data to required form.
Load is the process of writing the data into the target database/ analytical
system.
2. It is a process which defines how data is loaded from the source system to
the target system (data warehouse).
What is Data warehouse?
1. It is a database designed for querying and analysis rather than for transaction processing.
2. It separates analysis workload from transaction system.
3. This helps in:
i. Maintaining historical records
ii. Analyzing the data to gain a better understanding of the business and to improve the
business.
What is Data warehouse?
4. Data warehouse is a subject-oriented, integrated, time varying, non-volatile collection of
data in support of the management’s decision-making process.
i. Subject Oriented: This is used to analyze particular subject area.
ii. Integrated: This shows that integrates data from different sources.
iii. Time variant: Historical data is usually maintained in a Data warehouse, i.e. retrieval can be for
any period. In transactional system only the most recent/current data is maintained. But in the
Data warehouse recent/current and the previous/historical data is maintained.
iv. Non-Volatile: Once the data is placed in the data warehouse, it cannot be changed, which
means we will never be able to change the data
OLTP (Online Transaction Processing System):
1. OLTP is nothing but a database which actually stores the daily transactions which
are created from one and more applications.
2. Data in OLTP is called as the current data.
3. Mostly normalized data is used in OLTP system.
OLAP (Online Analytical Processing System) :
1. OLAP is use to store analytical data
2. It deals with analyzing the data for decision making and planning, designing etc.
3. Data in OLAP is called as the Historical data.
4. Mostly Denormalized data is used in OLAP system.