ETL Testing – Basics
ETL basically stands for Extract Transform Load – which simply implies the process where you extract data from Source Tables, transform them in to the desired format based on certain rules and finally load them onto Target tables. There are numerous tools that help you with ETL process – Informatica, Control-M being a few notable ones.
In ETL Testing, the following are validated:
- Data File loads from Source System on to Source Tables
- Transform Process that is designed to extract data from Source tables and move them to Staging tables
- Data Validation of all Mapping Rules/Transformation Rules within the Staging tables
- Data Validation within Target tables to ensure data is present in required format and there is no data loss from Source to Target tables
So ETL Testing implies – Testing this entire process using a tool or at table level with the help of test cases and Rules Mapping document.
Typically, data that is loaded into a data warehouse is derived from diverse sources of operational data, which may consist of data from databases, feeds, application files or flat files. The data must be extracted from these diverse sources, transformed to a common format, and loaded into the data warehouse. Typically, it is further aggregated into a data mart for efficient reporting. The ETL (Extract, transform and load) process is a critical step in any data warehouse implementation, and continues to be an area of major significance whenever the ETL code is updated.
An effective data warehouse testing strategy focuses on the main structures within the data warehouse architecture:
1. The ETL layer
2. The data warehouse itself
3. Associated data marts
4. The front-end business intelligence/reporting applications
ETL Testing is categorized into four different engagements:
- New Data Warehouse Testing- a new data warehouse is built from ground up, gathering inputs from customer, extracting different data sources. This is verified with the help of ETL tools
- Migration Testing – In this type of engagement, migrating from the current ETL tool to a better option to improve efficiency
- Change Request – In this type of project new data is added from different sources to an existing DW. Also, there might be a condition where customer needs to change their existing business rule or they might integrate the new rule.
- Report Testing- Validating report layout, data in the report and calculation
ETL Testing Challenges
Some of the ETL Testing Challenges include:
- Environment Instability
- Response time from the query executed, the failure of the jobs, the data set up required for the FIT testing, volume testing
- Data selection from multiple source systems and analysis that follows pose great challenge
- Volume and the complexity of the data
- Inconsistent and redundant data in a data warehouse
- Inconsistent and Inaccurate reports
- Non-availability of History data
No comments:
Post a Comment