Data reconciliation (DR) is the process of verifying data during data migration. In this process, target data is compared with source data to ensure that the migration architecture is transferring data correctly. Data validation and reconciliation (DVR) is a technology that uses mathematical models to process information.
In the data migration process, mistakes can be made in the mapping and transformation logic. Runtime failures such as network dropouts or broken transactions can also corrupt data.
These kinds of errors can leave data in an invalid state, which may create a range of issues like:
Here are important reasons for using the data reconciliation process:
Apart from the above, there are many other advantages of data reconciliation.
Term | Description |
Gross Error | An error in measurements that reflects bias errors, instrument failures, or abnormal noise spikes when only a short time-averaging period is used. |
Observability | Observability analysis tells you which variables can be determined for a given set of constraints and a given set of measurements. |
Variance | Variance is a measure of the variability of a sensor. |
Redundancy | Redundancy analysis helps you determine which measurements can be estimated from other variables by using the constraint equations. |
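The terms above come together in the classic weighted least-squares form of data validation and reconciliation: each measurement is adjusted in proportion to its variance so that the constraint equations hold. The following is a minimal sketch rather than a production method. It assumes a single linear constraint (a hypothetical flow balance, inlet = outlet_1 + outlet_2), and the function name `reconcile_linear`, the readings, and the variances are all made up for illustration.

```python
def reconcile_linear(measurements, variances, coeffs):
    """Adjust measurements so that sum(coeffs[i] * x[i]) == 0.

    Minimizes sum((x[i] - y[i]) ** 2 / variances[i]) subject to one
    linear constraint, using the closed-form Lagrange multiplier solution.
    """
    # Constraint residual of the raw measurements (0 means already consistent).
    residual = sum(a * y for a, y in zip(coeffs, measurements))
    # Scaling term a^T V a, where V is the diagonal variance matrix.
    scale = sum(a * a * v for a, v in zip(coeffs, variances))
    # Each measurement moves in proportion to its own variance.
    return [y - v * a * residual / scale
            for y, v, a in zip(measurements, variances, coeffs)]


# Hypothetical flow balance: inlet - outlet_1 - outlet_2 = 0
measured = [100.0, 60.0, 45.0]   # made-up sensor readings
variance = [4.0, 1.0, 1.0]       # made-up sensor variances
balance = [1.0, -1.0, -1.0]      # constraint coefficients

reconciled = reconcile_linear(measured, variance, balance)
print(reconciled)
```

The inlet sensor has the largest variance, so it absorbs most of the correction, while the two more precise outlet readings move only slightly. Because there is one more measurement than strictly needed to close the balance, the system is redundant, which is what makes the adjustment possible.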
Here are essential landmarks from the history of Data Reconciliation.
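The types of Data Reconciliation methods are master data reconciliation, transactional data reconciliation, and automated data reconciliation.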
Master data reconciliation is a technique of reconciling only the master data between source and target. Master data is mostly unchanging or slowly changing in nature, and no aggregation operation is done on the dataset.
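Because master data is not aggregated, it can be reconciled by comparing row counts and business keys directly. The sketch below illustrates that idea under stated assumptions: the rows are plain dictionaries, the key column `customer_id` and the function `reconcile_master` are hypothetical, and the sample records are invented.

```python
def reconcile_master(source_rows, target_rows, key):
    """Compare master data between source and target by row count and key."""
    source_keys = {row[key] for row in source_rows}
    target_keys = {row[key] for row in target_rows}
    return {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        # Keys that were in the source but never arrived in the target.
        "missing_in_target": sorted(source_keys - target_keys),
        # Keys that exist in the target without a matching source row.
        "unexpected_in_target": sorted(target_keys - source_keys),
    }


# Invented sample data: customer 2 was lost during migration.
source = [
    {"customer_id": 1, "active": True},
    {"customer_id": 2, "active": True},
    {"customer_id": 3, "active": False},
]
target = [
    {"customer_id": 1, "active": True},
    {"customer_id": 3, "active": False},
]

report = reconcile_master(source, target, key="customer_id")
print(report)
```

A count comparison alone shows that something is wrong, while the key comparison shows which record is affected.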
A few common examples of master data reconciliation are:
Transactional data forms the basis of BI reports. Therefore, any mismatch in transactional data can directly impact the reliability of the reports and of the whole BI system in general.
Transactional data reconciliation compares the total sum of each measure between source and target, which avoids mismatches caused by changes in the granularity of the qualifying dimensions.
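Comparing totals works because a sum does not depend on how the rows are grouped. The following sketch shows this with an order-level source and a month-level target. The column names, the helper `total`, and the figures are assumptions made for illustration, and `Decimal` is used here to avoid floating-point rounding in monetary sums.

```python
from decimal import Decimal


def total(rows, measure):
    """Sum one measure over all rows, regardless of their granularity."""
    return sum((Decimal(row[measure]) for row in rows), Decimal("0"))


# Source: one row per order (fine-grained).
source_orders = [
    {"order_date": "2024-01-05", "amount": "120.50"},
    {"order_date": "2024-01-19", "amount": "79.50"},
    {"order_date": "2024-02-02", "amount": "200.00"},
]
# Target: one row per month (aggregated during loading).
target_monthly = [
    {"month": "2024-01", "amount": "200.00"},
    {"month": "2024-02", "amount": "200.00"},
]

source_total = total(source_orders, "amount")
target_total = total(target_monthly, "amount")
is_reconciled = source_total == target_total
print(source_total, target_total, is_reconciled)
```

The row counts differ (three orders against two months), so a count check would not be meaningful here, but the totals still have to agree.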
Examples of measures used for transactional data reconciliation are:
In a large data warehouse management system, it is convenient to automate data reconciliation by making it an integral part of data loading. This allows you to maintain separate loading metadata tables. Moreover, automated reconciliation keeps all the stakeholders informed about the validity of the reports.
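One way to picture reconciliation as part of the load itself is a loader that writes its own check result to a metadata table after every batch. The sketch below uses an in-memory SQLite database for illustration only. The table names `sales` and `load_audit`, their columns, and the function `load_with_reconciliation` are hypothetical, and a real warehouse would likely compare more than row counts.

```python
import sqlite3


def load_with_reconciliation(conn, batch_id, source_rows):
    """Load a batch into the target table and log a reconciliation result."""
    conn.executemany(
        "INSERT INTO sales (batch_id, amount) VALUES (?, ?)",
        [(batch_id, amount) for amount in source_rows],
    )
    # Count what actually landed in the target for this batch.
    (target_count,) = conn.execute(
        "SELECT COUNT(*) FROM sales WHERE batch_id = ?", (batch_id,)
    ).fetchone()
    status = "OK" if target_count == len(source_rows) else "MISMATCH"
    # Record the outcome in a separate loading metadata table.
    conn.execute(
        "INSERT INTO load_audit (batch_id, source_count, target_count, status) "
        "VALUES (?, ?, ?, ?)",
        (batch_id, len(source_rows), target_count, status),
    )
    conn.commit()
    return status


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (batch_id INTEGER, amount REAL)")
conn.execute(
    "CREATE TABLE load_audit ("
    "batch_id INTEGER, source_count INTEGER, target_count INTEGER, status TEXT)"
)

status = load_with_reconciliation(conn, batch_id=1, source_rows=[10.0, 25.5, 7.25])
audit_rows = conn.execute("SELECT * FROM load_audit").fetchall()
print(status, audit_rows)
```

Reports or alerts for stakeholders can then be driven from the audit table instead of from ad hoc checks.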
OpenRefine, formerly known as Google Refine, is a useful framework for data reconciliation. It allows you to clean and transform messy data.
Download link: https://openrefine.org/
TIBCO Clarity is a data reconciliation tool that offers on-demand software services from the web in the form of Software-as-a-Service. It allows users to validate and cleanse data, and it provides complete reconciliation testing features. It is widely used in ETL processes.
Download Link: https://clarity.cloud.tibco.com/landing/index.html
WinPure is an affordable and accurate data cleaning software. It allows you to clean a large amount of data by removing duplicates and by correcting and standardizing values to produce the final data set.
Download Link: https://winpure.com/