What is Data Reconciliation? Definition, Process, Tools
What is Data Reconciliation?
Data reconciliation (DR) is the process of verifying data during data migration. In this process, target data is compared with source data to ensure that the migration architecture is transferring data correctly. Data validation and reconciliation (DVR) is a technology that uses mathematical models to process information.
Why is Data Reconciliation important?
In the data migration process, mistakes can be made in the mapping and transformation logic. Run-time failures such as network dropouts or broken transactions can also corrupt data.
Errors of this kind can leave data in an invalid state, which may create a range of issues such as:
- Missing records
- Missing values
- Incorrect values
- Duplicated records
- Badly formatted values
- Broken relationships across tables or systems
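Several of these issues can be detected by comparing source and target records on a shared key. The following is a minimal sketch in Python, assuming each record is a dict with a hypothetical `id` key. The function name `compare_records` and the sample data are illustrative only and are not part of any particular tool.

```python
from collections import Counter

def compare_records(source, target, key="id"):
    """Compare two lists of dict records on a shared key (hypothetical 'id')."""
    source_by_key = {row[key]: row for row in source}
    target_by_key = {row[key]: row for row in target}
    target_counts = Counter(row[key] for row in target)

    # Records present in the source but absent from the target.
    missing = sorted(k for k in source_by_key if k not in target_by_key)
    # Keys that appear more than once in the target.
    duplicated = sorted(k for k, n in target_counts.items() if n > 1)
    # Records present on both sides whose field values differ.
    mismatched = sorted(
        k for k, row in source_by_key.items()
        if k in target_by_key and target_by_key[k] != row
    )
    return {"missing": missing, "duplicated": duplicated, "mismatched": mismatched}

source = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": "Rome"},
    {"id": 3, "name": "Cho", "city": "Lima"},
]
target = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": None},  # missing value
    {"id": 2, "name": "Bob", "city": None},  # duplicated record
]
report = compare_records(source, target)
print(report)
```

In this sample, record 3 is reported as missing, and record 2 is reported as both duplicated and mismatched. Badly formatted values and broken relationships would need additional, format-specific checks.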
Here are important reasons for using the Data Reconciliation process:
- Data Reconciliation helps you extract accurate and reliable information about the state of an industrial process from raw measurement data.
- It also helps you produce a single consistent set of data representing the most likely process operation.
- It also helps you avoid the inaccurate insight and customer service issues that unreconciled data can cause.
- Reconciliation of data is also important for enterprise-control integration.
Apart from the above, there are many other advantages/benefits of Data Reconciliation.
Terminology associated with Data Reconciliation
Gross Error | A large, non-random error in measurements. If you are using only a short time-averaging period, it reflects only bias errors, instrument failures, or abnormal noise spikes. |
Observability | Observability analysis tells you which variables can be determined for a given set of constraints and a set of measurements. |
Variance | Variance is a measure of the variability of a sensor. |
Redundancy | Redundancy analysis helps you determine which measurements can be estimated from other variables by using the constraint equations. |
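To make these terms concrete, here is a minimal sketch of one common formulation of reconciliation: weighted least squares under a single linear constraint. It assumes a hypothetical flow splitter where one inlet flow must equal the sum of two outlet flows. The measurements, variances, and the function name `reconcile` are made up for illustration. Three measurements and one constraint give the system redundancy, and sensors with a larger variance receive a larger correction.

```python
def reconcile(measured, variances, coeffs):
    """Weighted least-squares reconciliation under one linear constraint
    sum(coeffs[i] * x[i]) == 0. Returns the adjusted values."""
    # Constraint residual: how far the raw measurements are from balancing.
    residual = sum(a * y for a, y in zip(coeffs, measured))
    # Scaling term a^T * Sigma * a for a diagonal covariance matrix.
    scale = sum(a * a * v for a, v in zip(coeffs, variances))
    # Each measurement is corrected in proportion to its variance.
    return [y - v * a * residual / scale
            for y, v, a in zip(measured, variances, coeffs)]

# Hypothetical splitter: inlet = outlet_1 + outlet_2, so coeffs = [1, -1, -1].
measured = [100.0, 60.0, 37.0]   # raw flows; 100 != 60 + 37
variances = [4.0, 1.0, 1.0]      # the inlet sensor is the least precise
reconciled = reconcile(measured, variances, [1, -1, -1])
print(reconciled)  # [98.0, 60.5, 37.5]
```

The reconciled values satisfy the balance exactly (98.0 = 60.5 + 37.5). A correction that is very large relative to a sensor's variance would be a hint that a gross error is present, although this sketch does not test for that.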
History of Data Reconciliation
Here are essential landmarks from the history of Data Reconciliation.
- DVR (Data Validation and Reconciliation) started in the early 1960s. It was aimed at closing material balances in production processes where raw measurements were available for all variables.
- In the late 1960s, unmeasured variables were also taken into account in the data reconciliation process.
- Quasi-steady state dynamics for filtering and parallel parameter estimation over time were introduced in 1977 by Stanley and Mah.
- Dynamic DVR was formulated as a nonlinear optimization model, published by Liebman in 1992.
Data Reconciliation Process
Types of Data Reconciliation methods are:
Master Data Reconciliation
Master data reconciliation is a technique of reconciling only the master data between source and target. Master data is mostly unchanging or slowly changing in nature, and no aggregation operation is done on the dataset.
A few common examples of master data reconciliation checks are:
- Total number of rows
- Total number of customers in source and target
- Total number of items in source and target
- Total count of rows based on given condition
- Number of active users
- Number of inactive users etc.
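The count-based checks above can be sketched with a few SQL queries run against both sides. The example below is only an illustration: it uses Python's built-in `sqlite3` module with an in-memory database, and the table names `src_users` and `tgt_users`, the `active` column, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_users (id INTEGER, active INTEGER);
    CREATE TABLE tgt_users (id INTEGER, active INTEGER);
    INSERT INTO src_users VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO tgt_users VALUES (1, 1), (2, 0), (3, 0);
""")

# Each check is a label plus a query template run against both tables.
checks = {
    "total rows": "SELECT COUNT(*) FROM {table}",
    "active users": "SELECT COUNT(*) FROM {table} WHERE active = 1",
    "inactive users": "SELECT COUNT(*) FROM {table} WHERE active = 0",
}

results = {}
for label, query in checks.items():
    src = conn.execute(query.format(table="src_users")).fetchone()[0]
    tgt = conn.execute(query.format(table="tgt_users")).fetchone()[0]
    results[label] = (src, tgt, src == tgt)
    print(f"{label}: source={src} target={tgt} match={src == tgt}")
```

In this sample the total row counts match, but the conditional counts do not, which shows why a plain row count is usually combined with counts based on a given condition.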
Accuracy of Activity
- You need to make sure that transactions are valid and correct in purpose.
- You need to check that the transactions have been properly authorized.
Transactional Data Reconciliation
Transactional data forms the basis of BI reports. Therefore, any mismatch in transactional data can directly impact the reliability of the report and of the whole BI system in general.
Transactional data reconciliation is performed in terms of total sums, which prevents mismatches caused by changing the granularity of qualifying dimensions.
Examples of measures used for transactional data reconciliation are:
- Sum of total income calculated from source and target
- Sum of all items sold, calculated from source and target, etc.
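As a rough illustration of why totals are compared, the sketch below assumes a source that stores one row per order line and a target that stores one row per day. The field names (`day`, `item`, `qty`, `income`) and the figures are hypothetical. Row-by-row comparison is not possible at different granularities, but the grand totals should still agree.

```python
from decimal import Decimal

# Source: one row per order line (fine granularity).
source_lines = [
    {"day": "2024-01-01", "item": "A", "qty": 2, "income": Decimal("19.98")},
    {"day": "2024-01-01", "item": "B", "qty": 1, "income": Decimal("5.00")},
    {"day": "2024-01-02", "item": "A", "qty": 3, "income": Decimal("29.97")},
]
# Target: one row per day (coarser granularity after aggregation).
target_daily = [
    {"day": "2024-01-01", "qty": 3, "income": Decimal("24.98")},
    {"day": "2024-01-02", "qty": 3, "income": Decimal("29.97")},
]

def totals(rows):
    """Return (total income, total items sold) for a list of rows."""
    return (sum(r["income"] for r in rows), sum(r["qty"] for r in rows))

source_totals = totals(source_lines)
target_totals = totals(target_daily)
print(source_totals, target_totals, source_totals == target_totals)
```

`Decimal` is used here instead of floats so that monetary sums compare exactly.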
Automated Data Reconciliation
In a large data warehouse management system, it is convenient to automate the data reconciliation process by making it an integral part of data loading. This allows you to maintain separate loading metadata tables. Moreover, automated reconciliation keeps all the stakeholders informed about the validity of the reports.
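One possible way to wire reconciliation into a load step is sketched below, again using Python's built-in `sqlite3` module. Everything named here is an assumption for illustration: the target table `tgt_sales`, the metadata table `load_audit`, the `batch_id` column, and the function `load_and_reconcile`. Amounts are stored as integer cents so that sums compare exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tgt_sales (batch_id INTEGER, id INTEGER, amount_cents INTEGER);
    -- Hypothetical loading metadata table, kept separate from the data.
    CREATE TABLE load_audit (
        batch_id INTEGER, src_count INTEGER, tgt_count INTEGER,
        src_sum INTEGER, tgt_sum INTEGER, status TEXT
    );
""")

def load_and_reconcile(conn, batch_id, rows):
    """Load one batch, then record count and sum checks for that batch."""
    conn.executemany(
        "INSERT INTO tgt_sales VALUES (?, ?, ?)",
        [(batch_id, row_id, cents) for row_id, cents in rows],
    )
    # Source-side figures come from the rows handed to the loader.
    src_count = len(rows)
    src_sum = sum(cents for _, cents in rows)
    # Target-side figures are read back from what was actually stored.
    tgt_count, tgt_sum = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(amount_cents), 0) "
        "FROM tgt_sales WHERE batch_id = ?", (batch_id,)
    ).fetchone()
    status = "OK" if (src_count, src_sum) == (tgt_count, tgt_sum) else "MISMATCH"
    conn.execute(
        "INSERT INTO load_audit VALUES (?, ?, ?, ?, ?, ?)",
        (batch_id, src_count, tgt_count, src_sum, tgt_sum, status),
    )
    conn.commit()
    return status

status = load_and_reconcile(conn, 1, [(101, 1999), (102, 500)])
audit_rows = conn.execute("SELECT * FROM load_audit").fetchall()
print(status, audit_rows)
```

Because every load writes a row to `load_audit`, a report or dashboard can query that table to show stakeholders whether the latest batch reconciled.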
Best practices of Using Data Reconciliation
- The data reconciliation process should aim at correcting measurement errors.
- Gross errors should be zero to make the data reconciliation process efficient.
- The standard approach to Data Reconciliation has relied on simple record counts to track whether the targeted number of records has migrated.
- A data migration solution should deliver similar reconciliation capabilities, along with data prototyping functionality that supports full-volume data reconciliation testing.
Data Reconciliation Tools
1) OpenRefine
OpenRefine, formerly known as Google Refine, is a useful data reconciliation framework. It allows you to clean and transform messy data.
Download link: https://openrefine.org/
2) TIBCO Clarity
This data reconciliation tool is offered as an on-demand web service in the form of Software-as-a-Service. It allows users to validate and cleanse data, and it provides complete reconciliation testing features. It is widely used in ETL processes.
Download Link: https://www.tibco.com/
3) Winpure
Winpure is affordable and accurate data cleaning software. It allows you to clean a large amount of data by removing duplicates and by correcting and standardizing values to produce the final data set.
Download Link: https://winpure.com/
Summary
- Data validation and reconciliation (DVR) is a technology which uses mathematical models to process information.
- Data reconciliation helps you extract accurate and reliable information about the state of an industrial process from raw measurement data.
- Gross Error, Observability, Variance, Redundancy are important terms used in Data Reconciliation process
- Data validation and Reconciliation started in the early 1960s.
- Three types of Data Reconciliation methods are 1) Master Data Reconciliation 2) Transactional Data Reconciliation 3) Automated Data Reconciliation
- Gross errors should be zero to make the data reconciliation process efficient.
- Some important Data Reconciliation tools are: 1) OpenRefine 2) TIBCO Clarity 3) Winpure
- This method is widely used for performance and process monitoring in the oil refining, nuclear, and chemical industries.