Data Quality Approach
Given the consequences of bad data, companies need to understand how to evaluate data against their own needs. This includes establishing metrics and processes to assess data quality. According to an article on data quality assessment by Pipino, Lee, and Wang, companies should strive for data that scores high in both objective and subjective assessments.
To improve data quality, organizations must complete the following steps:
- Evaluate objective and subjective data quality metrics
- Analyze the results and determine the reason behind any incongruities
- Determine next steps for improvement
Subjective assessments measure how stakeholders, analysts, data collectors, and others perceive the quality of the data. If a stakeholder must make a business decision based on a dataset they believe is incomplete or inaccurate, that perception will affect the decision they make.
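The comparison step in the list above can be sketched in a few lines of Python. This is only an illustration: the dimension names, the 0-to-1 scores, the `find_incongruities` helper, and the 0.2 threshold are all assumptions made for the example, not values from the article.

```python
# Hypothetical scores on a 0-1 scale; names and values are illustrative only.
objective = {"completeness": 0.95, "consistency": 0.90, "timeliness": 0.60}
subjective = {"completeness": 0.55, "consistency": 0.85, "timeliness": 0.65}


def find_incongruities(objective, subjective, threshold=0.2):
    """Return dimensions where objective and subjective scores differ by more than threshold.

    The value is objective minus subjective, so a positive gap means the data
    measures better than stakeholders perceive it.
    """
    gaps = {}
    # Only compare dimensions that were scored in both assessments.
    for dimension in objective.keys() & subjective.keys():
        gap = objective[dimension] - subjective[dimension]
        if abs(gap) > threshold:
            gaps[dimension] = round(gap, 2)
    return gaps


incongruities = find_incongruities(objective, subjective)
print(incongruities)
```

Here only completeness is flagged: it measures well objectively but is perceived poorly, which is the kind of gap the second step asks organizations to investigate.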
Objective Data Quality Assessments
Objective data quality assessments look at measurements recorded in the dataset. These can be evaluated in the context of a given task, or independently from a purely metrics-based perspective. To establish metrics for objective assessment, organizations can develop KPIs that match their needs using general principles known as functional forms. Three functional forms are commonly used to score quality in objective assessments: simple ratio, min or max, and weighted average.
Simple Ratio
This measures the ratio of desired outcomes to total possible outcomes. The ratio ranges from 0 to 1, with 1 being the most desirable result.
Completeness and consistency can be measured with this ratio. However, each of these dimensions can be defined in different ways, so organizations need to decide which criteria measure them best.
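As a minimal sketch, the ratio described above can be applied to completeness by counting non-missing values. The records, the field names, and the choice to measure completeness cell by cell are assumptions for illustration; an organization might instead count whole records or only required fields.

```python
def simple_ratio(desired, total):
    """Ratio of desired outcomes to total possible outcomes, between 0 and 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    return desired / total


def completeness(records, fields):
    """Share of (record, field) cells that hold a value (assumed definition)."""
    total = len(records) * len(fields)
    filled = sum(1 for r in records for f in fields if r.get(f) is not None)
    return simple_ratio(filled, total)


# Hypothetical records; None marks a missing value.
records = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Grace", "email": None},
    {"name": None, "email": "alan@example.com"},
    {"name": "Edsger", "email": "edsger@example.com"},
]

score = completeness(records, ["name", "email"])
print(score)  # 6 of 8 cells are filled
```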
Min or Max
This functional form handles dimensions that require aggregating multiple data quality variables.
The min yields a more conservative score, since the dimension is rated no higher than its weakest variable, while the max yields a more liberal one. Dimensions such as the appropriate amount of data can be represented by min. Timeliness and accessibility can be represented by max.
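The sketch below shows one way the min and max operations might be applied. The two formulations used here (appropriate amount of data as the smaller of two ratios, and timeliness as the larger of zero and one minus currency over volatility) are assumed for illustration, as are the function names and input values.

```python
def appropriate_amount(units_provided, units_needed):
    """Conservative (min) score: penalizes both too little and too much data."""
    if units_provided <= 0 or units_needed <= 0:
        raise ValueError("unit counts must be positive")
    return min(units_provided / units_needed, units_needed / units_provided)


def timeliness(currency_days, volatility_days):
    """Liberal (max) score: floors the result at 0 once the data has gone stale.

    currency_days is the age of the data; volatility_days is how long it
    stays valid. Both names are assumptions made for this example.
    """
    if volatility_days <= 0:
        raise ValueError("volatility_days must be positive")
    return max(0.0, 1 - currency_days / volatility_days)


amount_score = appropriate_amount(units_provided=800, units_needed=1000)
fresh_score = timeliness(currency_days=5, volatility_days=20)
stale_score = timeliness(currency_days=30, volatility_days=20)
print(amount_score, fresh_score, stale_score)
```

Taking the min means an oversupply of data (1,250 units where 1,000 are needed) scores the same 0.8 as an undersupply, while the max keeps a stale dataset at 0 rather than letting the score go negative.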
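Weighted Average
This is an alternative to min and can be used when organizations understand how much each variable contributes to the overall score. Each variable is assigned a weight that reflects its importance, and the weights sum to 1.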
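The alternative to min described above can be sketched as a weighted average of variable scores. The variable names, scores, and weights below are hypothetical, and requiring the weights to sum to 1 is an assumption that keeps the result on the same 0-to-1 scale as the inputs.

```python
import math


def weighted_average(scores, weights):
    """Combine variable scores using importance weights that sum to 1."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same variables")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(scores[name] * weights[name] for name in scores)


# Hypothetical variables for a single dimension, each scored from 0 to 1.
scores = {"source_credibility": 0.9, "matches_standard": 0.6, "past_experience": 0.8}
weights = {"source_credibility": 0.5, "matches_standard": 0.3, "past_experience": 0.2}

weighted_score = weighted_average(scores, weights)
conservative_score = min(scores.values())
print(weighted_score, conservative_score)
```

With these inputs the weighted average comes to about 0.79, compared with 0.6 under min, because the weakest variable carries only 30 percent of the weight.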
After evaluating objective and subjective data quality metrics, organizations must take the next steps to improve their processes. Companies may find that their data is incomplete or falls short on other quality dimensions. Below, we outline some best practices for overcoming data quality challenges.