Continuing our series on best practice in data quality, we turn to customer data record quality reporting.
I hope you’ve been finding Paul’s series useful. It has certainly reminded me of the need to focus on monitoring your data quality. It’s such an essential foundation for more advanced analytics work. Plus, our joint event with MyCustomer reminded delegates that data quality is a GDPR requirement.
Next, in this post, Paul considers the need to also check the quality at data record level. It’s all very well individual fields being accurate, but how do you check whole records are valid?
Over to Paul, to complete his 3-part series, with recommended record-level checks…
Record-by-Record Quality Measures
Completeness is the core measure at the Record Level. It requires the definition of a number of levels of completeness, for each type of record being managed. Typically, there will be three to five levels, often including:
- Complete: where all fields are populated to the quality levels from the
- Extended: where key value-added fields, over and above ‘Core’, are populated.
- Core: where all the basic fields (necessary for the record to support business processes) are populated.
- Enhanceable: where the record cannot currently be used, but could be enhanced to enable its use.
- Sub-Standard: where the record should be deleted, or archived, as appropriate.
The proportion of records measured at each of these levels is the key indicator of the overall ‘health’ of each data table.
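To make the idea concrete, here is a minimal sketch of how records might be assigned to a completeness level and the proportions reported. The field groupings (`CORE_FIELDS`, `EXTENDED_FIELDS`, `ENHANCEABLE_FIELDS`) and the field names are purely hypothetical; each organisation would define its own. The ‘Complete’ level is left out, because it depends on field-level quality checks that are outside this sketch.

```python
from collections import Counter

# Hypothetical field groupings, for illustration only.
CORE_FIELDS = ("name", "postcode")
EXTENDED_FIELDS = ("email", "phone")
ENHANCEABLE_FIELDS = ("name",)  # assumed minimum needed to make enhancement feasible


def is_populated(record, field):
    """A field counts as populated if it is present and not blank."""
    value = record.get(field)
    return value is not None and str(value).strip() != ""


def completeness_level(record):
    """Assign a single record to one of the completeness levels."""
    core = all(is_populated(record, f) for f in CORE_FIELDS)
    extended = all(is_populated(record, f) for f in EXTENDED_FIELDS)
    if core and extended:
        return "Extended"
    if core:
        return "Core"
    if all(is_populated(record, f) for f in ENHANCEABLE_FIELDS):
        return "Enhanceable"
    return "Sub-Standard"


def level_proportions(records):
    """Return the share of records at each level, as fractions of the table."""
    if not records:
        return {}
    counts = Counter(completeness_level(r) for r in records)
    return {level: count / len(records) for level, count in counts.items()}
```

Calling `level_proportions` on a table of customer records gives the figures that would typically be tracked over time in a quality report.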
Most databases record when individual records are created or updated. The time elapsed since these events normally forms the core of this measure. Ideally, the organisation also needs to store the date on which at least the key fields were validated with, or by, the customer.
This is sometimes achieved by specific contact activity, to check details with customers. In other situations, it can be derived from other triggers, such as the successful delivery of a posted communication, which validates an address. Another example would be the creation of a service case on a product holding, which validates that the holding is still current.
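As a rough sketch of the elapsed-time measure described above, the following assumes each record carries optional `validated_on`, `updated_on` and `created_on` dates (hypothetical field names). It prefers the customer-validated date where one exists, and the age bands are arbitrary examples rather than recommended thresholds.

```python
from datetime import date


def days_since_last_confirmation(record, today):
    """Days since the strongest available date: validated, then updated, then created."""
    for field in ("validated_on", "updated_on", "created_on"):
        if record.get(field) is not None:
            return (today - record[field]).days
    return None  # no usable date on the record


def age_band(days):
    """Group an age in days into illustrative reporting bands."""
    if days is None:
        return "Unknown"
    if days <= 365:
        return "Up to 1 year"
    if days <= 730:
        return "1 to 2 years"
    return "Over 2 years"
```

Passing `today` in explicitly, rather than reading the system clock inside the function, keeps the measure repeatable when a report is re-run for an earlier period.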
This is, effectively, a measure of duplication levels in the tables of the database. The mechanisms for identifying duplicates warrant a whole article on their own.
They can vary from simple ‘exact match’ measures to measures based on sophisticated ‘fuzzy’ algorithms. As the sophisticated matching activity can be time-consuming and expensive to achieve, I have often recommended having two: a simple, frequent exact match measure and a periodic, more enhanced measure.
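The simple, frequent measure could look something like the sketch below. It assumes each record has an `id`, a `name` and a `postcode` (hypothetical field names), and treats two records as duplicates when name and postcode match exactly after basic tidying of case and spacing. The choice of matching fields is an assumption; fuzzy matching is deliberately not attempted here.

```python
from collections import defaultdict


def match_key(record):
    """Build an exact-match key from name and postcode, ignoring case and spacing."""
    name = " ".join(str(record.get("name") or "").lower().split())
    postcode = "".join(str(record.get("postcode") or "").upper().split())
    return (name, postcode)


def duplicate_groups(records):
    """Return lists of record ids that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


def duplication_rate(records):
    """Share of records that are surplus copies of another record."""
    if not records:
        return 0.0
    surplus = sum(len(ids) - 1 for ids in duplicate_groups(records))
    return surplus / len(records)
```

Counting only the surplus copies, rather than every record in a duplicate group, means the rate reflects how many records would disappear if the duplicates were merged.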
This is getting into the outer reaches of typical data quality monitoring, and is normally limited to detecting very specific quality challenges.
An example, for a Motor Manufacturer, would be detecting where the same Product (Vehicle) had been sold to multiple customers with a ‘Vehicle Status’ of “New”. An example, for an organisation that leases business equipment, would be detecting where customers had apparently received service visits on equipment they are not leasing. These rules would normally be individually defined and probably individually reported, as in the example.
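The Motor Manufacturer rule could be sketched along these lines. The sale record layout (`vehicle_id`, `customer_id`, `vehicle_status`) is assumed for illustration and will differ from any real schema.

```python
from collections import defaultdict


def vehicles_sold_new_more_than_once(sales):
    """Find vehicles recorded as sold 'New' to more than one distinct customer."""
    customers_by_vehicle = defaultdict(set)
    for sale in sales:
        if sale["vehicle_status"] == "New":
            customers_by_vehicle[sale["vehicle_id"]].add(sale["customer_id"])
    # Keep only vehicles with two or more different customers.
    return {
        vehicle: sorted(customers)
        for vehicle, customers in customers_by_vehicle.items()
        if len(customers) > 1
    }
```

The result lists each suspect vehicle alongside the customers involved, which suits the individual, rule-by-rule reporting that these checks normally need.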
Customer Data Record Quality
I hope you found those examples useful. Thanks again to Paul, for sharing his experience with us all.
How complete is your customer data quality reporting? How many of Paul’s recommended 11 field-level metrics do you have in place? Do you have a version of all 4 Customer Data Record quality reporting metrics in your reports?
It would be great to hear feedback on Paul’s recommended best practice. Do you recognise those quality metrics? Do you recommend any others from your experience?