Sunday, March 22, 2009

"Data Quality"- What , Who and how?

A data set that meets the following four conditions is a quality data set.

(A) Complete: it must have all attributes and measures (as agreed with customers).

(B) Correct: each attribute must carry the right values.

(C) Consistent: each attribute of the data set must carry the same meaning wherever else it appears in the enterprise, whether in the same form or in a derived form.

(D) Available on time (as agreed with customers).

In other words, a compromise on any one or more of these conditions produces poor quality data. Poor quality data increases the development and/or maintenance costs of adding new data and of modifying or maintaining existing data.
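As a rough illustration, the four conditions can be expressed as checks on a small data set. This is only a sketch under stated assumptions: the function name failed_conditions, the field names customer_id and revenue, the "meanings" dictionaries standing in for enterprise-wide definitions, and the sample timestamps are all hypothetical and chosen for the example, not taken from any particular tool.

```python
from datetime import datetime


def failed_conditions(rows, required, validators, meanings,
                      enterprise_meanings, delivered_at, deadline):
    """Return the set of conditions ("A".."D") that the data set violates.

    All parameter names are illustrative assumptions for this sketch.
    """
    failed = set()
    for row in rows:
        # (A) Complete: every agreed attribute or measure is present.
        if any(row.get(name) is None for name in required):
            failed.add("A")
        # (B) Correct: each present value passes its validity rule.
        for name, is_valid in validators.items():
            value = row.get(name)
            if value is not None and not is_valid(value):
                failed.add("B")
    # (C) Consistent: an attribute means the same thing here as elsewhere.
    for name, meaning in meanings.items():
        if name in enterprise_meanings and enterprise_meanings[name] != meaning:
            failed.add("C")
    # (D) Available on time: delivered no later than the agreed deadline.
    if delivered_at > deadline:
        failed.add("D")
    return failed


rows = [
    {"customer_id": 1, "revenue": 120.0},
    {"customer_id": 2, "revenue": -5.0},   # negative revenue breaks (B)
    {"customer_id": 3, "revenue": None},   # missing measure breaks (A)
]
result = failed_conditions(
    rows,
    required=["customer_id", "revenue"],
    validators={"revenue": lambda v: v >= 0},
    meanings={"revenue": "net of tax"},
    enterprise_meanings={"revenue": "gross of tax"},  # differs, breaks (C)
    delivered_at=datetime(2009, 3, 22, 9, 0),
    deadline=datetime(2009, 3, 22, 8, 0),             # late, breaks (D)
)
print(sorted(result))  # ['A', 'B', 'C', 'D']
```

A data set counts as a quality data set in the sense above only when the returned set is empty; a single failed condition is enough to make it poor quality data.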

What explains the existence of poor quality data? Either one or a combination of the following reasons:

1. Misaligned goals between top management and middle management (a lack of vision, or of the ability to translate that vision into execution and action, on the part of the top management team).

2. Lack of technical skills.

Stakeholders responsible for data within an enterprise broadly fall into two categories:

(1) Middle management and team members: responsible for meeting conditions A and B, part of C, and part of D.
(2) Top management and senior management: largely responsible for meeting conditions C and D.

Misaligned goals between these two categories of people lead to suboptimal decisions. It is not uncommon for sponsors and managers of programs to compromise on quality because they do not think the incremental costs of achieving the required consistency and latency are justified. It is top management's responsibility to understand the importance of data quality, to openly acknowledge it and periodically communicate it to the entire organization, to align goals, and to provide the right organization design, with a focus on data governance, data management initiatives, and best practices.