The ORM Foundation

What is data quality?

Last post Sun, Mar 16 2014 16:40 by Anonymous. 7 replies.
  • Tue, Mar 4 2014 15:07

    • Ken Evans
    • Top 10 Contributor
      Male
    • Joined on Sun, Nov 18 2007
    • Stickford, UK
    • Posts 805

    What is data quality?

    According to Philip Crosby, "Quality is conformance to requirements."

    So if you don't have any requirements, then you can't have any quality data!

    So what are the requirements for data such that we can test for conformance?

     

  • Wed, Mar 5 2014 16:11 In reply to

    Re: What is data quality?

    Non sequitur. If you don't state your requirements, you can't measure your quality. That's different from not having quality, or not having requirements. The whole point of Crosby's dictum is that different situations have different requirements, so there can be no single standard set of requirements for data quality.

    That said, there are 25 pages on data quality in the DMBOK. Many concern the organisational processes around improving data quality, but there are some good definitions too. Section 12.2.3 is about defining data quality requirements. It defines Accuracy, Completeness, Consistency, Currency, Precision, Privacy, Reasonableness, Referential Integrity, Timeliness, Uniqueness, Validity. I don't agree with all the specific terms chosen - some overlap - but there are distinct concepts to each. Section 12.2.5 then recommends defining metrics, and 12.2.6 the business rules.
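
    As a rough illustration of what "defining metrics" for a few of those dimensions can look like, here is a minimal Python sketch covering Completeness, Uniqueness and Validity. The record layout, the field names and the e-mail rule are assumptions invented for the example; none of them comes from the DMBOK, and real metrics would follow from the business rules agreed for the data in question.

```python
import re

# Hypothetical customer records; the fields are illustrative assumptions.
records = [
    {"id": 1, "email": "ann@example.com", "country": "UK"},
    {"id": 2, "email": None, "country": "UK"},
    {"id": 2, "email": "bob@example", "country": "AU"},
    {"id": 4, "email": "cy@example.org", "country": None},
]

# Assumed validity rule: something@something.something, no whitespace.
EMAIL_RULE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def completeness(rows, field):
    """Fraction of rows in which `field` is populated (not None)."""
    return sum(row[field] is not None for row in rows) / len(rows)


def uniqueness(rows, field):
    """Number of distinct values of `field` divided by the number of rows."""
    return len({row[field] for row in rows}) / len(rows)


def validity(rows, field, rule):
    """Fraction of populated values of `field` that fully match `rule`.

    Returns None when no value is populated, since there is nothing to test.
    """
    values = [row[field] for row in rows if row[field] is not None]
    if not values:
        return None
    return sum(bool(rule.fullmatch(v)) for v in values) / len(values)


email_completeness = completeness(records, "email")
id_uniqueness = uniqueness(records, "id")
email_validity = validity(records, "email", EMAIL_RULE)
```

    Each function returns a ratio between 0 and 1, so a requirement can be stated as a threshold on it (for instance "e-mail completeness of at least 0.95", a figure chosen here purely for illustration).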

    This is a field with a lot of established practice, industry groups, professional certifications, and even dedicated conferences. I don't purport to be an expert, but I know enough to defer to those who are.

  • Thu, Mar 6 2014 3:09 In reply to

    • Ken Evans

    Re: What is data quality?

    From pages 7-8 of Crosby's 1979 book "Quality is Free": 

    "The first struggle, and it is never over, is to overcome the 'conventional wisdom' regarding quality. In some mysterious way each new manager becomes imbued with this conventional wisdom. It says that quality means goodness; that it is unmeasurable; that error is inevitable...and that people just don't give a damn about doing good work. No matter what company they work for, or where they went to school, or where they were raised - they all believe something erroneous like this. But in real life, quality is something quite different. Quality is conformance to requirements; it is precisely measurable; error is not required to fulfill the laws of nature; and people work just as hard now as they ever did. ... What should be obvious from the outset is that people perform to the standards of their leaders."

    So Crosby's point is that the term "Quality" is a name for the relationship between two states of affairs: a desired state (the requirements) and a measured state (conforms or does not conform).

    Consider the assertion "quality must lie at the heart of all that the NHS does". This is in the first paragraph of the letter sent by Sir David Nicholson to many NHS managers on 24 February 2010 in the wake of the Robert Francis QC Mid-Staffs Inquiry Report (A report about why many hospital patients died unnecessarily). 

    So do you think that Sir David meant "conformance to requirements must lie at the heart of all that the NHS does"? Well, his letter talks about many things including "failings", "scrutiny" and an undefined something called "quality of care". So it seems to me that Sir David's letter uses the term "quality" in accordance with its "conventional wisdom" meaning.

    Clifford Heath:
    Non sequitur. If you don't state your requirements, you can't measure your quality. That's different from not having quality, or not having requirements. The whole point of Crosby's dictum is that different situations have different requirements, so there can be no single standard set of requirements for data quality.

    Following Crosby's philosophy your words "Not having quality" expand into "Not having conformance to requirements". So as I implied, Quality = f(requirements) and so No requirements (expressed using measurable metrics) = No quality.

    Now you mention the DMBOK. As I have previously told you, I have not read this book, but that does not mean that I don't understand quality. For example, I was involved in IBM's internal quality program in the 1970s and 1980s. Since then I have held several management positions where I followed (and fought over) Crosby's philosophy, and I have given many presentations about data fundamentals. You seem to be comparing the words in 25 pages of a book with the knowledge that I have gained in my 44 years of international experience in the IT world - most of which you don't know about.

    Lastly, "Requirements" are about the "What" not about the "How".  As I see it, a well specified requirement describes a desired state of affairs that can be measured. It has nothing to say about the methods or procedures that may be devised and used to conform to the requirement.

    Thus, in Crosby's philosophy, "quality" is a property of the methods that are used to conform to requirements. And, the possibility of devising appropriate methods is directly related to the way in which the requirements are expressed. If your requirements make careful use of metrics then you are on the right track. If you use ambiguous prose then you don't have a hope. (In this regard, I'll read the DMBOK and let you know what I find.)

     

    We can't define anything precisely. If we attempt to, we get into that paralysis of thought that comes to philosophers… one saying to the other: "you don't know what you are talking about!". The second one says: "what do you mean by talking? What do you mean by you? What do you mean by know?"
    Read more at http://www.notable-quotes.com/f/feynman_richard.html#7lKA1Rc6eAoL6I1q.9

     

  • Thu, Mar 6 2014 17:28 In reply to

    Re: What is data quality?

    Ken Evans:
    Clifford Heath:
    Non sequitur. If you don't state your requirements, you can't measure your quality. That's different from not having quality, or not having requirements.
    Following Crosby's philosophy your words "Not having quality" expand into "Not having conformance to requirements". So as I implied, Quality = f(requirements) and so No requirements (expressed using measurable metrics) = No quality.

    It's a very theoretical argument that does not accord with experience. Every business uses and consumes a lot of data which meets their needs, even though they haven't ever stated their requirements. That's quality in anyone's terms - so I disagree with Crosby. When business processes start to fail because the data was bad, they perceive the lack of quality and then may start to formulate their requirements more clearly. They did not start out implementing those business processes with data quality as a goal - quality is never the goal. It's always a means to some end. If the end is being reached, the idea of data quality never arises. This lack of perception does not imply non-existence, however. It's a philosophical difference, perhaps akin to Plato=rhetoric vs Aristotle=dialectic.

    Ken Evans:
    Now you mention the DMBOK. As I have previously told you, I have not read this book but that does not mean to say that I don't understand quality.

    I'm quite confident that the DMBOK will teach you nothing about quality. That's not its goal - nor is it to teach any of the other areas of subject matter.

    It also seems likely that you'll learn little or nothing about the organisational management of data quality, though data management is the primary subject matter of the book. However, the goal isn't really even to teach about data management, but to serve as a conceptual and linguistic framework for data professionals, so that when one refers to "master data management", another knows that's not the same thing as "reference data management", for example. I'm sure you would recognise this difference, even though you might not use the same words for them.

    The main goal of the DMBOK is to get all data professionals speaking the same language. If we want to speak to them of the benefits of ORM, we also must speak that language. That's why I have asked to be (and been accepted as) a reviewer of the data modeling section in the new DMBOK. I hope I'll be able to expand the sections on ORM and fact-based modeling. In a broader picture though, I'd also like to see (or to help build) an ORM model of the whole domain of data management, based on analysis of the DMBOK. It would be a huge benefit in building automated tools for data management, as well as bringing linguistic and promotional advantages.

  • Wed, Mar 12 2014 6:22 In reply to

    • Ken Evans

    Re: What is data quality?

     

    Clifford Heath:
    so I disagree with Crosby

    Crosby asserted that "Quality is conformance to requirements"
    So on what is your disagreement based?

    I'm sure that it is not your intention to assert that "Quality is non-conformance with requirements"

    So please explain the basis of your "disagreement"
    Do you have your own definition of quality?

    If so, how do you measure quality?

    Ken

     

  • Thu, Mar 13 2014 15:25 In reply to

    Re: What is data quality?

    Ken Evans:

    Crosby asserted that "Quality is conformance to requirements"
    So on what is your disagreement based?

    I disagree with the definition of "requirements" and "conformance". I would say rather that "quality is sufficient to meet the need". The need does not have to be expressed in formal requirements for quality to exist. Neither does quality have to be measured for it to exist.

    The measurement of quality can only be done after it is defined in formal requirements. However, the measurement is not the thing being measured, and the statements of requirement are the map, not the terrain.
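
    A minimal Python sketch of that distinction, under stated assumptions: requirements are modelled as predicates over a record, and the `conformance` function, the `orders` data and the two rules are all hypothetical names made up for the example. With no stated requirements the measurement comes back undefined (None); the data itself is unchanged either way.

```python
from typing import Callable, Optional, Sequence

# A requirement is modelled (as an assumption) as a predicate over one record.
Requirement = Callable[[dict], bool]


def conformance(rows: Sequence[dict],
                requirements: Sequence[Requirement]) -> Optional[float]:
    """Fraction of rows that satisfy every stated requirement.

    Returns None when there are no requirements or no rows: in that case
    there is nothing to measure against, so the result is undefined.
    """
    if not requirements or not rows:
        return None
    conforming = sum(all(req(row) for req in requirements) for row in rows)
    return conforming / len(rows)


# Hypothetical order records and rules, for illustration only.
orders = [
    {"qty": 2, "price": 9.5},
    {"qty": 0, "price": 4.0},
    {"qty": 1, "price": -1.0},
]
rules = [
    lambda row: row["qty"] > 0,      # quantity must be positive
    lambda row: row["price"] >= 0,   # price must not be negative
]

unmeasured = conformance(orders, [])    # no stated requirements
measured = conformance(orders, rules)   # one of three rows conforms
```

    Returning None rather than 0 or 1 is a deliberate choice in this sketch: it keeps "not measured" separate from both "conforms" and "does not conform".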

  • Fri, Mar 14 2014 11:27 In reply to

    • Ken Evans

    Re: What is data quality?

     

    Clifford Heath:
    quality is sufficient to meet the need

    Hmm, if you look up the principles of semiotics, you will find that words out of context just don't have any meaning (references: Saussure, Peirce and Sebeok, amongst others). In particular, written words are just character strings that don't have any intrinsic meaning. Of course people do interpret character strings as having meaning - but when they do this they are just using an arbitrary (and frequently ambiguous) social convention.

    Every word is just a social convention that has been invented by humans. Some of our words point to metaphysical things and others point to phenomena that existed long before humans came on the scene - we just gave names to physical phenomena such as gravity. However, words like "quality" don't point to things that have a physical existence - they are just concepts invented by humans.  

    Anyway, here is a way of decoding what you have said:


    I disagree with the definition of r and c. (r=requirements and c=conformance) - but you don't say why you disagree.
    Is it that you don't like the sound of the words or what? 

    Then you propose: 

    Clifford Heath:
    quality is sufficient to meet the need

    Which decodes as "q is sufficient to meet n" (q=quality and n=need) But you don't explain what q and n mean.

    Then you say:

    Clifford Heath:
    The need does not have to be expressed in formal requirements for quality to exist

    Which decodes into "n does not have to be expressed in f for q to exist."  (f=formal requirements).
    Embedded in this assertion is the assertion that "Quality exists." What do you mean by that assertion?

    Then you say: 

    Clifford Heath:
    Neither does quality have to be measured for it to exist

    Which decodes as: q does not have to be measured for it to exist.
    Which has the claim "Quality exists" embedded within it. (Just what is it here that "exists"?)

    Which brings us back to questions like "What is quality?"
    Which is a question of the general form "What is x?"

    And the typical answer to this type of question is "x is a name for y" where y is a character string of arbitrary length which usually invokes one or more social conventions.

    This is what Crosby was getting at when he said that many people use the word "quality" to invoke the social convention of "goodness". (Which is really a bit of disguised metaphysics.)

    When you think about it, all that is happening is that one mystery is being explained in terms of another mystery.

    That's why you need measurable metrics instead of ponderous prose.

    Ken

  • Sun, Mar 16 2014 16:40 In reply to

    Re: What is data quality?

    Ken, your entire response is filled with words whose meaning you didn't define. Why? Because you're using English, and the words have commonly accepted meanings. "One mystery is being explained in terms of another mystery" is all that human communication is even theoretically capable of. Nonetheless, we seem to be amused to indulge in it, because we like to believe there's an objective world beyond just our personal experience of it. Ponderous prose indeed... did you even re-read your own posting?

    I'm not interested in playing with words. You know what I meant, and you know I'm right - that most organisations do successfully rely on some data for which the quality requirements have never been defined. Perhaps they'd be better off after defining requirements, but that's not the point in question here. There's nothing metaphysical about it - the business functions (in reality, not in abstract), and absence or corruption of the data would prevent that. No need to invoke semiotics or metaphysics - or the rest of your ponderous prose - to see that can and does actually happen.

    You are claiming that an organisation that relies on data cannot function (meet its goals, which is the definition of quality here) without defining the quality requirements for that data. I call bunkum, whatever words you wrap around it. Many such organisations clearly can and do function like that.
