Is it possible to make ORM value types redundant?

Last post Thu, Dec 15 2011 19:07 by sessaid. 33 replies.

Page 2 of 3 (34 items)

Tue, Jan 4 2011 16:03 In reply to

Matthew Curland
Joined on Sat, Mar 8 2008
Posts 450

Re: Is it possible to make ORM value types redundant?

Wow, this goes beyond tangential into full-blown fragmentation. I'm tempted to withhold comment, but there are a few things that are screaming for a response.

Clifford Heath:
I realise that these are different encodings of the same value

Yes, they are different encodings of the same value. There is one underlying value. However, the existence of a single value does not mean that any of the following statements are true:

There is exactly one possible way to store that value. How a value is stored depends not only on encoding (ANSI, MBCS, UTF-8, UTF-16, ...), but also on low-level machine details such as big-endian vs. little-endian byte ordering, disk compression, disk paging, memory addressing options, and so on. Clearly, the value is not equivalent to any single machine representation of that value.
There is exactly one possible way to represent the value to all users. The simple case of cultural differences in decimal point interpretation makes the point obvious: in English-speaking cultures "1,123" means 'one thousand one hundred and twenty-three', whereas in other European cultures (French, German, etc.) the same characters mean 'one and one hundred twenty-three thousandths'. Just because a user enters a value in a given way--generally influenced by their cultural bias--does not mean that the user-entered representation of the value is the definitive meaning of the represented value. Clearly, the value is more than just the set of characters entered by the user.
Only one representation of the value can exist. Displaying a single value more than one time is clearly possible. The bag {20,10,20,10,20} contains the value 10 twice and the value 20 three times. This does not mean that there are three values that mean 'twenty' and two that mean 'ten'. It simply means that a value can be used more than once. In fact, when we talk about values, we are actually talking about an instance of the value, not the value itself. Perhaps this is the fundamental issue: representations of values can be parsed (culture-dependent), stored (machine and encoding dependent), and rendered (including options such as culture, font, word wrapping, and other issues completely irrelevant to the meaning of the value), but we never actually have the underlying value.
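
A minimal Python sketch of the culture-dependent parsing point, under stated assumptions: `parse_decimal` and its two separator parameters are hypothetical names invented for illustration, and the "European" convention here is simplified to a comma decimal separator with a period for grouping. The same characters yield two different values depending on the convention supplied.

```python
from decimal import Decimal


def parse_decimal(text: str, decimal_sep: str, group_sep: str) -> Decimal:
    """Interpret `text` under an explicitly supplied (hypothetical) convention."""
    # Drop grouping separators first, then normalize the decimal separator to "."
    normalized = text.replace(group_sep, "").replace(decimal_sep, ".")
    return Decimal(normalized)


# Same characters, two conventions, two different values
en_value = parse_decimal("1,123", decimal_sep=".", group_sep=",")
eu_value = parse_decimal("1,123", decimal_sep=",", group_sep=".")

print(en_value)  # 1123
print(eu_value)  # 1.123
```

The character sequence alone does not determine the value; the convention has to be carried alongside it.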

If someone can give me the location of the actual number ten I'd be happy to reference it. Of course, this is not possible. We all understand what ten is, and this shared understanding is what gives meaning to the value--not a specific rendering of the value, machine encoding, or audible pronunciation. To quote Oscar Hammerstein: "How do you catch a cloud and pin it down?". We can only reference and render instances of a value, not the value itself.

Clifford Heath:
Aren't Fahrenheit and Celsius also just different encodings for temperature?

Absolutely not. Fahrenheit and Celsius are different units that provide a context for a value used to represent temperature. Temperature itself is not encoded, only the value used within a domain to represent a given temperature. You're playing fast and loose with the term encoding. A Temperature instance is no more equivalent to its value than a name-identified Person is equivalent to the identifying name.

Clifford Heath:
[V]alue types "have an agreed written form"

Unfortunately they don't. We can't even agree on how to write floating point numbers. Values have agreed upon written forms within a domain and culture only. 'Written forms' correspond to either user-provided data (generally entered on a computer, not in written forms) or a machine-provided rendering of the value. Conflating data and formatting into a notion of value is an extreme oversimplification of the problem.

Clifford Heath:
Matt wants to model this using data types (abstract types each having potentially multiple representations

I'm not quite sure what 'this' refers to here, but I am sure that this paragraph is not a representative overview of my inclinations in modeling data types. This discussion requires significantly more context that has not been provided here. There are many reasons to model data types and value types separately that have absolutely nothing to do with representation. For example, data types can represent infinite sets, whereas value types are always finite. This makes a huge difference when it comes to logical evaluation of these systems. There are reasons that other systems separate data type and value (SQL: datatype/column, OOP: datatype/field, XSD: simpletype/attribute or element value, etc.). From an ORM perspective, Person has Name is conceptually meaningful but Person has String is not, which is a pretty good indication that ValueType and DataType are not conceptually interchangeable.

-Matt

Tue, Jan 4 2011 16:18 In reply to

Re: Is it possible to make ORM value types redundant?

Andy,

I agree with your principle for deciding what is a value and what is an entity - you make exactly the same point I did, and which is the reason that CQL uses the phrase "is written as" to introduce new value types. It still leaves a grey area though - most people would consider that you can write a Height for example, not just the number which indicates that Height.

However, that rule doesn't say anything about what value is encoded by a written representation, so your following argument is a non-sequitur. All literal values are subject to some encoding, and for the majority it's not the case that the value redisplayed upon entry will be the same character string we entered. Very often the value will be redisplayed in canonical form, yet we recognise it as the same value. The examples I gave go beyond that, because in addition to rules for canonical representation, I assume a transfer encoding (exponential notation for 1E3, hexadecimal escapes for "hello, world"). The point is that there may be more than one valid representation of a single value.

Which value is indicated by a representation depends on the context of the representation. It's just not valid to say definitively that 1000 <> 1E3 in all contexts, as you so confidently assert.
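
As a small illustration of that point, here is a Python sketch using the standard `decimal` module and ordinary string escapes. The choice of `Decimal` as the interpreting context is an assumption made for the example, not something prescribed by ORM or CQL: under it, 1000 and 1E3 compare equal as values while their written forms differ.

```python
from decimal import Decimal

a = Decimal("1000")
b = Decimal("1E3")

# Same value under a numeric interpretation...
print(a == b)            # True
# ...but different written (canonical string) forms
print(str(a) == str(b))  # False

# A string literal with hexadecimal escapes denotes the same string value
plain = "hello, world"
escaped = "\x68\x65\x6c\x6c\x6f, world"
print(plain == escaped)  # True
```

Compared purely as character sequences, "1000" and "1E3" are of course different; which comparison applies is exactly the question of context.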

Tue, Jan 4 2011 16:27 In reply to

Matthew Curland
Joined on Sat, Mar 8 2008
Posts 450

Re: Is it possible to make ORM value types redundant?

Hi Andy,

Thanks for a good laugh on the dog comment.

I think you know I don't agree with the value corresponding to the form used to enter it. A user-entered value is useful only when the data type of the character representation is known. Without knowing the data type, you might have a conceptual model that can be interpreted given some external context (meaning the model will have an affinity to the culture it was entered in), but you do not have a model that can be machine-interpreted or executed in any meaningful way.

I don't think there is much point in trying to determine if two different encodings of the same string are actually the same string. These are simply renderings of different machine-readable representations of the same value. If you distinguish the value from its user and machine representations then this all becomes a moot point, because the modeler sees an understandable representation of the value that corresponds to a machine representation that the modeler does not interact with.

So, you can unambiguously enter the integer one thousand if the data type for the value is known and the user-entered data can be unambiguously translated by a machine into the integer 1000. The character sequence is used for convenience of representation; it is not the underlying value.

Please refer to other comments in my direct response to Clifford.

-Matt

Tue, Jan 4 2011 16:32 In reply to

Re: Is it possible to make ORM value types redundant?

Matthew Curland:
However, the existence of a single value does not mean that any of the following statements are true

Of course not. Storage representation is just another representation. I'm glad you agree with me on that, it was the point I was trying to make. No representation "is" the value, as Andy seems to think.

Matthew Curland:
Clifford Heath:
[V]alue types "have an agreed written form"

Unfortunately they don't. We can't even agree on how to write floating point numbers.

Don't be silly. Of course we can agree. That doesn't mean that we do agree, or that we can expect or require others to agree. As I said, agreement depends on context; I'm not referring to any kind of universal agreement.

Matthew Curland:
There are many reasons to model data types and value types separately

Yes. But I'm not convinced that those reasons go to the conceptual semantics.

Matthew Curland:
Person has Name is conceptually meaningful but Person has String is not, which is a pretty good indication that ValueType and DataType are not conceptually interchangeable.

Mainly because String and Person come from different vocabularies, different domains of discourse. String belongs to the domain of writing, it carries connotations of writing systems, whereas Name belongs to the domain about which we intend to write. The domain of writing can also be modeled using ORM, and doesn't need any additional constructs. Just because there is an infinity of Strings that we can write doesn't make them fundamentally different from the non-infinite number of Names in our subject domain.

Tue, Jan 4 2011 17:32 In reply to

Andy Carver
Joined on Fri, Apr 25 2008
Colorado, U.S.A.
Posts 107

Re: Is it possible to make ORM value types redundant?

Hi Clifford,

Your points are all well taken -- but only if we're assuming a computer implementation of the model. At the conceptual level, we're never assuming that. So, if (as I assume) this discussion is about ORM -- which is a CONCEPTUAL modeling language -- then I think my point still stands, notwithstanding that your comments are valid when it comes time to implement (where we do, of course, have to assume some particular mapping from a character set to a binary encoding). Please comment (and Matt, I'd love to hear from you on that).

Regards,

Andy

P.S. Matt, thanks for your comments. I intend to read your replies completely -- including the ones included in your replies to Clifford -- but I'm sitting at a Kinko's right now, as it happens, paying by the minute :-)

Wed, Jan 5 2011 15:42 In reply to

Matthew Curland
Joined on Sat, Mar 8 2008
Posts 450

Re: Is it possible to make ORM value types redundant?

Clifford Heath:
No representation "is" the value

Agreed

Clifford Heath:
I'm not referring to any kind of universal agreement

I am referring to a universal agreement. There is a universal agreement on what is meant by a floating point number, but there is no universal agreement on how that value is displayed to the user. A modeling tool needs to be able to capture the meaning from a string in a specific data type and then render the meaning in a different context where the original string no longer represents the captured value. While agreement on what is meant by a given term in a model obviously depends on the domain of the model, we should not need to define any additional context or have any special agreement to determine the meaning of numeric data. As a tool vendor, I cannot realistically ask my users in France to change their culture settings so that they can read models entered on an en-US system.

Clifford Heath:
String belongs to the domain of writing, it carries connotations of writing systems, whereas Name belongs to the domain about which we intend to write.

While this is true, it is skirting the issue, which is that both Name and String are participating in the same model. Each Name value is pulled from the set of all String values, but String and Name have different conceptual classifications. Given the context of a populated domain, I can claim that the population of Name represents all known names in the domain. Additional names can be asserted, but they cannot be assumed to exist. I can also say that I know the population of String values used in the domain (Names, address parts, etc). However, it is clearly not true that the population of String in the domain is equivalent to the set of all known String values.

Clifford Heath:
Just because there is an infinity of Strings that we can write doesn't make them fundamentally different from the non-infinite number of Names in our subject domain.

There are a large number of theoretical differences between finite and infinite sets. Many Model Theory (MT) theorems fail for finite systems, and not all Finite Model Theory (FMT) conclusions are valid in infinite systems. From a theoretical perspective these differences are extremely important and cannot be dismissed as being fundamentally unimportant. The population of a ValueType (Name) must be asserted and is assumed to be complete at any point in time (only values asserted into a domain exist in that domain). By contrast, the population of String is always known to be incomplete, is never asserted, and is recursively defined. Given these differences, why would we assume that DataType and ValueType are collapsible in a metamodel of the system?
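
To make the contrast concrete, here is a rough Python sketch under stated assumptions: the names in `name_population` and the two-letter alphabet are made up for the example, and the generator is only one way to picture a rule-defined, unbounded String data type. The value type's population is a closed, asserted set, while the data type can only ever be sampled.

```python
from itertools import count, islice, product

# Asserted population of the value type Name: finite and complete at this moment
name_population = {"Clifford", "Matt", "Andy"}


def all_strings(alphabet: str):
    """Enumerate every string over `alphabet`, shortest first; never terminates."""
    for length in count(0):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)


# Membership in the value type is decided by what has been asserted
print("Terry" in name_population)  # False

# The data type is defined by a rule, so we can only take a finite prefix of it
first_seven = list(islice(all_strings("ab"), 7))
print(first_seven)  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```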

-Matt

Wed, Jan 5 2011 15:48 In reply to

Matthew Curland
Joined on Sat, Mar 8 2008
Posts 450

Re: Is it possible to make ORM value types redundant?

Hi Andy,

The issue here is that the type of a value still has meaning whether the model is implemented with a computer or not. The difference is that the user takes on some of the computer roles, such as parsing strings into numbers. The type of the data (whole number, decimal number, string, etc) is still extremely important to the modeler even if the model is not being machine implemented. A business user simply does not consider "10.63" in a Price value to be a String. The user considers this to be a number, and I consider specifying types for values in the system to be part of a complete conceptual model.

-Matt

Wed, Jan 5 2011 20:14 In reply to

Re: Is it possible to make ORM value types redundant?

Matthew Curland:

Clifford Heath:
I'm not referring to any kind of universal agreement

I am referring to a universal agreement.

I suggest that instead of hoping for universal agreement, you merely need a set of representations needed for your tool, and a set for interchange purposes, nothing more.

Matthew Curland:
There is a universal agreement on what is meant by a floating point number

And how to convert it to decimal digits and back - that's part of the IEEE standard. The remaining issues are tool and interchange issues, not conceptual modeling ones.

Matthew Curland:

Clifford Heath:
String belongs to the domain of writing, it carries connotations of writing systems, whereas Name belongs to the domain about which we intend to write.

While this is true, it is skirting the issue, which is that both Name and String are participating in the same model.

Not skirting the issue, merely separating concerns into application concepts and tool concepts. Note that I'm not saying we don't need to model data types - I'm just saying that this model is a tool model and/or an interchange model. The details aren't directly relevant to the application domain.

Matthew Curland:

Clifford Heath:
Just because there is an infinity of Strings that we can write doesn't make them fundamentally different from the non-infinite number of Names in our subject domain.

There are a large number of theoretical differences between finite and infinite sets.

I do appreciate that, and I don't think that my current short-cut of collapsing DTs and VTs together (which is different from NORMA's short-cut) is sustainable, any more than NORMA's is. I just don't think we need to pollute the core ORM model with what I consider to be, for the most part, tool and interchange concerns. The existence of value comparability and convertibility matter; the details don't.

Sat, Jan 8 2011 11:59 In reply to

Andy Carver
Joined on Fri, Apr 25 2008
Colorado, U.S.A.
Posts 107

Re: Is it possible to make ORM value types redundant?

Dear Matt,

Now that I've gotten back to an Internet connection that I don't have to pay for by the minute, and have had a chance to read your very worthy replies carefully, I am sorry I missed some of the comments and points you made. I have some degree of basic agreement with them, but I don't draw (leap to?) the same conclusions you do (about data types as allegedly essential to conceptual models). Let me redress my unfortunate absence somewhat:

If I may, let me take this paragraph as presenting the nub of your response:

-- "I think you know I don't agree with the value corresponding to the form used to enter it. A user-entered value is useful only when the data type of the character representation is known. Without knowing the data type, you might have a conceptual model that can be interpreted given some external context (meaning the model will have an affinity to the culture it was entered in), but you do not have a model that can be machine-interpreted or executed in any meaningful way."

Your first two sentences here present the position which the rest of the paragraph seeks to establish. But I don't think the rest of the paragraph actually establishes it. For one thing, I take it as a false premise that a conceptual model or conceptual database must (for some reason) be capable of being "machine-interpreted". Do you have a computer so advanced that it's capable of interpreting fact-sentences from a database? I've never seen one, nor do I know why having one would be an advantage for database users... And, you seem to be begging the question as to whether we can have a reasonable database that is not computerized...!

I hope this suffices as a reasonable response for the present... actually I wrote a more detailed response that then got accidentally deleted (^%**&^T% these superfluous buttons on my keyboard!), and I didn't have the heart to re-create the whole thing :-( But the above indicates the main point I wanted to make... Hope it helps,

Regards,

Andy

Sat, Jan 8 2011 22:41 In reply to

Andy Carver
Joined on Fri, Apr 25 2008
Colorado, U.S.A.
Posts 107

Re: Is it possible to make ORM value types redundant?

Dear Matt,

One other statement, in your last sentence, needs comment. Here is your last sentence again:

"Without knowing the data type, you might have a conceptual model that can be interpreted given some external context (meaning the model will have an affinity to the culture it was entered in), but you do not have a model that can be machine-interpreted or executed in any meaningful way."

I cannot follow you, not only in your complaint about machines being unable to interpret (which has always, everywhere been the case, unless you want to say that the Semantic Web will overcome this problem), but also when you claim that "you do not have a model that can be ... [machine?-] executed in any meaningful way". Hmmm! A slight overstatement, no? If you have to know and specify to the computer, what data type you're using, in order to get any execution "in any meaningful way", how is it that there are whole programming languages that work just fine without data types?

Fondly yours,

Andy

Sat, Jan 15 2011 20:28 In reply to

mnnoon
Joined on Wed, Apr 16 2008
Lawndale, CA
Posts 60

Re: Is it possible to make ORM value types redundant?

Hi Andy,

It's good to hear from real experts on this. I asked this myself when I first started using ORM: what is the difference between a value type and an entity? I think the tricky part is looking at the big picture, which took me several years and which I continue to struggle with. I think you mentioned in one of your lectures that it depends on the domain, which really helped me understand it better. You can start from a value type in one domain, like names of people (Joe, Bob, George Washington, etc.), but then switch from a value type to an entity and, using someone else's mental model, create a brand new domain. So George Washington can be a value type in one person's model and an entity in someone else's model or Universe of Discourse. The tricky part is clearly distinguishing the demarcation zones between domains and domain experts: something that says "OK, we are now using someone else's model, or a different schema or Weltanschauung."

Also, for a given universe of discourse, I think that demarcation zone is expressed in the fact types themselves as the underlying meaning, not necessarily in the entity or the value type. For instance, any entity like Person can have Name() defined as an nvarchar(50), and I can use it repeatedly in many different fact types, but it will have many different meanings depending on how the fact type combines it with different entities or even the same entity. A perfect example is that a Person can have a first Name() and a Person can have a last Name().

Sat, Jan 15 2011 21:10 In reply to

Re: Is it possible to make ORM value types redundant?

mnnoon:
For instance, any entity like Person can have Name() defined as an nvarchar(50), and I can use it repeatedly in many different fact types, but it will have many different meanings depending on how the fact type combines it with different entities or even the same entity. A perfect example is that a Person can have a first Name() and a Person can have a last Name().

This case is arguably correct (for example, my given name also occurs as a family name, and vice versa), but you cannot extend this usage of Name to say Company() has Name(), because a Company Name is not the same value type as a Person's Name. The strings may be the same (i.e. the data type instance may be equal), but company names are a different conceptual object type from personal names. It would not be logical, for example, to expect to ask "what company has the same name as this person" - or rather, when you do that, you're explicitly invoking a cross-domain join; there's an implicit type conversion between two distinct types. A parent who named their child after the company that employs them would be seen as rather weird.

Just because the data types of two value types are the same or comparable does not make them the same value type.
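
One way to picture this is the Python sketch below. It is an analogy, not an ORM or CQL construct: the class names `PersonName` and `CompanyName` and the sample string are hypothetical. Both types wrap the same data type (a string), yet instances of different types do not compare equal unless you explicitly drop down to the underlying strings, which is the cross-domain join.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    text: str  # underlying data type: string


@dataclass(frozen=True)
class CompanyName:
    text: str  # same underlying data type, different conceptual type


p = PersonName("Morgan")
c = CompanyName("Morgan")

# Different value types: not equal, even though the strings match
print(p == c)            # False
# Explicit cross-domain comparison on the shared data type
print(p.text == c.text)  # True
```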

I must confess, in some respects this distinction was lost on me in some of my earlier work and in addition, in some of my models I deliberately ignore it in order to invite such cross-domain joins.

Fri, Jan 21 2011 18:56 In reply to

mnnoon
Joined on Wed, Apr 16 2008
Lawndale, CA
Posts 60

Re: Is it possible to make ORM value types redundant?

This is one of those situations where I need a logical proof. But I think I can understand your logic. Are you saying this is a fundamental theorem in object role modeling, or is it just bad practice to reuse value types between entities?

Sat, Jan 22 2011 12:59 In reply to

Re: Is it possible to make ORM value types redundant?

Logical proof (that a model is correct or optimal) is in principle not even possible. It's only a model, after all - either it fits the real world well enough, or it doesn't. It's up to you to decide what well enough means. That said, some principles apply.

The essence of type theory requires that any individual member of a type can play all roles of that type. Can each possible Name be a person's Given Name? or Family Name? Could each Name be a Company Name? In principle, maybe - there are no laws about these things, at least where I live. However, you'd think that a mother was pretty strange who called her newborn child "Megasoft Holdings Limited"... or a company that called itself "Michael". So at least we can say that there are more types here than just the one generic Name type.

Further, although both Clifford and Heath can and do occur as both a Family Name and as a Given Name, some Names only ever occur as one of those. For example, "van der Graaf" is only ever a family name. If all members of type A can play all roles of type B, but not vice versa, you have a subtype (A is a subtype of B). It's possible that any Given Name might occur as a Family Name, at least in some cultures. So in this case, you might argue for a subtype relationship between Given Name and Family Name. I would suggest however that Given Name is not a direct subtype of Family Name, rather that both derive from some common super-type, like Personal Name for example. The role involved here is "Person is called Personal Name", without being specific about whether that Personal Name is a Given Name or a Family Name. Pragmatically however, unless you had a need for a supertype role such as this, you wouldn't bother with subtyping at all; just create two separate types. Or just one type, if you need to ask a question where the context allows either. That's the beauty of modeling - you get to decide which aspects of the real world matter to you in this case. It's only a model, it can be accurate without being fully detailed. It can gloss over unimportant aspects of the real world.

If you anticipate the need to ask the question "which Names serve as both a Company Name and as a Given Name?", and that kind of question is important enough in your model, then make the Name supertype. If you also care about the fact that there are Company Names which cannot be Given Names, or vice versa, you will need the subtypes. If not, use a single type.

But don't get confused into thinking that just because two values are comparable, they have related types. The Temperature in degrees Celsius and a Height in millimeters are both numbers, and so are comparable in mathematics, but the comparison is meaningless. The two types are really not related except in mathematics. You aren't modeling mathematics however, but some other domain, so use separate value types.

I hope that helps.

Sat, Jan 22 2011 17:16 In reply to

Terry Halpin
Joined on Fri, Nov 23 2007
Maleny, Australia
Posts 154

Re: Is it possible to make ORM value types redundant?

In the way I formalized ORM, top level entity types are mutually exclusive, but I allowed value types to have implicit supertypes, which I now regard to be based on the datatypes from which they draw their values. So I have no problem with equating an instance of FamilyName (e.g. "Darwin") with an instance of CityName. To me it's fine to ask "Whose family name is also a city name?".

Regarding Temperature and Height, these are different, mutually exclusive entity types, not numbers or even numerals, and cannot be compared. However if you wish to compare an instance of the value types CelsiusValue and HeightValue which both draw their values from compatible datatypes (e.g. "RealNumber"), then I have no problem with that.
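
As a rough Python sketch of the question "Whose family name is also a city name?", assuming a made-up population: the person identifiers and most of the names below are hypothetical, and plain strings stand in for the shared datatype from which both value types draw their values.

```python
# Hypothetical populations; both value types draw from the same datatype (string)
family_name_of = {
    "person 1": "Darwin",
    "person 2": "Halpin",
    "person 3": "Heath",
}
city_names = {"Darwin", "Maleny", "Perth"}

# Compare FamilyName and CityName instances via their common datatype
matches = sorted(
    person for person, name in family_name_of.items() if name in city_names
)
print(matches)  # ['person 1']
```

The comparison is made on the value types, never on entity types such as Temperature or Height themselves.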

Cheers

Terry

The ORM Foundation