in

The ORM Foundation

Get the facts!

Missing Relational View Table?

Last post 11-29-2012 17:30 by Anonymous. 8 replies.
Page 1 of 1 (9 items)
Sort Posts: Previous Next
  • 11-21-2012 12:06

    Missing Relational View Table?

    I've been using NORMA_VS2008_2012-04CTP Build 1487 to model university class meeting patterns for a physics education research database. The attached, zipped .orm and .docx files show the tables generated in the relational view. It seems that there may be a missing table that should be called Campus. The verbalization seems to imply that it should be there. The Building, CampusName and CampusIdentifier tables all have entries (i.e., CampusNr) that, while not specifically identified as such, could possibly be identified as foreign keys. Am I wrong about this, or is something actually missing?

    :-)

    Richard

  • 11-21-2012 13:23 In reply to

    Re: Missing Relational View Table?

    Campus plays no functional roles (single-column uniqueness constraint on a role played by Campus) outside of its identifier and no mandatory roles at all. You won't get a table in this case unless you mark Campus as IsIndepdent.

    There are  alot of ways to improve you model. I think you are way overusing a .Nr reference mode. In fact, I don't think there is any use of .Nr here that is worthwhile. For example, room has a number, but it cannot be the number displayed beside the room because it must be unique for all rooms. A room is generally identified by its number and the building it is located it, which would mean using a preferred external uniqueness constraint from the Building role of 'Room is located in Building' and the RoomNumber role of 'Room has RoomNumber'.

    Once you've established these conceptual identifiers, you may want to add surrogate keys using a .id reference mode, but get your model conceptually correct first.

    General, something called 'identifier' will be one-to-one with the thing it identifies. Currently you have one many-to-one identifier (for room) and two one-to-many identifiers (for campus and building).

    There are other oddities as well, such as BuildingName applying to at most one building. This means that you can't have a 'Student Union' or 'Physics Hall' on more than one campus.

    The bottom line here is that your relational model doesn't look like what you're expecting because the constraints in the conceptual model don't match the domain you're modelling. I'd recommend going over all of the verbalizations--especially those pertaining to your reference modes and internal uniqueness constraints. Also verif with some sample populations, then worry about the relational model.

    -Matt

  • 11-23-2012 16:21 In reply to

    Re: Missing Relational View Table?

    Matt,

     

    Thanks for the critique of my model. The advanced database design course I took from Gordon Everest in the spring of 2010 introduced me to ORM and NORMA. I was instantly sold on it because it filled a nagging discomfort with diving directly into an ER model. ORM provided  the ability to verbally express the physical situations I am trying to model. Now, I am faced with a real project with the Physics Education Research Group at the University of Minnesota. There has also been some interest by certain members of the University of St. Thomas physics department. This leaves me with a dilemma since I am venturing into new territory with limited experience. Since you have probably seen model fragments for many of these situations, any advice to improve the model is appreciated.

     

    Now, let’s turn to the model. I agree with your observation about the overuse of the .nr reference mode. When I first thought about the appropriate reference mode for Room, Building and Campus objects, I ran into several dilemma. For the Building object, .name reference mode didn’t fit because in the real world, building names change. For example, the UofM Electrical Engineering/Computer Science building was renamed a few years ago to Nils Hasslemo Hall and the Physics building is also known as the Tate Laboratory of Physics. A similar problem exists with the .id reference mode. While the first corresponding building identifier changed from EE/CSci to NHH, the second corresponding identifier has always been Phys. This still leaves the problem of selecting an appropriate reference mode that satisfies the requirement for a 1:1 association. (In this context, I use the word “association” rather than “relationship” because the latter might be confusing since it is also use in ER modeling with a different connotation.)

     

    Since, there is a 1:n association between the actual building and its name and/or identifier values and I wanted to provide for their values in the model, I introduced the value objects  RoomIdentifier, BuildingName, BuildingIdentifier, CampusName and CampusIdentifier. The distinction here is that these are simply values that are associated with one or more of the objects with which they are associated. I’m not sure whether this makes sense – perhaps an example will illustrate the distinction I am trying to convey. Assume we have an ORM object called Room(.id). The .id values referred to here have a 1:1 association with a room. They might be some identifier values from a facilities management organization or they might be assigned by a numeric auto counter. There is also a RoomIdentifier value object with values that are the room numbers appearing outside the door to a room. These values can be in a n:1 association.

     

    As you mentioned, using a preferred external uniqueness constraint makes sense between the Building role of 'Room is located in Building' and the RoomNumber role of 'Room has RoomNumber'. I need to think about a similar constraint between Building and Campus, or perhaps all three. I’m not sure since the internal uniqueness constraints are different.

     

    As a newbie to ORM, you might have guessed that I am struggling, so I appreciate any help you can provide. I will be refocusing on the verbalizations and internal uniqueness constraints rather than the resulting relational model created by NORMA.  One question – any advice on additional reading (books, online sources, etc.) about ORM and modeling in general – beyond the Big Brown Book. I don’t like to reinvent the wheel and I’m always looking for good examples  ORM fragments that may have addressed some of the issues related the model I am working on. I suspect that these are issues that have already been addressed by other modelers.

     

    :-)

    Richard 

  • 11-24-2012 4:55 In reply to

    Re: Missing Relational View Table?

    Hello Richard,

    Here is my 2 cents worth which I hope will be of help.

    I'll pick up on what Matt said "get your model conceptually correct first" and use room number as an example,

    It is true that all rooms in the world are unique. Thus, some people might argue that this means that using a GUID or an integer to "identify" a room is OK. However, it is not the case that a domain expert would recognise a link between the GUID or the integer and the room because these are not "conceptual " identifiers." 

    One way of finding a conceptual identifier for a room is to ask some one who uses the room (aka a domain expert) to make statements about the room and its location.  Thus, someone might say, "The room on the left at the top of the stairs to the first floor in the Science building. " They might also say "Room MB1023" or room "BL1107."

    Now it is the case that rooms MB1023 and BL1107 do exist. However the room numbers are what might be called "local" names because they refer to rooms on the campus of the University of Lincoln in the UK. Room MB1023 is in the "Main Building on the first floor" and Room BL1107  is in the building with the name "Business and Law"

    Now suppose you met someone by chance whilst you were a passenger on a plane and in the conversation the person said, "My most recent lesson was in room MB1023" .  Being curious, you might ask, "Where is that?" and you might get the response "It's on the first flloor of the Main Building." You might follow up with "Which Main Building is that?" and so on.

    Eventually you would end up with a UK postal address and a description of the layout of the buildings and rooms on one of the campuses of the University of Lincoln.

    From this, you might get the following fact types:

    Room(.code) is in Building(.name) on Campus(.name) of University(.name)

    University(.name) is at Address(.nr)

    Address(.nr) is in Country(.code)  and so on...

    This is an example of what Matt meant by "get your model conceptually correct first".

    Ken

    Filed under:
  • 11-24-2012 13:12 In reply to

    Re: Missing Relational View Table?

    Ken,

     Aha - Insight! I think that I am beginning to see a glimmer of light in what you and Matt are emphasizing. Your examples are most useful; although, I need to ponder them more. Please correct me if I'm wrong, but the first thought is that when considering reference modes, I should be thinking about using ones that make sense in the real world - Rooms have numbers that might be thought of as codes, Buildings are most often referred to by their names, as are Campuses and Institutions. The next thought is that, while I have considered internal uniqueness constraints, I may need to reevaluate them and, moreso, pay attention to external uniqueness constraints. Furthermore, and this is something that has received almost no attention, I need to look at mandatory role constraints.

    Thanks to both you and Matt. That said, I pose the same question to you that I asked of Matt – any advice on additional reading (books, online sources, etc.) about modeling in general and/or ORM – beyond the Big Brown Book?

     :-)

    Richard

  • 11-24-2012 14:30 In reply to

    Re: Missing Relational View Table?

    Hi Richard,

    I'm glad that you found the examples to be helpful.

    Regarding reference modes. As a matter of principle, I use "real world" reference modes wherever possible.
    So in my earlier example, if you stand outside room MB1023 you will see a notice on the door that says "MB1023".
    However, I have used ".nr" for address because in the "real world" there isn't a  single column reference for each address so I made one up.

    One other thing that I should have pointed out is that what we call an "Atomic" or "Elementary" fact type is one that cannot have a smaller number of object types without loosing information. (i.e. the arity of a fact type has semantic significance.)

    So, in my example, one of the fact types has an arity of 4. (Has four "holes" for object types)

    However, as an object-role modeler you need to be very mindful of the scope of the Universe Of Discourse (UoD) because the scope is defined by the context of the model. For example, if you have been asked to make a model for just one university, then

    this arity 4 fact type: Room(.code) is in Building(.name) on Campus(.name) of University(.name)

    reduces to this arity 3 fact type: Room(.code) is in Building(.name) on Campus(.name)

    Why is that?

    Well it is because the property of "atomicity" is related to the scope of the UoD.

    If your model is for just one university, then the object-type "University(.name)" is not needed because all of the fact types in your model will refer to the same university - which is implicit in the predefined scope and thus is redundant.

    Regarding additional reading: Firstly, I recommend that you thoroughly go through the BBB before looking at other stuff. You might also look up some of the papers that are referenced in the "ORM Events" section of the Library.  

    Over the years, I have read and accumulated dozens of books on data modeling but the term "modeling in general" - probably includes thousands of books and many more scientific papers. 

    Probably the easiest way to find some of them is to go to Amazon and search for "Data modeling".

    My suggestion is that you pose a more precise question and I'll see what I can come up with.

    And when you are thinking about your "more precise question", may I suggest that you begin by asking yourself the question "What problem am I trying to solve."  (Hint: A problem is something that prevents you from achieving an objective.)

    i.e. Where are you trying to get to and what's in the way? 

    Ken  

  • 11-28-2012 20:17 In reply to

    Re: Missing Relational View Table?

    Ken,

    Good points.
    Let me give you a little background on this project. As a semi-retired computer/communications
    system engineer with a longstanding interest in education and in physics, I
    became interested in the work being done by the physics education research
    group at the University of Minnesota.

    One aspect of our research is the measurement of conceptual understanding of physics
    principles by students taking introductory physics courses. To that end, we administer
    tests (e.g., the Force Concept Inventory, etc.) and surveys at the beginning
    and end of each academic term. We have data that goes back to the mid-1990s; however,
    it is not well organized (e.g., in flat files and spreadsheets).

    I am trying to create a database that will organize this data, along with data related to
    courses, classes, people, grades, pedagogy, textbooks and anything else that we
    can think of which might affect student understanding. Currently, I have two
    objectives for the database - first, to be able to generate reports for
    distribution to faculty and second, to be able to extract information that
    might provide insight into how we can improve the education process. The first
    objective is easier to achieve because reports can be generated
    algorithmically. The second objective is more problematic because it is
    difficult to anticipate what questions might be asked of the data. In addition,
    the database design must be flexible enough to add new types of data that might
    be of interest. That is why it is important to get a good conceptual model of
    the UoD from the beginning.

    What you have seen is only a small fragment of the total model. However, it is one that
    has caused me a lot of problems because I have found several similar, yet
    different, ways of looking at the UoD when there are constraints on when a
    class is held and where it is held. This, in turn, affects how the conceptual model
    is developed. I've been working on this for a while and have already taken some
    of your recommendations into consideration. For example, I have limited the UoD
    to a single academic institution which, in our case, physically spans more than
    one campus. Now, I am wondering if that decision was correct since other
    institutions have expressed some interest in this project.

    Actually, I started out by modeling as:

    Room(.code) is in Building(.name) on Campus(.name)

    However, the choice of reference modes seemed inconsistent with the real world situation. Following
    your example, when I stand outside my office, I see a notice on the door that
    says "161B". However, there is a finite, albeit extremely small, possibility
    that this notice can change for whatever reason, even though the room itself
    does not change. Another problem with higher probability is that the same
    notice may actually appear on doors to rooms in different buildings. Similar
    arguments can be made for Campus(.name), Campus(.id), even University(.name). A
    few years ago, we saw a name change for a state university from Mankato State
    University to Minnesota State University - Mankato.

    The problem of changing names and identifiers is especially true for buildings. We have a situation in the real world that we
    are trying to accurately and consistently represent in out conceptual model.
    There exists a building (or room, or campus, etc.) That building may have one
    or more names. It also has one or more codes/identifiers that appear in various
    documents, such as class schedules, maps, etc. to represent the building. For
    example:

    Building    Name                                                           Identifier

    1               Electrical Engineering/Computer Science   EECsci

    1               Nils Hasselmoe Hall                                    NHH 

    2               Physics                                                        Phys

    2               Tate Laboratory of Physics                          Phys

    Consequently, I began searching for a way to model the distinction between an actual room and
    the notice on its door. One possibility was to use an identifier (i.e., .id).
    Previous work (before the year 2000) indicated that our facilities management
    organization had assigned identifiers to rooms (not the same as the notices on
    the door), buildings and campuses within the University of Minnesota system. While
    I was tempted to use these assignments, after careful reflection, I realized
    that this approach may not be generalized since those in other institutions may
    not have access to the equivalent information. The only alternative I could
    conceive was to use a number (.nr) or system generated identifier (.id) to
    assure uniqueness and to draw the distinction.

    In essence, this is the problem I am trying to model with this fragment. Any suggestions
    would be appreciated. I think at this point that we are discussing more of an
    ORM issue than a NORMA issue.

    Richard



  • 11-28-2012 23:44 In reply to

    Re: Missing Relational View Table?

    Richard,

    Hope this helps.

    Ken

    Data in flat files and spreadsheets
    Even though it is not immediately apparent, such data represents propositions that can be stated in language.  For example:

    Student (.nr) achieved Score ( ) in Test (.code) on Date ()

    So your first task is to study the flat files and spreadsheets and extract the fact types. Then you can put the fact types into NORMA using the CSDP.

    Student Understanding
    I’m having a debate about similar matters in a discussion forum hosted by Professor Dan McFarland at Stanford University. My conclusion is that one must begin with some very fundamental stuff that is rarely if ever challenged by academics and managers at Universities.

    For example, exactly what is “understanding” and how can it be measured? People often say things like “We have to ensure that students understand (insert subject name)”.  But when you question these folks they often define one set of words with another set of words.

    My position on this is that you have to define things by providing a set of binary propositions for which evidence can be provided to test whether the propositions are either true or false. I’ll give an example from my piloting experience:

    The student understands how to perform a steep turn when the following propositions are all true:

    P1: The steep turn starts and ends on the same heading.
    P2: After entering the turn, the aircraft maintains a constant 60 degree angle of bank (plus or minus one degree) until the roll-out manoeuvre starts.
    P3: The aircraft maintains a constant altitude (plus or minus 20 feet) between the start of the turn until the roll-out has been completed.
    P4: Throughout the turn, the aircraft maintains a constant speed (plus or minus 2 knots)  

    In other words, it seems to me that the only way that you can measure “understanding” is by measuring some kind of observable behaviour. (Same thing goes for “knowledge”)

    Generating Reports
    As you say, if you get the schema right then you can generate any kind of report that you like.

     

    Getting insight into how you can improve the education process.
    This is related to the scope of the model (and by inference the scope of the database). You have to begin by defining the metrics (indicators in business speak). This has nothing whatsoever to do with ORM.  Either you know what you have to measure to monitor the performance of the “education process” or you don’t. I’d say that a good place to start is to define what you mean by “the education process”   

    You said: “It is difficult to anticipate questions”
    Maybe I misunderstand you but I must say that I don’t see what the problem is. If you have implemented a database then you can use SQL to ask any question that you like so you don’t have to “anticipate questions”, you just have to be good at writing SQL queries.

    Adding new types of data
    Again I don’t see the problem. If you want to change your schema than the procedure is to begin by changing your object-role model, then you generate a new schema and then you move the existing data from the “old” schema to the “new” schema. Don’t see what “flexibility” has to do with it.

    “Similar, yet different, ways of looking at the UoD”
    Yes it is sometimes the case that you can model a UoD in several different ways. However, the acid test is whether the domain experts agree with the fact types in each different model.  

    Limiting the UoD to a single academic institution
    Well, I just used that as an example of to highlight the relationship between an atomic fact and a given UoD. It was not my intention to “recommend” that you limited the scope of your model to a single University. If you need to model a UoD with multiple Universities then that’s what you have to do (Don’t forget to follow the CSDP)

    “I am wondering if that decision was correct…”
    Well, at some stage you have to make a decision about the scope of the UoD that you want to model. I don’t see that the concept of “correctness” enters into it.   

    “The choice of reference modes seemed inconsistent with the real world…there is a finite, albeit extremely small, possibility that this notice can change for whatever reason, even though the room itself does not change.”
    Again, you just have to make up your mind about what you want to model. According to Dr Graeme Simsion’s work, data modeling is a creative act not a descriptive act. (See the research page)

     “The same notice may actually appear on doors to rooms in different buildings.” Then you model it by defining a fact type that uses Building (.name) and Room (.code) in a way that means that uniqueness is related to the combination of building and room.   (Same for Campus (.name) and University (.name))

    This is all about figuring out the nature of the “atomic” facts in your UoD.

    Name change for a state university
    I can think of two ways to handle this.
    (a) You accept that whatever name is in the database is “true today” and put good procedures in place to update the appropriate columns.
    (b) Add “temporal” fact types such as:
     “University (.nr) has Name () from Date () to Date ()

    The problem of changing names and identifiers is especially true for buildings. You can treat the changes of building names in the same way as I have suggested that you can change University names.


    There exists a building (or room, or campus, etc.) That building may have one or more names. It also has one or more codes/identifiers that appear in various documents, such as class schedules, maps, etc. to represent the building…Searching for a way to model the distinction between an actual room and the notice on its door.
    However you model it, I recommend that you do NOT use an “ID” (i.e. a GUID) because GUID’s are definitely not related to a conceptual model.  

    I recommend that you use an identifier that you can write down on a piece of paper and give to someone with the instruction “Please go to this room” (You could not do that with a GUID)


    Previous work (before the year 2000) indicated that our facilities management organization had assigned identifiers to rooms (not the same as the notices on the door)……
    Yes, it seems to be the case that very few organisations have got a grip on the “data management” problem. Many organisations that I have seen have the “departmentalitis”   disease. In other words, departments behave as though they are independent and assign their own “identifiers” to all sorts of things. This is just a costly way of creating an expensive mess!

  • 11-29-2012 17:30 In reply to

    Re: Missing Relational View Table?

    The one truly stable identifier that a building has is not a name or an artificially-allocated ID, but a location. Buildings don't tend to wander around. You could assign a stable identifier using GPS coordinates, and then add human names and IDs as non-preferred identifiers, and even include time-relevance to indicate current and past usage. It all depends on the cost of using a stable but uncommonly-used identifier, versus the cost of changing an identifier to track common usage.

    Personally I lean towards needing to support identifier change, including propagation to foreign keys. When my databases need to do this, I use delayed FK enforcement if available, otherwise a stored procedure that suspends enforcement, makes changes and re-applies FK enforcement, all in a transaction. The only real difficulty is with the technologies at play, there's no conceptual difficulty here.

    In regard to the identity of a university, we had a change here a decade or so back where many of our institutions were broken into bits and the bits re-assembled into fewer new institutions, mostly with new names. How do you define identity in the face of change like that? I mean, almost none of the tertiary institutions had the same make-up as before, even though some of the names were carried forward. The real world is complicated. We only ever model enough of it to suit our perceived needs

Page 1 of 1 (9 items)
© 2008-2014 The ORM Foundation: A UK not-for-profit organisation -------------- Terms of Service