Sample Population Data Management

Last post Wed, Jun 11 2008 20:11 by Matthew Curland. 1 replies.


Wed, Jun 4 2008 18:07

Brian Nalewajek
Joined on Sat, Mar 8 2008
Connecticut
Posts 314

Sample Population Data Management

The ability to verify Fact Types with Fact Instances, using sample population data (SPD), is an important part of the ORM methodology. How should an ORM tool, like nORMa, make the process of adding and managing sample data more efficient and effective?

During the tool development process, different areas, features and aspects of the tool have been given greater or lesser priority. It seems that Sample Population management has been given a lower priority than other tool features. That's understandable. Eventually, a complete tool will need a better system for handling Sample Population Data.

Though a number of improvements have been included in various CTPs, there's still much that can be made better. A number of requests have been put forward to make it easier to tab through value entry boxes, and so on. While simple improvements like these would be welcome, are there more important considerations for SPD management?

Consider the current way that the tool flags errors, where SPD conflicts with a Fact Type constraint:

Employee(.id) has FirstName() [with an IUC over Role1]

Populating this with sample data gives an error if the constraint is violated (as it should), and no error if it is not. Now add:

Employee has LastName [with a mandatory and IUC on Role1]

If you do not enter SPD for this Fact Type, there will be as many error messages as you had sample instances in the first FT population, since the mandatory constraint was not satisfied for any of them. The tool correctly adds the error messages for the violations, but is this useful? I don't think the intent is to force an "all or nothing" use of sample data, any more than it is to flood the errors window. How should the SPD feature and the error messaging feature work together? Is the answer to use customization to filter the error reports, or to change the way the SPD feature works?
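To make the scenario concrete, here is a rough sketch in Python of the two checks described above. It is not how the tool implements validation; the function names, the population layout (a list of `(employee_id, value)` pairs), and the message text are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical sample populations: each fact instance is an (employee_id, value) pair.
first_name_pop = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
last_name_pop = []  # no sample data entered for "Employee has LastName"


def uniqueness_errors(population):
    """IUC over Role1: each employee may play the role at most once."""
    counts = Counter(emp for emp, _ in population)
    return [f"Employee {emp} appears {n} times"
            for emp, n in sorted(counts.items()) if n > 1]


def mandatory_errors(known_employees, population):
    """Mandatory on Role1: every known employee must appear in the population."""
    populated = {emp for emp, _ in population}
    return [f"Employee {emp} has no instance"
            for emp in sorted(known_employees - populated)]


# Employees are "known" once they appear in any sample population.
known = {emp for emp, _ in first_name_pop} | {emp for emp, _ in last_name_pop}
iuc = uniqueness_errors(first_name_pop)
missing = mandatory_errors(known, last_name_pop)
print(len(iuc), len(missing))
```

With an empty LastName population, the mandatory check reports one error per employee already entered for FirstName, which is the flood of messages in question.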

If you have thoughts on this or other aspects of the SPD feature of the tool, please add them to this thread. Do you use the feature? If not, what would it need to become useful to you? When the time comes to focus on the Sample Population Data management features of the tool, the team will have your input to guide them.

BRN..

Wed, Jun 11 2008 20:11 In reply to

Matthew Curland
Joined on Sat, Mar 8 2008
Posts 450

Re: Sample Population Data Management

A few thoughts on the area:

Error display is currently all-or-nothing for any given error. There are actually four places that errors are displayed (Error List, Model Browser, Diagram, and the diagram context menu). Being able to pick and choose among these (and possibly future) display targets might alleviate some of the clutter. If you don't know where to do this, click on the diagram background, open the Properties Window, and modify the 'ErrorDisplay' property. You can turn off display of both individual errors and categories of errors.
It might be nice to offer more control over the error categories from the point where the error is displayed. For example, a command to 'not display any errors of this type' and another 'Display hidden errors of type' command could be provided. This would mean that the error instance/type of error association would be significantly more discoverable.
These are real errors, and we plan to keep them as such at the core model, even if the user chooses not to display them. The goals go well beyond the sample population of individual FactTypes, and we think that the non-isolated FactType populations are more powerful. A few things that are on the future wish list are below. Note that these take the sample validation well beyond the current level, but you would naturally have to request a higher validation and/or display level. The goal is to use the populations to fully validate the model and to produce test cases that verify all constraint enforcement.

Support multi-fact population, including populations of entire table rows from relational and other views. This would reduce the number of items you would need to select to populate multiple FactTypes.
Populate and validate based on constraint selection.
Provide official counter examples (POPA is a counter example to POPB because of CONSTRAINTC).
Be able to validate sufficient population for initial DB population covering all constraint cases.
Be able to validate sufficient counter example population to generate negative test cases for all constraints.
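As a loose illustration of the counter example idea (POPA is a counter example to POPB because of CONSTRAINTC), the following Python sketch picks out the candidate facts that break a constraint when added to a valid population. Everything here is hypothetical: the function names, the tuple-based population, and the uniqueness check standing in for CONSTRAINTC are assumptions, not part of the tool.

```python
def violates_uniqueness(population):
    """True if any first-role value repeats (a stand-in for an IUC check)."""
    firsts = [fact[0] for fact in population]
    return len(firsts) != len(set(firsts))


def counter_examples(valid_population, candidates, constraint):
    """Return the candidate facts that make a valid population invalid."""
    if constraint(valid_population):
        raise ValueError("base population must satisfy the constraint")
    return [c for c in candidates if constraint(valid_population + [c])]


pop_b = [(1, "Ann"), (2, "Bob")]      # valid population (POPB)
candidates = [(3, "Cy"), (1, "Al")]   # facts that might be added
negatives = counter_examples(pop_b, candidates, violates_uniqueness)
print(negatives)
```

Each fact returned is a candidate negative test case: inserting it into a database generated from the model should be rejected by the corresponding constraint.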

We will provide more error display in the population editor itself. We actually had red backgrounds at one point, but they were never checked in because of a VirtualTreeGrid bug (blank cells, such as the area beside a complex expansion, were not resetting the background color, so you got red splashed in random places in the grid). However, it is relatively low-hanging fruit to show error glyphs in the population grid. We could also show other glyphs to indicate the type of value, which would provide more graphical indications as to what to expect in terms of editing capabilities when the cell is activated.
We already do some error activation on these, so a double-click in the Error List or on any red-hashed element will take the first steps towards fixing the problem. Obviously, we can't do this in all cases (which FactType do you populate for a disjunctive mandatory?), but we do it when we can.

Of course, in all of these we need to balance the need to not overwhelm the user with too much information and immediate requirements, which is what I believe you're feeling now. The ability to both verify and test with sample population data is very important, as we've found that many modelers spend significantly more time testing the generated output than creating the model. The more of this we can do automatically, the more valuable a tool we'll have. I'd recommend just turning off display for the 'Missing Mandatory Sample Population' error to limit your error display to issues within a single FactType.
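The distinction between keeping an error in the model and choosing not to display it can be sketched roughly as below. This is Python for illustration only and does not reflect the actual 'ErrorDisplay' implementation; the `ModelError` class, the `visible` function, and the 'Uniqueness Violation' category name are hypothetical. Only the 'Missing Mandatory Sample Population' name comes from the tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelError:
    category: str
    message: str


# All errors stay in the core model, whether or not they are displayed.
model_errors = [
    ModelError("Missing Mandatory Sample Population", "Employee 1 has no LastName"),
    ModelError("Missing Mandatory Sample Population", "Employee 2 has no LastName"),
    ModelError("Uniqueness Violation", "Employee 3 has two FirstNames"),
]

# Categories the user has chosen not to display.
hidden_categories = {"Missing Mandatory Sample Population"}


def visible(errors, hidden):
    """Filter for display only; the underlying error list is left untouched."""
    return [e for e in errors if e.category not in hidden]


shown = visible(model_errors, hidden_categories)
print(len(model_errors), len(shown))
```

Clearing `hidden_categories` brings the hidden errors back, since nothing was removed from the model.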

-Matt

The ORM Foundation