I thought that Andy's request was quite reasonable. To justify that, I will return to your claims:
Isn't it the case that relational theory breaks down when missing values (nulls) are encountered in a table?
In Marc's case, (see my amended version of Andy's model below), the Claimant's FederalAgency and PublicOrganization are non-mandatory fact types. In the NORMA mapping, we would have a Claimant table with FederalAgencyId and PublicOrganizationId. in Terry's horizontal notation, it looks like this:
Claimant(claimantId, countryCode, hasUnlistedWorkHistory, [federalAgencyId], [publicOrganizationId])
Instead, if we insist on not using NULLs, we can objectify and make independent the two "works for" fact types. That transforms the schema to this:
Claimant(claimantId, countryCode, hasUnlistedWorkHistory)
Now, consider a populated database of a million claimants, a hundred FederalAgencies, and a hundred PublicOrganization, where a quarter of the Claimants work for a FederalAgency and another quarter work for a PublicOrganization. To discover the cost of living for all claimants requires fetching all million records. Because the number of Countries, FederalAgencies and PublicOrganizations is small, we may assume that they require a single disk access each (at most, because they'll stay in cache). The Claimants table in the schema with NULLs is at most 2 megabytes bigger than the version without NULLs - that's assuming the worst case where a NULL is stored as a two-byte indicator rather than as a single bit vector. But the second schema has two additional tables, of a quarter of a million records each. Instead of scanning one table (and joining with cached data), we now must join three tables (and join with the cached tables too). If the results are fetched incrementally, the disk is seeking between three tables, instead of doing a linear scan across a single table - this incurs an order of magnitude performance loss.
If we insist on making the relational view available, but still need the performance advantages of storage using NULLs, the derive the two relations (ClaimantWorksForFederalAgency and ClaimantWorksForPublicOrganization) as views, and ignore the nullable columns (or hide them behind other views). This hiding is what the DBMS tools should be doing for us!
In addition, every time we wish to know where a Claimant works, we must write a 3-way join. It still doesn't get rid of the CASE or 3-way conditional statement, it just makes everything more complicated. The operators relational model doesn't provide a simple solution to avoiding this conditional. The simplest way I can see of doing it is to derive three new relations "ClaimantCostPerTheirHomeCountry", "ClaimantCostPerTheirFederalEmployer" and "ClaimantCostPerTheirPublicOrganization", with a fixed precedence column in each, then union those three into one relation, finally choosing the tuple for each claimant which has the highest precedence.
While that's a nice little theoretical exercise, it's a massively useless solution to a problem that simply doesn't exist. It is the insistence on using this kind of logic that evokes claims of zealotry.
A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines - Ralph Waldo Emerson