HiveBrain v1.2.0
Get Started
← Back to all entries
snippetModerate

How can I losslessly decompose this table?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thiscandecomposelosslesslyhowtable

Problem

I have a table with the following PK (primary key):

child_id   integer
parent_id  integer
date       datetime


child_id and parent_id are FKs (foreign keys) to entity tables. The child table also contains a FK to the parent table, and each child_id references the same parent_id as expected by the table above. Code keeps the two in sync.

To remove redundancy I decompose to the following:

Table_1 PK:
child_id   integer
date       datetime

Table_2 PK:
parent_id  integer
date       datetime

Table_3: (already exists)
child_id   integer PRIMARY KEY
parent_id  integer FOREIGN KEY


When I join these I recover the original table. It's my understanding that makes this 5NF.

However, the dates associated with a given child_id must be a subset of the dates associated with the corresponding parent_id. The original table enforces this rule.

My decomposition does not enforce the rule, because you can add to Table_1 until the dates get too large.

Is this decomposition 5NF? While it permits insertion anomalies, it also follows the Wikipedia example, which itself follows A Simple Guide to Five Normal Forms in Relational Database Theory. The latter's phrase (emphasis mine) "we can reconstruct all the true facts from a normalized form consisting of three separate record types" gives me special pause, since no matter how much garbage I pump into Table_1, the natural join ignores it.

I don't like this decomposition. The practical solution is to leave the table and code as they are. How can I decompose and/or add constraints such that I get away from the original table and preserve my business rule?

Solution

Normalization is based on functional dependencies. Functional dependencies have to do with semantics; they have to do with what the data means. When you simplify a real-world problem to the level of "parent_id, child_id, date", and you don't include any sample data, you really limit how much help a conscientious database designer can give you.

The fact that you have a key {child_id, parent_id, date} in one table, and that you have (it seems) a unique pair {child_id, parent_id} in the child table doesn't necessarily mean that part of the combination is redundant. It might mean that in the table that has {child_id, parent_id, date} as the primary key, the pair of attributes {child_id, parent_id} ought to reference the child table in the first place.

If that's the case, you might use FOREIGN KEY (child_id, parent_id) REFERENCES child (child_id, parent_id). To do that, you need a UNIQUE constraint on the pair of columns (child_id, parent_id) in the table "child", which shouldn't be a problem if child_id is its primary key.

But there's no way to tell without knowing what the data means, and you're the only one in this thread who knows that. (But we'd be happy to let you explain it to us.)

As far as the original table is concerned, you seem to be saying that child_id -> parent_id. If that's the case, why is parent_id in the original table in the first place? Why isn't the key just (child_id, date), with a foreign key reference to the "child" table? It looks to me like the kind of redundancy you're talking about might be resolved by dropping the column "parent_id".

SQL DDL and sample data in the form of INSERT statements helps us help you. DDL and INSERT statements are more precise than descriptions.

Context

StackExchange Database Administrators Q#3542, answer score: 10

Revisions (0)

No revisions yet.