HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Is a Self-defining Table of Domains Normalized?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
domainsnormalizedtabledefiningself

Problem

I don't really know the official name of what I've made but it can't be the first time this has been made, so apologizes up front for not knowing every term.

My question, is a table as outlined below normalized?

pk    fk    description
1     1     domains
2     1     people
3     1     cars
4     2     tom
5     2     smith
6     3     vmw
7     2     betty
8     3     ford


The main 'rule' this table confuses me about is A table should store only data for a single type of entity. It stores domain information, but it stores multiple domains' information.

A example use for this table, fetch the domain of people.

select * from domains where fk = 2


which returns

pk    fk    description
4     2     tom
5     2     smith
7     2     betty


The fk is the 'domain' the item is in, which is the pk of the domain. This is what to me is self-defining because this table defines which domains it contains.

Solution

This is a called a "self-referencing table", or "single-table hierarchy". You normally query them with a recursive CTE because you want the children, not just on a single conditional.

Note, normally you place the fk of the root as null

CREATE TABLE foo(pk,fk,description)
AS VALUES
    ( 1 , null , 'domains' ),
    ( 2 , 1 , 'people' ),
    ( 3 , 1 , 'cars' ),
    ( 4 , 2 , 'tom' ),
    ( 5 , 2 , 'smith' ),
    ( 6 , 3 , 'vmw' ),
    ( 7 , 2 , 'betty' ),
    ( 8 , 3 , 'ford' );

WITH RECURSIVE t(pk, fk, description, level) AS (
    SELECT pk, fk, ARRAY[description], 0
    FROM foo
    WHERE fk IS NULL
    UNION ALL
        SELECT foo.pk, foo.fk, t.description || foo.description, t.level+1
        FROM t
        JOIN foo ON (foo.fk = t.pk)
)
SELECT *
FROM t;

 pk | fk |      description       | level 
----+----+------------------------+-------
  1 |    | {domains}              |     0
  2 |  1 | {domains,people}       |     1
  3 |  1 | {domains,cars}         |     1
  4 |  2 | {domains,people,tom}   |     2
  5 |  2 | {domains,people,smith} |     2
  6 |  3 | {domains,cars,vmw}     |     2
  7 |  2 | {domains,people,betty} |     2
  8 |  3 | {domains,cars,ford}    |     2
(8 rows)


Now you can find all {domains,cars} like this.

WITH RECURSIVE t(pk, fk, description, level) AS (
     SELECT pk, fk, ARRAY[description], 0
     FROM foo
     WHERE fk IS NULL
     UNION ALL
      SELECT foo.pk, foo.fk, t.description || foo.description, t.level+1
      FROM t
      JOIN foo ON (foo.fk = t.pk)
    )
    SELECT *
    FROM t
    WHERE description @> ARRAY['domains','cars'];
 pk | fk |     description     | level 
----+----+---------------------+-------
  3 |  1 | {domains,cars}      |     1
  6 |  3 | {domains,cars,vmw}  |     2
  8 |  3 | {domains,cars,ford} |     2
(3 rows)


But the real advantage comes with just a bit more complexity.

INSERT INTO foo(pk,fk,description) VALUES
  ( 9, 8, 'Mustang' );


Now rerunning that same query you ge t

pk | fk |         description         | level 
----+----+-----------------------------+-------
  3 |  1 | {domains,cars}              |     1
  6 |  3 | {domains,cars,vmw}          |     2
  8 |  3 | {domains,cars,ford}         |     2
  9 |  8 | {domains,cars,ford,Mustang} |     3
(4 rows)


Mustang is under ford, and it's all in the domain of cars.

You may also want just the last item of the array in the result set, or a trail of pkids. All of those are minor variations of this same pattern utilizing Recursive CTEs.

All examples tested with PostgreSQL

Code Snippets

CREATE TABLE foo(pk,fk,description)
AS VALUES
    ( 1 , null , 'domains' ),
    ( 2 , 1 , 'people' ),
    ( 3 , 1 , 'cars' ),
    ( 4 , 2 , 'tom' ),
    ( 5 , 2 , 'smith' ),
    ( 6 , 3 , 'vmw' ),
    ( 7 , 2 , 'betty' ),
    ( 8 , 3 , 'ford' );

WITH RECURSIVE t(pk, fk, description, level) AS (
    SELECT pk, fk, ARRAY[description], 0
    FROM foo
    WHERE fk IS NULL
    UNION ALL
        SELECT foo.pk, foo.fk, t.description || foo.description, t.level+1
        FROM t
        JOIN foo ON (foo.fk = t.pk)
)
SELECT *
FROM t;

 pk | fk |      description       | level 
----+----+------------------------+-------
  1 |    | {domains}              |     0
  2 |  1 | {domains,people}       |     1
  3 |  1 | {domains,cars}         |     1
  4 |  2 | {domains,people,tom}   |     2
  5 |  2 | {domains,people,smith} |     2
  6 |  3 | {domains,cars,vmw}     |     2
  7 |  2 | {domains,people,betty} |     2
  8 |  3 | {domains,cars,ford}    |     2
(8 rows)
WITH RECURSIVE t(pk, fk, description, level) AS (
     SELECT pk, fk, ARRAY[description], 0
     FROM foo
     WHERE fk IS NULL
     UNION ALL
      SELECT foo.pk, foo.fk, t.description || foo.description, t.level+1
      FROM t
      JOIN foo ON (foo.fk = t.pk)
    )
    SELECT *
    FROM t
    WHERE description @> ARRAY['domains','cars'];
 pk | fk |     description     | level 
----+----+---------------------+-------
  3 |  1 | {domains,cars}      |     1
  6 |  3 | {domains,cars,vmw}  |     2
  8 |  3 | {domains,cars,ford} |     2
(3 rows)
INSERT INTO foo(pk,fk,description) VALUES
  ( 9, 8, 'Mustang' );
pk | fk |         description         | level 
----+----+-----------------------------+-------
  3 |  1 | {domains,cars}              |     1
  6 |  3 | {domains,cars,vmw}          |     2
  8 |  3 | {domains,cars,ford}         |     2
  9 |  8 | {domains,cars,ford,Mustang} |     3
(4 rows)

Context

StackExchange Database Administrators Q#190428, answer score: 4

Revisions (0)

No revisions yet.