HiveBrain v1.2.0

3NF Normalization Problem Employing an Algorithm

Submitted by: @import:stackexchange-dba

Problem

I'm stuck trying to figure out how to obtain the third normal form (3NF) using an algorithm (removing attributes and then functional dependencies).

The relation R( A,B,C,D,E,F,G,H,I,J,K ) has the following dependencies:

  • A → BEFG
  • ACDI → H
  • CD → J
  • CDF → K
  • B → G
  • CF → ABE
  • AC → D
  • ABD → C
  • EG → B



And the result is:

  • R1(B#, G)
  • R2(E#, G#, B)
  • R3(A#, B, E, F)
  • R4(C#, D#, J)
  • R5(A#, C#, I#, H)
  • R6(A, C, D, F, K), with primary key one of AD, AC, or CF



When I tried to solve it, I came up with:

  • R1(B#, G)
  • R2(E#, G#, B)
  • R3(A#, B, E, F)
  • R4(C#, D#, J)
  • R5(A#, C#, I#, H)

and

  • R6(A#, C#, D)
  • R7(A#, D#, C)
  • R8(C#, F#, A, K)



Which is similar, but splits the last relation into three.

Why is that the answer?
(I took it from the book.)

Solution

You have not specified the algorithm used for the decomposition. Moreover, both decompositions (yours and the book's) contain relations whose attributes are included in other relations, which is something that is normally avoided, since it is a form of redundancy.

For instance, having R6(A C D) and R7(A C D) (remember that the order of attributes in a relation schema is not relevant) means that you have two relations with exactly the same attributes, and exactly the same functional dependencies projected from the original ones (AD → C and AC → D). They therefore have the same candidate keys (AD and AC), and choosing one of them as the primary key in one table and the other in the second table is meaningless.
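Not part of the original answer, but these candidate keys can be checked mechanically by computing attribute closures under the projected dependencies. A minimal Python sketch (the function name `closure` is my own):

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) functional dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left part is covered, absorb the right part
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# FDs projected onto R6/R7(A, C, D): AD -> C and AC -> D
fds = [({'A', 'D'}, {'C'}), ({'A', 'C'}, {'D'})]

print(closure({'A', 'D'}, fds))  # {'A', 'C', 'D'}: AD is a candidate key
print(closure({'A', 'C'}, fds))  # {'A', 'C', 'D'}: AC is a candidate key
print(closure({'A'}, fds))       # {'A'}: A alone is not
```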

A similar problem occurs in the schema R2(E G B), where the projected functional dependencies are EG → B and B → G, with candidate keys EB and EG; the advantage of having a separate table R1 with only the attributes B and G, and with key B, is unclear.

In fact, a frequently used algorithm, presented in almost all database textbooks, is the so-called “synthesis” algorithm, which avoids this kind of redundancy while guaranteeing both dependency preservation and the lossless-join property. Applied to this case, it produces the following decomposition:

R1(A B E F), with candidate key A

R2(A D H I), with candidate key A D I

R3(A C D), with candidate keys AC and AD

R4(C D J), with candidate key CD

R5(A C F K), with candidate key CF

R6(B E G), with candidate keys EG and EB


Basically, starting from a minimal (canonical) cover of the dependencies, this algorithm groups all the dependencies with the same left-hand side and produces one relation per group containing all the attributes of the functional dependencies in that group (the left-hand side is a candidate key of that relation). Then all relation schemas included in others are eliminated (to avoid the kind of redundancy discussed above). Finally, if no relation schema in the decomposition contains one of the original candidate keys, a new schema with one (any one) of the original candidate keys is added. As said above, this algorithm is guaranteed to produce a dependency-preserving, lossless-join decomposition.
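The steps above can be sketched in Python. This is an illustrative sketch, not the book's code: it assumes the input dependencies already form a minimal cover (computing one is a separate step not shown here), and the function name `synthesize` is my own.

```python
def synthesize(fds, candidate_key):
    """Simplified 3NF synthesis, assuming fds is already a minimal cover.

    fds: list of (lhs, rhs) pairs of attribute sets.
    candidate_key: a candidate key of the original relation.
    """
    # Step 1: group dependencies with the same left-hand side
    groups = {}
    for lhs, rhs in fds:
        groups.setdefault(frozenset(lhs), set()).update(rhs)
    # Step 2: one schema per group (the left part plus all its right parts)
    schemas = [set(lhs) | rhs for lhs, rhs in groups.items()]
    # Step 3: drop any schema strictly contained in another
    schemas = [s for s in schemas if not any(s < t for t in schemas)]
    # Step 4: if no schema contains a candidate key of the original
    # relation, add one schema consisting of that key
    if not any(set(candidate_key) <= s for s in schemas):
        schemas.append(set(candidate_key))
    return schemas

# Small hypothetical example: R(B, C, D, E, G, J) with B -> G, EG -> B, CD -> J;
# BCDE is a candidate key of R. Step 3 drops BG (contained in BEG),
# and step 4 adds BCDE because no schema contains the key.
fds = [({'B'}, {'G'}), ({'E', 'G'}, {'B'}), ({'C', 'D'}, {'J'})]
for s in synthesize(fds, {'B', 'C', 'D', 'E'}):
    print(sorted(s))
```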


Context

StackExchange Database Administrators Q#209806, answer score: 3

Revisions (0)

No revisions yet.