patternModerate
Are all databases reducible to this ultimate abstract database design?
Viewed 0 times
thisabstractdatabasesallaredesignreducibledatabaseultimate
Problem
I've designed a few databases in my time, and on more than one occasion the drive to abstract common elements from specific tables has led me to create generic top-level tables which contain those common elements. For example:
Could be "simplified" ("normalized") as:
Recently I've gone over the deep end with this approach, abstracting and re-abstracting a fairly complex database design until I was left with something both very simple and yet completely un-resembling of the actual data being stored.
This has led me to the conclusion that all databases could be "summarized" in a single monstrous table called "Entries" with columns:
For example:
Here, using these 4 columns in 1 table, I have created two objects sharing a common superclass, given them an attribute, and defined not only a relationship between them but the class of that relationship as well. We could now say "Lettuce is a Topping of Hamburger, both of which are Foods".
There would of course be a set of rules for this system, but that is beyond the scope of this question.
My question is, is this not logically the c
Table Column Column
Hamburgers Item Topping
Cheeseburger Tomatoes
Mushroomburger SwissCould be "simplified" ("normalized") as:
Table Column Column
FoodTypes ID Name
1 Hamburger
2 Topping
Food Item TypeID
Cheeseburger 1
Mushroomburger 1
Tomatoes 2
Swiss 2Recently I've gone over the deep end with this approach, abstracting and re-abstracting a fairly complex database design until I was left with something both very simple and yet completely un-resembling of the actual data being stored.
This has led me to the conclusion that all databases could be "summarized" in a single monstrous table called "Entries" with columns:
ID Type Value1 Value2For example:
ID Type Value1 Value2
4321 Item
8746 Descrip 4321 Food
5673 Item
9876 Descrip 5673 Hamburger
0341 Item
1234 Descrip 0341 Lettuce
5478 Relation 5673 0341
2381 Descrip 5478 Topping
2244 Relation 5673 4321
2160 Descrip 2244 Class
4436 Relation 0341 4321
7547 Descrip 4436 ClassHere, using these 4 columns in 1 table, I have created two objects sharing a common superclass, given them an attribute, and defined not only a relationship between them but the class of that relationship as well. We could now say "Lettuce is a Topping of Hamburger, both of which are Foods".
There would of course be a set of rules for this system, but that is beyond the scope of this question.
My question is, is this not logically the c
Solution
This insight was made at least as far back as 1886, by Alfred Kempe. In fact, he observed that any set of relationships (not just those that are already relations) can be captured as a binary relation. At the time this seems to have come as a shock to Charles Sanders Peirce, who had built his relation algebra on the idea that any relation can be captured by means of ternary relations, and who up to this time apparently believed this to be the best possible. Kempe used graphs to illustrate his notion, while Peirce used edge-labelled graphs. In a long and admiring review Peirce revised his theories to accommodate binary relations as a possible foundation, and distinguished the two approaches:
Mr. Kempe's conception depends upon considering the diagram purely in its self-contained relations, the idea of its representing anything being altogether left out of view; while my doctrine depends upon considering how the diagram is to be connected with nature.
In terms of modern database representations, RDF corresponds to Peirce's ternary relations, while key-value stores correspond to Kempe's graphs. As Raphael pointed out in a comment, these are essentially the ideas behind graph databases.
As an aside, the arguments about binary/ternary/quaternary/quinternary/... tuples being the "best" foundation for data representation still rage on in the W3C semantic web working groups. I believe that the original insight of the relational model was that for efficiency, one should use relations of the right arity; relationships that are naturally represented as tuples should not be transmogrified into more complicated data structures just for reasons of purity. However, this argument also goes the other way: relationships where different instances naturally have different arities should not be shoehorned into a rigid relational schema just to preserve the purity of the data model.
Mr. Kempe's conception depends upon considering the diagram purely in its self-contained relations, the idea of its representing anything being altogether left out of view; while my doctrine depends upon considering how the diagram is to be connected with nature.
In terms of modern database representations, RDF corresponds to Peirce's ternary relations, while key-value stores correspond to Kempe's graphs. As Raphael pointed out in a comment, these are essentially the ideas behind graph databases.
As an aside, the arguments about binary/ternary/quaternary/quinternary/... tuples being the "best" foundation for data representation still rage on in the W3C semantic web working groups. I believe that the original insight of the relational model was that for efficiency, one should use relations of the right arity; relationships that are naturally represented as tuples should not be transmogrified into more complicated data structures just for reasons of purity. However, this argument also goes the other way: relationships where different instances naturally have different arities should not be shoehorned into a rigid relational schema just to preserve the purity of the data model.
- A. B. Kempe, A Memoir on the Theory of Mathematical Form, Philosophical Transactions of the Royal Society of London 177, 1886, 1–70.
- C. S. Peirce, The Critic of Arguments (1892 essay), in Collected Papers of Charles Sanders Peirce, Vol. III, Harvard University Press, 1933, 250–265.
Context
StackExchange Computer Science Q#12060, answer score: 11
Revisions (0)
No revisions yet.