HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Is there a better way to handle a multi-level ParentId table structure?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
parentidmultihandlelevelwaybetterstructuretheretable

Problem

I work for a publisher, and our products are mainly books and journals. They are most commonly structured as follows:

Book > Chapter

Book Series > Book > Chapter

Book > Volume > Chapter

Book Series > Book > Volume > Chapter

Journal > Volume > Issue > Article

Journal > Volume > Article


We currently have all of these records stored in the same table with Id and ParentId columns. For example, a book with TitleId = 1 that has 3 chapters would have the following rows:

Book: Id = 1, ParentId = 1
Chapter #1: Id = 2, ParentId = 1
Chapter #2: Id = 3, ParentId = 1
Chapter #3: Id = 4, ParentId = 1


All these records, whether they're books, chapters, journals, articles, etc., can have their Id connected to other tables for things like authors, prices, ownership, etc.

The problem this structure creates for us is that the nesting adds tons of overhead in certain situations. For example, if someone tries to access a journal article they purchased, we need to run multiple queries to know if they do, in fact, have access. We have an ownership table containing Id's of owned products, so we not only need to check if the user owns the journal article's Id directly, but we also need to check for ownership of the Id's for the article's parent issue, volume, and journal. (I.e., if the user owns the entire journal, it's implied they own all of the articles within that journal despite not having explicit access.) So our main "ownership" query is rather bulky since we need it to check for ownership at all possible levels of nesting.

There are many other similar situations. For example, if we need to get all the authors associated with a book, we need to run a query for the book directly, then walk down to each chapter and get all the authors for each one, or if the book has volumes, we need to walk down each volume, and get the authors for each volume's chapters.

Another situation is searching, where we need to basically aggregate all these different types to be se

Solution

I would always store the intermediate entities, possibly by adding missing ones as dummy entries, so that you can query using always the same query.

E.g., store

Book > Chapter

as

Book Series (dummy) > Book > Volume (dummy) > Chapter

Now, you can query
SELECT *
FROM
BookSeries bs
LEFT JOIN Book b ON bs.BookSeriesID = b.BookSeriesID
LEFT JOIN Volume v ON b.BookID = v.BookID
LEFT JOIN Chapter c ON v.VolumeID = c.VolumeID


where BookSeries and Volume will yield NULL columns, except for primary and foreign keys.

So, always store starting with BookSeries and end at the desired level. So If you want to store a book with no volume and no chapter, this is fine. The query will yield NULL columns for volumes and chapters (inlcuding for primary and foreign keys). This is ensured by the LEFT JOIN.

Journals have a different structure. Here two approaches are possible:

  • Store journals as independent hierarchy in other tables.



  • (partially) unify the two hierarchies and use a Type or Kind field as a discriminator to differentiate between books and journals



BookSeries > BookOrJournal > Volume +--> Issue > Article
                                    |
                                    \--> Chapter

Code Snippets

BookSeries > BookOrJournal > Volume +--> Issue > Article
                                    |
                                    \--> Chapter

Context

StackExchange Database Administrators Q#292161, answer score: 6

Revisions (0)

No revisions yet.