patternMajor

Entity relationship problem

Submitted by: @import:stackexchange-dba·Mar 10, 2026·

Viewed 0 times

problementityrelationship

Problem

I have 4 tables related like this (it's an example):

Company:
ID
Name
CNPJ

Department:
ID
Name
Code
ID_Company 

Classification:
ID
Name
Code
ID_Company

Workers:
Id 
Name
Code
ID_Classification
ID_Department

Suppose that I have a classification with id = 20, id_company = 1.
And a department that has id_company = 2 (that represents another company).

This will allow the creation of a worker that is from two companies because the classification and department are linked to the company separately. I don't want that to happen, so I think I have a problem with my relationships and I don't know how to solve it.

Solution

I don't think you have a problem with the relationships. I think instead the problem is that by using surrogate keys (ie Ids) for each table the resulting database is unable to prevent Workers from being inserted whose Department is of one Company while the Classification is of another and vice versa. A good way to understand this is to visualize the schema using an ER Diagramming tool. I will use the Oracle Data Modeler tool which is a free download.
ER Diagram

As it stands, you could have 2 companies - say IBM and Microsoft. IBM can have a Software Development department, and Microsoft can have a Desktop Software department. IBM can have a Software Engineer classification, and Microsoft can have a Software Developer classification. Now, because you have a surrogate key for Department and Classification, the fact that Software Development is an IBM department and Desktop Software is a Microsoft department is lost for future child relationships. This is also the case with Classification. Therefore it is easy to accidentally assign Harlan Mills, who is an IBM employee in the Software Development department, a classification of Software Developer which is a Microsoft classification! Likewise, the worker could be given the right classification and wrong department! Here is a diagram showing the first example:

The 1 Ids represent IBM, and the 2 Ids represent Microsoft. I've highlighted in red the scenario where Harlan Mills and Bill Gates are assigned to the wrong departments, which is visualized by the 10 department Id associated to the 200 classification Id and vice versa.
Options to Resolve

So what are the options to prevent his from happening? There are two immediate options. The first is to realize that by using a surrogate key for every table this issue exists and introduce additional programming to verify it doesn't occur. This could be done in the application, but if inserts and updates can occur outside of the application then incorrect associations can still occur. A better approach would be to create a trigger that fires on insert and update of an employee to make sure that the assigned department is of the same company as the assigned classification, and if not fail the insert or update.

The second option is to not use surrogate keys for every table. Instead, use the surrogate keys only for the Company table, which is fundamental and has no parents, and then create identifying relationships to the Department and Classification child tables. The Department and Classification tables now have a PK of the Company Id plus a Sequence Number or Name to distinguish them. Then, the relationships from Department and Classification to Worker also become identifying and thus the PK of Worker becomes the Company Id, plus the Department Number (I'm using a sequence number in this example), plus the Classification Number. The result is there is only one Company Id in the Worker table. It is now impossible to assign a Worker to a Department in one Company and to a Classification in another Company.

Why is this impossible? It is impossible because the schema implements referential integrity between Worker and Department and Classification. If an attempt is made to insert a Worker for a Department in one Company and and a Classification of another, the combination that does not exist in the corresponding parent table will trigger a referential integrity violation and the insert will not work.

Here is an updated diagram of an implementation of the second option:

Preferred Option

Of the two options, I absolutely prefer the second - using the identifying relationships and the cascading keys - for two reasons. First, this option achieves the desired rule with no additional programming. Developing a trigger is not trivial. It must be coded, tested, and maintained. Ensuring the trigger logic is optimal so as not to impact performance is also not trivial. The book Applied Mathematics for Database Professionals provides a lot of detail on the complexity of such a solution. Second, the rules imply that a Department and a Classification can not exist outside the context of the Company, and so the schema now more accurately reflects the real world.

This is a great question because it shows exactly why simply assuming every table requires a surrogate key is a bad idea. Fabian Pascal has an excellent blog post on just this topic showing that not only can a surrogate key be a bad idea from a data integrity standpoint it can also result in making some retrievals slower at the physical level precisely because joins are required that, had keys been properly cascaded, would be unnecessary. Another interesting topic this question reveals is that a database cannot ensure that all data inserted into it is accurate with respect to the real world. Instead, it can only ensure that the data inserte

Context

StackExchange Database Administrators Q#112576, answer score: 25

Revisions (0)

No revisions yet.