HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Extracting entity from attribute in entity-relationship diagram

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
diagramattributeextractingfromentityrelationship

Problem

I am developing a database model where I need to capture information about rent contracts for parking places within a parking. I have thus defined an entity called PARKING CONTRACT.

Requirements (r.) say, that each PARKING CONTRACT should be signed for exactly 1 car, which is, in turn, owned by exactly 1 car owner.

To fully define a PARKING CONTRACT, r. say, I need to record various information about car owner and his car. For each car owner: full name, address, passport number, etc. For each car: technical passport number, registration number, brand, etc.

According to the r. (they do not go into such details, here I am judging on the basis of attributes that need to be defined!), a car owner can be uniquely identified by their passport number; a car can be identified by registration number.

I first thought it would be good to model CAR OWNER and CAR as separate entities, and came to this:

.-----------------.                      .-------.                 .------------.
| PARKING CONTRACT|     Is signed for    |  CAR  |   Is owned by   |  CAR OWNER |
|                 |O_____________________|       |O________________|            |
'-----------------' 1                   1'-------' 1..*           1'------------'


But then I realized that with this design we may store as much CAR OWNERs as we like without any of them having any PARKING CONTRACTs for any of their CARs. I think it is not good - why a database should be able to store information about people which never had any PARKING CONTRACTs.

Another option I see here is to not extract CARs and CAR OWNERs into separate entities and to keep all information about them in the PARKING CONTRACT table, for each PARKING CONTRACT.

But this option doesn't seem to be too nice as well, as the table for PARKING CONTRACT, imho, becomes a chimera which tries to handle too much different information.

Also, I have read this brochure: https://support.ca.com/cadocs/0/CA%20ERwin%20Data%20Modeler%20r8-ENU/Bookshelf_Files

Solution

Unless you have a good reason not to, you should normalize your database schema. When you normalize your database schema you are avoiding potential issues with data consistency that can occur when you have duplicated data.

Google the terms: insert anomaly, update anomaly, deletion anomaly. You will see lots of examples of the kinds of problems that an unnormalized database can cause.

Regarding modeling CONTRACT separately from CAR: I have found that one of the best places to start with a data model is to treat each tangible (or significant intangible) thing that your system cares about as its own table.

The hard reality is that business rules change, but the things that your business cares about don't change - at least not at nearly the same pace. Therefore, you want your data to reflect the reality of the things your business cares about.

People are not contracts. Sure, today your boss says one contract per person per car. What happens when your boss buys (or builds) a new parking garage and that car has a contract for both? What happens when one person sells a car that has a contract? The contract is signed by a person, not by the car? Does the contract go with the car or does it stay with the owner? What happens when your boss tells you he wants to start tracking future-dated contracts so that a car can have both its current contract and its contract for next year?

You don't have to normalize. You could try to get away with one big table for everything. You could just write everything down in a spreadsheet or on a piece of paper. However, if you want to build a system that is flexible for the future, then start by normalizing.

Regarding IDs for your tables: Never use anything external as the ID for something, unless you are totally positive that it will never change. A vehicle VIN is OK as an identifier. Nothing causes a VIN to change and it is guaranteed to be unique. A passport number is a terrible ID because (a) it will definitely change as time goes on and (b) you can't be sure that it will be unique. When primary keys change it causes all kinds of headaches. This is why a lot of people assign internal, meaningless surrogate keys to their tables.

EDIT: Something Else to Consider:

Something else you should consider is that you should NOT try to model (and maintain data for) things which aren't important to your system.

That is just making work for yourself now and down the road. Your system will be more complicated to build, maintain, upgrade and use.

Consider the following ERD:

The relationship between CAR and OWNER is not important to your system! At first this seems a little bit counterintuitive, but think about this: Does your boss care who actually owns a car? No! What they care about is who (PERSON) signed a contract to pay to have which CAR parked in his garage. Therefore the relationships that are actually important are between CONTRACT and CAR, and CONTRACT and OWNER. The relationship between CAR and OWNER is not directly significant to your system, unless you have a business rule somewhere that says something about how contracts stay with cars, not owners when people sell their cars. That is probably dicey from a legal perspective.

Context

StackExchange Database Administrators Q#30983, answer score: 6

Revisions (0)

No revisions yet.