patternMinor
Database normalization: Describing trait - foreign key to table, or varchar field with value?
Viewed 0 times
fieldwithforeignvarcharvaluedescribingdatabasetraitnormalizationtable
Problem
My colleagues and I are having a discussion about the normalization of descriptive traits in the database, such as "status" or "type". Let's call the central table of the discussion "Order".
In my regular design-approach, I would define another table, "OrderStatus", to describe the status of an order, and then create a foreign key with a relationship , i.e. "OrderStatusID", on the "Order" table.
This would give me referential integrity. I'd be able to join the status at all times, and my possible values are always present in the "OrderStatus" table.
My colleague doesn't like this degree of normalization, so he'll instead define a varchar field, "OrderStatus" on the "Order" table. This field would contain the values directly.
The possible values of status are defined in his application, more specifically in an Enum of OrderStatuses, and as such, are not available to me unless I have access to the source code of said application.
I'm used to having the entire context of the database exist in the database as relationships and tables, and having to write "WHERE OrderStatus = 'Sold'" as opposed to "WHERE OrderStatusID = 3" bugs me.
What do think? I'm looking for pro's and con's against both approaches, but I'm primarily concerned about performance and readability/maintainability.
In my regular design-approach, I would define another table, "OrderStatus", to describe the status of an order, and then create a foreign key with a relationship , i.e. "OrderStatusID", on the "Order" table.
This would give me referential integrity. I'd be able to join the status at all times, and my possible values are always present in the "OrderStatus" table.
My colleague doesn't like this degree of normalization, so he'll instead define a varchar field, "OrderStatus" on the "Order" table. This field would contain the values directly.
The possible values of status are defined in his application, more specifically in an Enum of OrderStatuses, and as such, are not available to me unless I have access to the source code of said application.
I'm used to having the entire context of the database exist in the database as relationships and tables, and having to write "WHERE OrderStatus = 'Sold'" as opposed to "WHERE OrderStatusID = 3" bugs me.
What do think? I'm looking for pro's and con's against both approaches, but I'm primarily concerned about performance and readability/maintainability.
Solution
You should use a lookup table
You will get a reporting or MIS or Excel app connecting at some point
You will be asked this
Also, see these:
- Not all clients will use the ENUM in the application
You will get a reporting or MIS or Excel app connecting at some point
- How can do "NOT EXIST" otherwise?
You will be asked this
- You won't know about client enum changes
- Strings are inefficient compared to a tinyint, especially when you need to index it for your WHERE clause
- The data and database will outlive the client application
Also, see these:
- "Proper Use of Lookup Tables"
- "INT or CHAR for a Type Field"
- "Advantages and Disadvantages to using ENUM vs Integer types?"
Context
StackExchange Database Administrators Q#11631, answer score: 9
Revisions (0)
No revisions yet.