Is it worth it to create an "Id" column?
Problem
I have a table which has a column Name, which is of type VARCHAR2(20), unique, cannot be null, and cannot be changed. At first I used it as the primary key for this table, but I was wondering whether it would be better to create another column, Id (with a more appropriate name, of course), and handle table relations with it.
I know it's a common best practice to have an Id column, but I've heard that cluttering the DB with tons of meaningless Id columns is to be avoided (a Name column has more semantics attached to it).
What would you advise in this situation?
Solution
As name is unique and will never change, it is certainly a good candidate key from a relational theory point of view.
You might find an integer surrogate key preferable for space and performance reasons, though: it will take less space than the text in every table that has a foreign key to this table (and in every index, as FKs are usually indexed columns). Operations that search or join on an integer column will be faster too, though for queries that join in this table and need to output and/or sort by the name, the extra work of going from id to name may remove some of that benefit. Of course, the space and performance differences may not be significant enough to matter to your project, at which point this comes down to preference.
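To make the trade-off concrete, here is a minimal sketch of the surrogate-key layout. It uses Python's built-in sqlite3 module purely so it can be run anywhere; SQLite's TEXT stands in for VARCHAR2(20), and the product and order_line tables and their columns are hypothetical names, not anything from the question.

```python
import sqlite3

# In-memory database; SQLite is only a stand-in for the asker's DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- Hypothetical parent table: integer surrogate key as primary key,
    -- while name stays unique and not null (still a candidate key).
    CREATE TABLE product (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- Hypothetical child table: the foreign key stores the small integer,
    -- not a copy of the name.
    CREATE TABLE order_line (
        id         INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES product(id),
        quantity   INTEGER NOT NULL
    );
    CREATE INDEX order_line_product_idx ON order_line(product_id);
""")

conn.executemany(
    "INSERT INTO product (id, name) VALUES (?, ?)",
    [(1, "widget"), (2, "gadget")],
)
conn.executemany(
    "INSERT INTO order_line (product_id, quantity) VALUES (?, ?)",
    [(1, 3), (2, 5), (1, 2)],
)

# Output sorted by name now needs the join back to product: this is the
# "extra work going from id to name" mentioned above.
rows = conn.execute("""
    SELECT p.name, SUM(o.quantity)
    FROM order_line AS o
    JOIN product AS p ON p.id = o.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(rows)
```

With name as the primary key instead, order_line would carry the name itself and this particular query would not need the join, at the cost of a wider foreign key column and index.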
In fact, I suspect even a UUID might be faster than using name, assuming your DBMS has a proper UUID type so that values are stored in compact binary form rather than as text. The type's fixed 16-byte length is probably longer than the average length of your name values, but comparisons between fixed-length binary values are faster than those between variable-length strings (it would be interesting to benchmark this and see whether the difference is indeed significant). That said, unless you have a reason to use a UUID (replication issues or such), the smaller integer type will of course be more efficient and take a quarter of the space (4 bytes against 16).
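As a rough illustration of the sizes being compared, the sketch below measures the raw byte length of each kind of key value in Python. The sample name is made up, and these are value sizes only: actual on-disk sizes depend on how your DBMS stores each type, so treat the numbers as an approximation rather than a measurement of any particular database.

```python
import struct
import uuid

name = "ACME_WIDGET_01"              # hypothetical Name value, 14 characters
name_bytes = name.encode("utf-8")    # what a text key would have to carry

int_key = struct.pack("<i", 42)      # 32-bit integer surrogate key
uuid_value = uuid.uuid4()
uuid_binary = uuid_value.bytes       # compact binary form of a UUID
uuid_text = str(uuid_value)          # the same UUID stored as text

sizes = {
    "int32": len(int_key),
    "uuid_binary": len(uuid_binary),
    "uuid_text": len(uuid_text.encode("ascii")),
    "name": len(name_bytes),
}
print(sizes)
```

The gap between uuid_binary and uuid_text is why the answer only expects a benefit when the DBMS has a proper UUID type.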
tl;dr: the name column as described is a perfectly good candidate key (perfect in theory, in fact), but in practice an integer surrogate key will be more efficient in both space and processing.
Context
StackExchange Database Administrators Q#54570, answer score: 6