patternModerate
Choosing primary keys: Scientific names of species or system-assigned numeric identifiers?
Viewed 0 times
primarynumericscientificsystemnameskeysassignedidentifierschoosingspecies
Problem
Scientific names are well accepted unique identifiers of species. However, scientific names have special characters such as spaces and dots. Can it be a good unique identifier for databasing or should one assign/implement unique numerical identifiers?
If it is with the latter, is there a globally known numeric identifier? Or should one assign own identifiers for the rows?
If it is with the latter, is there a globally known numeric identifier? Or should one assign own identifiers for the rows?
Solution
I would use my own identifier.
The species name may be unique but
For example, in SQL Server, if used as the clustered primary key it will be used in non-clustered indexes, thus repeating the long string. And typically foreign keys omn child tables will go to the primary key, thus repeating it again
As a string, you have the overhead of sorting and comparison (case, accents, etc).
Using a surrogate numeric key avoids these problems: but you must create a unique non-clustered index on the species name.
Is species name a good long term identifier too? Not my area of expertise but don't many species have alternate names, or controversies, or get reclassified, or "maybe this species"?
Example: how many Giraffe species are there? 9? 2? 6? 8? 4?
The species name may be unique but
- it's too long
- it's a string
For example, in SQL Server, if used as the clustered primary key it will be used in non-clustered indexes, thus repeating the long string. And typically foreign keys omn child tables will go to the primary key, thus repeating it again
As a string, you have the overhead of sorting and comparison (case, accents, etc).
Using a surrogate numeric key avoids these problems: but you must create a unique non-clustered index on the species name.
Is species name a good long term identifier too? Not my area of expertise but don't many species have alternate names, or controversies, or get reclassified, or "maybe this species"?
Example: how many Giraffe species are there? 9? 2? 6? 8? 4?
Context
StackExchange Database Administrators Q#197363, answer score: 12
Revisions (0)
No revisions yet.