snippetsqlMinor
Should I convert the frontend generated UUIDv6 to binary(16) for use in SQL Server as the clustered primary key?
Viewed 0 times
uuidv6generatedtheclusteredprimarykeyconvertsqlbinaryfor
Problem
Background
As suggested by the front-end developer, I looked into using UUID as the primary key for a bunch of tables in our new system. From learning the pros and cons of random vs. sequential UUIDs, to the use of a non-clustered primary key in combination of a clustered index with a sort-able type, my research pointed me to UUIDv6, and an implementation of it.
It is able to generate UUIDs like the below (that is sequential):
UUIDv1 UUIDv6
------------------------------------ ------------------------------------
5714f720-1268-11e7-a24b-96d95aa38c32 1e712685-714f-6720-a23a-c90103f70be6
68f820c0-1268-11e7-a24b-96d95aa38c32 1e712686-8f82-60c0-ac07-7d6641ed230d
7ada38f0-1268-11e7-a24b-96d95aa38c32 1e712687-ada3-68f0-93f8-c1ebf8e6fc8c
8cc06fd0-1268-11e7-a24b-96d95aa38c32 1e712688-cc06-6fd0-a828-671acd892c6a
9ea6a6b0-1268-11e7-a24b-96d95aa38c32 1e712689-ea6a-66b0-910c-dbcdb07df7a4
Which I thought SQL Server would gladly sort them for me in the clustered primary key (uniqueidentifier) column.
Little did I know how SQL Server would sort an uniqueidentifier column. Here's the ascending sort result:
UUIDv6 uniqueidentifier sorted
------------------------------------
1e712688-cc06-6fd0-a828-671acd892c6a
1e712686-8f82-60c0-ac07-7d6641ed230d
1e712687-ada3-68f0-93f8-c1ebf8e6fc8c
1e712685-714f-6720-a23a-c90103f70be6
1e712689-ea6a-66b0-910c-dbcdb07df7a4
Which is causing fragmentation as if using random UUIDs. This post explains how they were actually sorted.
The real question
Luckily, the system is still in development. Which of these options should I go for next?
UUIDv6 UUIDv6 reordered bytes
------------------------------------ ------------------------------------
1e712685-714f-6720-a23a-c90103f70be6 c90103f7-0be6-a23a-6720-1e712685714f
1e712686-8f82-60c0-ac07-7d6641ed230d 7d6641ed-230d-ac07-60c0
As suggested by the front-end developer, I looked into using UUID as the primary key for a bunch of tables in our new system. From learning the pros and cons of random vs. sequential UUIDs, to the use of a non-clustered primary key in combination of a clustered index with a sort-able type, my research pointed me to UUIDv6, and an implementation of it.
It is able to generate UUIDs like the below (that is sequential):
UUIDv1 UUIDv6
------------------------------------ ------------------------------------
5714f720-1268-11e7-a24b-96d95aa38c32 1e712685-714f-6720-a23a-c90103f70be6
68f820c0-1268-11e7-a24b-96d95aa38c32 1e712686-8f82-60c0-ac07-7d6641ed230d
7ada38f0-1268-11e7-a24b-96d95aa38c32 1e712687-ada3-68f0-93f8-c1ebf8e6fc8c
8cc06fd0-1268-11e7-a24b-96d95aa38c32 1e712688-cc06-6fd0-a828-671acd892c6a
9ea6a6b0-1268-11e7-a24b-96d95aa38c32 1e712689-ea6a-66b0-910c-dbcdb07df7a4
Which I thought SQL Server would gladly sort them for me in the clustered primary key (uniqueidentifier) column.
Little did I know how SQL Server would sort an uniqueidentifier column. Here's the ascending sort result:
UUIDv6 uniqueidentifier sorted
------------------------------------
1e712688-cc06-6fd0-a828-671acd892c6a
1e712686-8f82-60c0-ac07-7d6641ed230d
1e712687-ada3-68f0-93f8-c1ebf8e6fc8c
1e712685-714f-6720-a23a-c90103f70be6
1e712689-ea6a-66b0-910c-dbcdb07df7a4
Which is causing fragmentation as if using random UUIDs. This post explains how they were actually sorted.
The real question
Luckily, the system is still in development. Which of these options should I go for next?
- reorder the bytes so that the most/least significant bytes are where SQL Server expects them to be
UUIDv6 UUIDv6 reordered bytes
------------------------------------ ------------------------------------
1e712685-714f-6720-a23a-c90103f70be6 c90103f7-0be6-a23a-6720-1e712685714f
1e712686-8f82-60c0-ac07-7d6641ed230d 7d6641ed-230d-ac07-60c0
Solution
I would consider another option: cluster on a surrogate key (like
This avoids the problems you called out with options 1 and 2, because you don't have to worry about the sorting / fragmentation issues in your base table, or potentially esoteric issues with clustering on
It will also save you some space (int is smaller than
Id int IDENTITY(1,1) NOT NULL), and make the app-generated UUID a nonclustered primary key.This avoids the problems you called out with options 1 and 2, because you don't have to worry about the sorting / fragmentation issues in your base table, or potentially esoteric issues with clustering on
binary.It will also save you some space (int is smaller than
uniqueidentifier and binary(16)), as the clustering key is included in every nonclustered index (as well as other tables that reference this table via foreign key).Context
StackExchange Database Administrators Q#274919, answer score: 6
Revisions (0)
No revisions yet.