Is there a way to allow a `unique` key to be the same on a max of 3 records?
Problem
I have a table with urls in it. I don't want my program to put the same url in more than three records. It has a reason for three, but any more is too much. Any idea how I can make that happen?
Solution
Are the URLs coming from a normalized URL table, and hence you just have UrlIDs in this table, or is this the source table for the URLs? The distinction is important because you are implying that, if this table is the source of the URLs (which it sounds like it is), you are planning on creating a UNIQUE constraint or index on the URL field. Even if those did allow specifying the number of permitted entries per distinct value, the plan still wouldn't work because index keys only allow up to 900 bytes. A lot of URLs are way over that limit, and considering that you should be using NVARCHAR to store the URLs (they can contain Unicode characters, even if that is somewhat rare; hence you should also enable DATA_COMPRESSION if you are using Enterprise Edition and SQL Server 2008 or newer), that is only 450 characters (generally speaking), which is even more limiting.

You should create an AFTER INSERT, UPDATE trigger to enforce this rule. Using a trigger will allow you to enforce the rule regardless of the source of the INSERT or UPDATE, whereas putting this logic into stored procedures is not guaranteed to catch all attempts to insert or update the table, especially ad hoc queries.
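The trigger approach could be sketched along these lines; the table and trigger names here are hypothetical (the question doesn't give a schema), and the sketch assumes a table `dbo.UrlLog` with a `Url NVARCHAR(2048)` column:

```sql
-- Sketch only: assumes a hypothetical table dbo.UrlLog (UrlLogID, Url NVARCHAR(2048)).
-- RAISERROR is used instead of THROW so this also works on SQL Server 2008.
CREATE TRIGGER dbo.trUrlLog_MaxThreePerUrl
ON dbo.UrlLog
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- If any URL touched by this statement now appears in more than 3 rows,
    -- undo the entire statement.
    IF EXISTS
    (
        SELECT 1
        FROM   dbo.UrlLog ul
        WHERE  ul.Url IN (SELECT ins.Url FROM inserted ins)
        GROUP BY ul.Url
        HAVING COUNT(*) > 3
    )
    BEGIN
        ROLLBACK TRANSACTION;
        RAISERROR(N'A URL may appear in at most 3 rows.', 16, 1);
    END;
END;
```

Because this is an AFTER trigger, the check runs against the already-modified table, so a single statement inserting four copies of the same URL is also rejected, not just the fourth row of four separate inserts.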
In terms of performance, depending on how many rows this table will have over time, it certainly will slow things down a little to have to do this check each time. You should find ways to narrow the search down to a small subset of the data so that you don't hold up all INSERT and UPDATE queries. Some things to consider:
- Normalize the URLs out to another table. URLs can be up to 2048 characters (I believe), so if you know that you will generally have repeats, create a separate URL table that holds the unique URLs and place just a UrlID in this table. You can then easily index the UrlID foreign key field. This approach will save tons of disk space, make the trigger much faster, and actually make most queries against this table faster (especially those that don't need the URL text), since you will fit many more rows on each data page.
- If you need to keep the URL in this table, add a UrlHash VARBINARY field and populate it with the result of HASHBYTES. The UrlHash field can also be indexed, in which case you could consider adding the Url field to that index as an INCLUDE column (which doesn't have the 900-byte limit, but also is not sorted and has no statistics). Your trigger can then search on the HASHBYTES value of the incoming URL, and any entries that match on the hash are then compared on the full URL text. Yes, you need to compare both when the hash values match, since different source strings can collide on the same hash value; but the hash comparison easily eliminates non-matches, which should be the vast majority of the time.
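A minimal sketch of this hashed-lookup variant, with hypothetical table and index names; note that the SHA2_256 algorithm requires SQL Server 2012 or newer (on 2008 you would use SHA1 with a VARBINARY(20) result instead):

```sql
-- Sketch only: hypothetical table holding the URL plus a hash of it.
CREATE TABLE dbo.UrlLog
(
    UrlLogID INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    Url      NVARCHAR(2048)     NOT NULL,
    -- Non-persisted computed column; it can still be indexed because
    -- HASHBYTES is deterministic. SHA2_256 needs SQL Server 2012+.
    UrlHash  AS CAST(HASHBYTES('SHA2_256', Url) AS VARBINARY(32))
);
GO

-- Seek on the fixed-size 32-byte hash; carry the full Url along as an
-- INCLUDE column, which is exempt from the 900-byte index key limit.
CREATE NONCLUSTERED INDEX IX_UrlLog_UrlHash
    ON dbo.UrlLog (UrlHash)
    INCLUDE (Url);
```

The trigger's lookup can then seek on `UrlHash = HASHBYTES('SHA2_256', @IncomingUrl)` and only compare the full Url text for the handful of rows the seek returns.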
This UrlHash field should be a non-persisted computed column (and indexed as noted above). This ensures that the hash value is always in sync with the source string. The reason for "non"-persisted is that being PERSISTED is not required for the value to be in the index, and the value is only needed for the index, so there is no reason to take up the extra space. Please see the MSDN documentation on Indexes on Computed Columns for more details, if interested.

Context
StackExchange Database Administrators Q#115466, answer score: 6
Revisions (0)
No revisions yet.