HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

indexing on multiple nvarchar(max) columns

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
nvarcharcolumnsindexingmaxmultiple

Problem

I have multiple nvarchar(max) columns in my DB named ShipperName,ConsigneeName, ProdDesc,BillOFLading.

What I need to do is to have UNIQUE INDEX for the combination of these columns. As I am failed to do so because if 900 B limit.

Then I tried(after reading following link Indexing Wide Keys )

ALTER TABLE dbo.ProductDetail 
   ADD ShipperNameHash AS HASHBYTES('SHA2_512', ShipperName) PERSISTED;

ALTER TABLE dbo.ProductDetail 
   ADD ConsigneeNameHash AS HASHBYTES('SHA2_512', ConsigneeName) PERSISTED;


Then I got following error on following statement:


Msg 8152, Level 16, State 10, Line 1

String or binary data would be truncated.

The statement has been terminated.

ALTER TABLE dbo.ProductDetail 
   ADD ProductDescHash AS HASHBYTES('SHA2_512', Product_Description) PERSISTED;


But also above statements created column varbinary(8000). Which of course hasn't solved the problem for me.

Is there any way that I can built UNIQUE INDEX on the combination these columns ?

Solution

I know this is already answered, but I'm going to put in my vote for using HASHBYTES.
I talk a little about it in my post http://michaeljswart.com/2013/11/hashing-for-indexes/ and I describe what that looks like in this case below:

If you can get away with nvarchar(4000) instead of nvarchar(max), then, this schema becomes feasible:

CREATE TABLE ProductDetail
(
    ProductDetailId INT IDENTITY PRIMARY KEY,
    ShipperName NVARCHAR(4000),
    ShipperNameHash as HASHBYTES('SHA1', ShipperName) PERSISTED,
    ConsigneeName  NVARCHAR(4000),
    ConsigneeNameHash  as HASHBYTES('SHA1', ConsigneeName) PERSISTED,
    UNIQUE(ShipperNameHash, ConsigneeNameHash )
)
go


When creating it, SQL Server will warn you about the maximum key length for that unique index being over 900 bytes. But in practice, SHA1 never returns more than 20 bytes, so in this particular case, the warning is ignorable.

See that it works with this insert statement

insert ProductDetail (ShipperName, ConsigneeName)
values 
(
  REPLICATE(N'A', 3999) + N'B', 
  REPLICATE(N'A', 3999) + N'B'
),
(
  REPLICATE(N'A', 3999) + N'C', 
  REPLICATE(N'A', 3999) + N'C'
)


See that it fails with this insert statement:

insert ProductDetail (ShipperName, ConsigneeName)
values 
(
  REPLICATE(N'A', 4000),
  REPLICATE(N'A', 4000)
),
(
  REPLICATE(N'A', 4000),
  REPLICATE(N'A', 4000)
)

Code Snippets

CREATE TABLE ProductDetail
(
    ProductDetailId INT IDENTITY PRIMARY KEY,
    ShipperName NVARCHAR(4000),
    ShipperNameHash as HASHBYTES('SHA1', ShipperName) PERSISTED,
    ConsigneeName  NVARCHAR(4000),
    ConsigneeNameHash  as HASHBYTES('SHA1', ConsigneeName) PERSISTED,
    UNIQUE(ShipperNameHash, ConsigneeNameHash )
)
go
insert ProductDetail (ShipperName, ConsigneeName)
values 
(
  REPLICATE(N'A', 3999) + N'B', 
  REPLICATE(N'A', 3999) + N'B'
),
(
  REPLICATE(N'A', 3999) + N'C', 
  REPLICATE(N'A', 3999) + N'C'
)
insert ProductDetail (ShipperName, ConsigneeName)
values 
(
  REPLICATE(N'A', 4000),
  REPLICATE(N'A', 4000)
),
(
  REPLICATE(N'A', 4000),
  REPLICATE(N'A', 4000)
)

Context

StackExchange Database Administrators Q#59776, answer score: 7

Revisions (0)

No revisions yet.