HiveBrain v1.2.0
Get Started
← Back to all entries
principlesqlModerate

SQL Server nvarchar(max) vs nvarchar(n) affects performance

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
nvarcharsqlaffectsmaxperformanceserver

Problem

This is SQL Server 2008 R2 SP2. I have 2 tables. Both are identical (data and indexing), except the first table has a VALUE column as nvarchar(max) and the second has the same column as nvarchar(800). This column is included in a non-clustered index. I also created a clustered index on both tables. I have also rebuilt the indexes. The max string length in this column is 650.

If I run the same query against both the nvarchar(800) table is consistently faster, many times twice as fast. Sure seems like it is defeating the purpose of "varchar". Table contains 800,000+ rows. The query should be looking at around 110,000 rows (which is what the plan estimates).

According to the io stats there are no lob reads, so everything appears to be in row. Execution plans are the same, except there is a slight difference in the cost percentage between the two tables and the estimated row size is bigger with the nvarchar(max) (91 bytes vs 63 bytes). The number of reads are pretty much the same too.

Why the difference?

===== Schema ======

```
CREATE TABLE [dbo].table1 NOT NULL,
[ProductID] [bigint] NOT NULL,
[ProductSkeletonID] [bigint] NOT NULL,
[Value] nvarchar NOT NULL,
[IsKeywordSearchable] [bit] NULL,
[ValueInteger] [bigint] NULL,
[ValueDecimal] decimal NULL,
[ValueDate] [datetime] NULL,
[TypeOfData] nvarchar NOT NULL,
CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IX_table1_productskeletonid] ON [dbo].[table1]
(
[ProductSkeletonID] ASC
)
INCLUDE ( [ProductID],
[Value]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY =

Solution

You're seeing the cost overhead of using MAX types.

While NVARCHAR(MAX) is identical to NVARCHAR(n) in TSQL and can be stored in-row, it is handled separately by the storage engine because it can be pushed off-row. When off-row it is a LOB_DATA allocation unit, rather than ROW_OVERFLOW_DATA allocation unit and we can assume from your observations that this carries an overhead.

You can see the two types are internally stored differently with a little DBCC PAGE spelunking. Mark Rasmussen posted example page dumps that show the differences in What is the Size of the LOB Pointer for (MAX) Types Like Varchar, Varbinary, Etc?

We can probably assume it's the GROUP BY on the MAX column that causes the performance difference in your case. I've not tested other operations on a MAX type but it might be interesting to do so and see if similar results are seen.

Context

StackExchange Database Administrators Q#47910, answer score: 18

Revisions (0)

No revisions yet.