HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Sparse columns, cpu time & filtered indexes

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
columnstimeindexescpusparsefiltered

Problem

Sparsing

When doing some tests on sparse columns, as you do, there was a performance setback that I would like to know the direct cause of.

DDL

I created two identical tables, one with 4 sparse columns and one with no sparse columns.

--Non Sparse columns table & NC index
CREATE TABLE dbo.nonsparse( ID INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
                      charval char(20) NULL,
                      varcharval varchar(20) NULL,
                      intval int NULL,
                      bigintval bigint NULL
                      );
CREATE INDEX IX_Nonsparse_intval_varcharval
ON dbo.nonsparse(intval,varcharval)
INCLUDE(bigintval,charval);

-- sparse columns table & NC index

CREATE TABLE dbo.sparse( ID INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
                      charval char(20) SPARSE NULL ,
                      varcharval varchar(20) SPARSE NULL,
                      intval int SPARSE NULL,
                      bigintval bigint SPARSE NULL
                      );

CREATE INDEX IX_sparse_intval_varcharval
ON dbo.sparse(intval,varcharval)
INCLUDE(bigintval,charval);


DML

I then inserted about 2540 NON-NULL values into both.

INSERT INTO dbo.nonsparse WITH(TABLOCK) (charval, varcharval,intval,bigintval)
SELECT 'Val1','Val2',20,19
FROM MASTER..spt_values;

INSERT INTO dbo.sparse WITH(TABLOCK) (charval, varcharval,intval,bigintval)
SELECT 'Val1','Val2',20,19
FROM MASTER..spt_values;


Afterwards, I inserted 1M NULL values into both tables

INSERT INTO dbo.nonsparse WITH(TABLOCK)  (charval, varcharval,intval,bigintval)
SELECT TOP(1000000) NULL,NULL,NULL,NULL 
FROM MASTER..spt_values spt1
CROSS APPLY MASTER..spt_values spt2;

INSERT INTO dbo.sparse WITH(TABLOCK) (charval, varcharval,intval,bigintval)
SELECT TOP(1000000) NULL,NULL,NULL,NULL 
FROM MASTER..spt_values spt1
CROSS APPLY MASTER..spt_values spt2;


Queries

Nonsparse table execution

When running this query twice on the newly created nonsparse table:

```
SET STATISTICS IO

Solution

Or is it simply the behaviour as noted in the documentation?

Seems so. The "overhead" mentioned in the documentation appears to be CPU overhead.

Profiling the two queries, the sparse query sampled 367 ms of CPU, while the non-sparse had 284 ms of CPU. That's a difference of 83 ms.

Where is most of that?

Both profiles look very similar until they get to sqlmin!IndexDataSetSession::GetNextRowValuesInternal. At that point, the sparse code goes down a path that runs sqlmin!IndexDataSetSession::GetDataLong, which calls some functions that look like they relate to the sparse column feature (HasSparseVector, StoreColumnValue), and add up to (42 + 11 =) 53 ms.


Why are these the same size? Was the sparse-ness lost?

Yeah, it appears that the sparse storage optimization does not carry over to nonclustered indexes when the sparse column is used as an index key. So nonclustered index key columns take up their full size regardless of sparseness, but included columns take up zero space if they are sparse and NULL.

Looking at DBCC PAGE output from a clustered index page with NULL-valued sparse columns, I can see that the record length is 11 (4 for the ID + 7 for the standard per-record overhead):

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 11


For the filtered index, the record is always 40, which is the sum of the size of all the key columns (4 byte ID + 20 byte charval + 4 byte varcharval + 4 byte intval + 8 byte big intval = 40 bytes).

For some reason, DBCC PAGE does not include the 7 byte overhead in "Record Size" for index records:

Record Type = INDEX_RECORD          Record Attributes =  NULL_BITMAP    Record Size = 40


The non-filtered index size is smaller (4 byte ID + 4 byte intval + 4 byte varcharval = 12 bytes) because two of the sparse columns are included columns, which again gets the sparseness optimization:

Record Type = INDEX_RECORD          Record Attributes =  NULL_BITMAP    Record Size = 12


I guess this difference in behavior lines up with one of the limitations listed in the docs page:


A sparse column cannot be part of a clustered index or a unique primary key index

They are allowed to be keys in nonclustered indexes, but they are not stored, uh, sparsely.

Code Snippets

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 11
Record Type = INDEX_RECORD          Record Attributes =  NULL_BITMAP    Record Size = 40
Record Type = INDEX_RECORD          Record Attributes =  NULL_BITMAP    Record Size = 12

Context

StackExchange Database Administrators Q#249182, answer score: 7

Revisions (0)

No revisions yet.