HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Optimizing numeric range (interval) searches in SQL Server

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
numericsqlrangeintervaloptimizingserversearches

Problem

This question is similar to Optimizing IP Range Search? but that one is restricted to SQL Server 2000.

Suppose I have 10 million ranges provisionally stored in a table structured and populated as below.

CREATE TABLE MyTable
(
Id        INT IDENTITY PRIMARY KEY,
RangeFrom INT NOT NULL,
RangeTo   INT NOT NULL,
CHECK (RangeTo > RangeFrom),
INDEX IX1 (RangeFrom,RangeTo),
INDEX IX2 (RangeTo,RangeFrom)
);

WITH RandomNumbers
     AS (SELECT TOP 10000000 ABS(CRYPT_GEN_RANDOM(4)%100000000) AS Num
         FROM   sys.all_objects o1,
                sys.all_objects o2,
                sys.all_objects o3,
                sys.all_objects o4)
INSERT INTO MyTable
            (RangeFrom,
             RangeTo)
SELECT Num,
       Num + 1 + CRYPT_GEN_RANDOM(1)
FROM   RandomNumbers


I need to know all ranges containing the value 50,000,000. I try the following query

SELECT *
FROM MyTable
WHERE 50000000 BETWEEN RangeFrom AND RangeTo


SQL Server shows that there were 10,951 logical reads and nearly 5 million rows were read to return the 12 matching ones.

Can I improve on this performance? Any restructuring of the table or additional indexes is fine.

Solution

Columnstore is very heplful here compared to a nonclustered index which scans half the table. A nonclustered columnstore index provides most of the benefit but inserting ordered data into a clustered columnstore index is even better.

DROP TABLE IF EXISTS dbo.MyTableCCI;

CREATE TABLE dbo.MyTableCCI
(
Id        INT PRIMARY KEY,
RangeFrom INT NOT NULL,
RangeTo   INT NOT NULL,
CHECK (RangeTo > RangeFrom),
INDEX CCI CLUSTERED COLUMNSTORE
);

INSERT INTO dbo.MyTableCCI
SELECT TOP (987654321) *
FROM dbo.MyTable
ORDER BY RangeFrom ASC
OPTION (MAXDOP 1);


By design I can get rowgroup elimination on the RangeFrom column which will eliminate half of my rowgroups. But due to the nature of the data I also get rowgroup elimination on the RangeTo column as well:

Table 'MyTableCCI'. Segment reads 1, segment skipped 9.


For larger tables with more variable data there are different ways to load the data to guarantee the best possible rowgroup elimination on both columns. For your data in particular, the query takes 1 ms.

Code Snippets

DROP TABLE IF EXISTS dbo.MyTableCCI;

CREATE TABLE dbo.MyTableCCI
(
Id        INT PRIMARY KEY,
RangeFrom INT NOT NULL,
RangeTo   INT NOT NULL,
CHECK (RangeTo > RangeFrom),
INDEX CCI CLUSTERED COLUMNSTORE
);

INSERT INTO dbo.MyTableCCI
SELECT TOP (987654321) *
FROM dbo.MyTable
ORDER BY RangeFrom ASC
OPTION (MAXDOP 1);
Table 'MyTableCCI'. Segment reads 1, segment skipped 9.

Context

StackExchange Database Administrators Q#225953, answer score: 12

Revisions (0)

No revisions yet.