patternsqlModerate
Optimizing numeric range (interval) searches in SQL Server
Viewed 0 times
numericsqlrangeintervaloptimizingserversearches
Problem
This question is similar to Optimizing IP Range Search? but that one is restricted to SQL Server 2000.
Suppose I have 10 million ranges provisionally stored in a table structured and populated as below.
I need to know all ranges containing the value
SQL Server shows that there were 10,951 logical reads and nearly 5 million rows were read to return the 12 matching ones.
Can I improve on this performance? Any restructuring of the table or additional indexes is fine.
Suppose I have 10 million ranges provisionally stored in a table structured and populated as below.
CREATE TABLE MyTable
(
Id INT IDENTITY PRIMARY KEY,
RangeFrom INT NOT NULL,
RangeTo INT NOT NULL,
CHECK (RangeTo > RangeFrom),
INDEX IX1 (RangeFrom,RangeTo),
INDEX IX2 (RangeTo,RangeFrom)
);
WITH RandomNumbers
AS (SELECT TOP 10000000 ABS(CRYPT_GEN_RANDOM(4)%100000000) AS Num
FROM sys.all_objects o1,
sys.all_objects o2,
sys.all_objects o3,
sys.all_objects o4)
INSERT INTO MyTable
(RangeFrom,
RangeTo)
SELECT Num,
Num + 1 + CRYPT_GEN_RANDOM(1)
FROM RandomNumbersI need to know all ranges containing the value
50,000,000. I try the following query SELECT *
FROM MyTable
WHERE 50000000 BETWEEN RangeFrom AND RangeToSQL Server shows that there were 10,951 logical reads and nearly 5 million rows were read to return the 12 matching ones.
Can I improve on this performance? Any restructuring of the table or additional indexes is fine.
Solution
Columnstore is very heplful here compared to a nonclustered index which scans half the table. A nonclustered columnstore index provides most of the benefit but inserting ordered data into a clustered columnstore index is even better.
By design I can get rowgroup elimination on the
For larger tables with more variable data there are different ways to load the data to guarantee the best possible rowgroup elimination on both columns. For your data in particular, the query takes 1 ms.
DROP TABLE IF EXISTS dbo.MyTableCCI;
CREATE TABLE dbo.MyTableCCI
(
Id INT PRIMARY KEY,
RangeFrom INT NOT NULL,
RangeTo INT NOT NULL,
CHECK (RangeTo > RangeFrom),
INDEX CCI CLUSTERED COLUMNSTORE
);
INSERT INTO dbo.MyTableCCI
SELECT TOP (987654321) *
FROM dbo.MyTable
ORDER BY RangeFrom ASC
OPTION (MAXDOP 1);By design I can get rowgroup elimination on the
RangeFrom column which will eliminate half of my rowgroups. But due to the nature of the data I also get rowgroup elimination on the RangeTo column as well:Table 'MyTableCCI'. Segment reads 1, segment skipped 9.For larger tables with more variable data there are different ways to load the data to guarantee the best possible rowgroup elimination on both columns. For your data in particular, the query takes 1 ms.
Code Snippets
DROP TABLE IF EXISTS dbo.MyTableCCI;
CREATE TABLE dbo.MyTableCCI
(
Id INT PRIMARY KEY,
RangeFrom INT NOT NULL,
RangeTo INT NOT NULL,
CHECK (RangeTo > RangeFrom),
INDEX CCI CLUSTERED COLUMNSTORE
);
INSERT INTO dbo.MyTableCCI
SELECT TOP (987654321) *
FROM dbo.MyTable
ORDER BY RangeFrom ASC
OPTION (MAXDOP 1);Table 'MyTableCCI'. Segment reads 1, segment skipped 9.Context
StackExchange Database Administrators Q#225953, answer score: 12
Revisions (0)
No revisions yet.