No partition elimination for partition column values in a lookup table?
Problem
I created a partitioned table (as shown below) and seeded it with 480 million rows, about 181 rows per account.
I'm running baseline queries before adding indexes. I was surprised to see that date lookups on the partition column didn't result in partition elimination, even after adding option(recompile). Is that how it is with partitioned tables? That seems to me more like real life than hard-coding the predicate's partition column values.
- Actual execution plan of query where partition elimination appears to occur.
- Actual execution plan of query where partitions aren't eliminated.
Eventually, I will add index(es) and post back here if I have questions about that. I don't want to proceed until I'm comfortable with the answers given in this post.
```
--step 2 (after creating db)
ALTER DATABASE partitionresearch
ADD FILEGROUP January
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP February
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP March
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP April
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP May
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP June
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP July
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP August
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP September
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP October
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP November
GO
ALTER DATABASE partitionresearch
ADD FILEGROUP December
GO
--step 3
-- Table Partitioning in SQL Server
ALTER DATABASE [Partitionresearch]
ADD FILE
(
NAME = [PartJan],
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartJan.ndf',
SIZE = 5080 KB,
MAXSIZE = UNLIMITED,
FILEGROWTH = 2040 KB
) TO FILEGROUP
```
Solution
Partition elimination driven by values that come from a joined lookup table isn't available in the product for rowstore partitioned heaps. If you change the table to have a partitioned clustered columnstore index, then you will sometimes be able to eliminate partitions via rowgroup elimination by a bitmap filter, which seems to be what you're after.
I blogged about this here. Quoting a small section:
We know that based on the data in the dimension table that SQL Server
only needs to read two partitions from the fact table. Could the query
optimizer in theory do better than it did? Consider the fact that a
partitioned table has at most 15000 partitions. All of the partition
values cannot overlap and they don’t change without a DDL operation.
When building the hash table the query optimizer could keep track of
which partitions have at least one row in them. By the end of the hash
build we’ll know exactly which partitions could contain data, so the
rest of the partitions could be skipped during the probe phase.
Perhaps this isn’t implemented because it’s important for the hash
build to be independent of the probe. Maybe there’s no guarantee
available at the right time that the bitmap operator will be pushed
all the way down to the scan as opposed to a repartition streams
operator. Perhaps this isn’t a common case and the optimization isn’t
worth the effort. After all, how often do you join on the partitioning
column instead of filtering by it?
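To make the idea in the quoted passage concrete, here is a minimal Python sketch of a hash join whose build phase records which partitions the build-side keys fall into, so that the probe phase can skip every other partition. This is only an illustration of the hypothetical optimization described above, not how SQL Server behaves. The function names, the row shapes, and the RANGE RIGHT-style boundary mapping are all assumptions made for the example.

```python
from bisect import bisect_right
from collections import defaultdict


def partition_number(boundaries, value):
    """Map a value to a 1-based partition number.

    Assumes RANGE RIGHT-style semantics: a boundary value belongs to the
    partition on its right. `boundaries` must be sorted ascending.
    """
    return bisect_right(boundaries, value) + 1


def hash_join_with_partition_skipping(dim_rows, fact_partitions, boundaries):
    """Join dim_rows to fact rows on the partitioning key.

    dim_rows:        iterable of (key, dim_payload) from the lookup table
    fact_partitions: dict of partition number -> list of (key, fact_payload)
    Returns (joined_rows, scanned_partition_numbers).
    """
    # Build phase: fill the hash table and, as a side effect, remember
    # which partitions could possibly contain a matching fact row.
    hash_table = defaultdict(list)
    needed = set()
    for key, dim_payload in dim_rows:
        hash_table[key].append(dim_payload)
        needed.add(partition_number(boundaries, key))

    # Probe phase: only scan partitions seen during the build.
    joined, scanned = [], []
    for pnum in sorted(fact_partitions):
        if pnum not in needed:
            continue  # partition skipped without reading any rows
        scanned.append(pnum)
        for key, fact_payload in fact_partitions[pnum]:
            for dim_payload in hash_table.get(key, ()):
                joined.append((key, dim_payload, fact_payload))
    return joined, scanned
```

With twelve monthly partitions and a lookup table holding one date in March and one in July, `scanned` comes back as `[3, 7]`: the probe touches two partitions instead of twelve. That is the saving the passage speculates about.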
Context
StackExchange Database Administrators Q#242582, answer score: 7