patternsqlModerate
Partitioning a large table has not improved performance, why?
Viewed 0 times
partitioningwhyimprovedhaslargeperformancenottable
Problem
In SQL Server 2014 I have partitioned one of my large tables weekly and defined a sliding window scenario to switch the oldest week's data to the archive DB and create a new partition for the next week.
This is the result:
This is for an AVL System (Vehicle Tracking). I have partitioned on PositionDate (datetime).
All our queries have PositionDate in the WHERE clause and in many cases we have VehicleId in the WHERE clause too. So I created two aligned indexes on VehicleId (int):
But in every query that contains VehicleId in its WHERE clause neither of these two non-clustered indexes is used (according to the query plan).
I have a performance problem now.
I compared the query plans between partitioned and non-partitioned table for queries like below:
and awesomely I see that the first query costs 30% but the second, 70%.
I have one file group with two files for the partitioned table.
My questions:
-
Is the number of rows in each partition more than the optimal number of rows for partitioning? If I partition by day and hold last 60 days' data live, will that help me improve performance?
-
Are my non-clustered indexes well defined or I should remove them? We have PositionDate in the WHERE clause of all queries and VehicleId in many of them.
-
Am I misusing partitioning for this scenario? If I define good indexes on my non-partitioned table and move oldest data (more than 2 month old) to the archive table, will that work well for my case?
DDL for my indexes:
```
ALTER TABLE [dbo].[MyTable] ADD CONSTRAINT [PK_Primary] PRIMARY KEY CLUSTERED
(
[PositionDate] ASC,
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS =
This is the result:
This is for an AVL System (Vehicle Tracking). I have partitioned on PositionDate (datetime).
All our queries have PositionDate in the WHERE clause and in many cases we have VehicleId in the WHERE clause too. So I created two aligned indexes on VehicleId (int):
- one on (PositionDate,VehicleId);
- one on just (VehicleId).
But in every query that contains VehicleId in its WHERE clause neither of these two non-clustered indexes is used (according to the query plan).
I have a performance problem now.
I compared the query plans between partitioned and non-partitioned table for queries like below:
Select * from MyNonPart_Table Where PositionDate between '2016-05-01' AND '2016-06-01'
Select * from PartitinedTable Where PositionDate between '2016-05-01' AND '2016-06-01'and awesomely I see that the first query costs 30% but the second, 70%.
I have one file group with two files for the partitioned table.
My questions:
-
Is the number of rows in each partition more than the optimal number of rows for partitioning? If I partition by day and hold last 60 days' data live, will that help me improve performance?
-
Are my non-clustered indexes well defined or I should remove them? We have PositionDate in the WHERE clause of all queries and VehicleId in many of them.
-
Am I misusing partitioning for this scenario? If I define good indexes on my non-partitioned table and move oldest data (more than 2 month old) to the archive table, will that work well for my case?
DDL for my indexes:
```
ALTER TABLE [dbo].[MyTable] ADD CONSTRAINT [PK_Primary] PRIMARY KEY CLUSTERED
(
[PositionDate] ASC,
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS =
Solution
You generally followed a delusion - that partitioning will give you a significant performance boost for queries compared to a standard index.
This is not the case. There is little difference between filtering with an index and with a partition.
Partitions are not there to make your queries faster, but to allow fast DELETE - by swapping out a partition with an empty version of the table. This allows "truncate" style performance for deleting parts of a table - and that is significant. Brutally significant, if you ever experience the long time it may take to delete dozens of gigabytes of data.
There are also insert scenarios where a partition can help.
But for queries - no, the partition will not be better than proper indices. In fact, it will be slower - as there the work is more complex (management of which partitions to access).
This is not the case. There is little difference between filtering with an index and with a partition.
Partitions are not there to make your queries faster, but to allow fast DELETE - by swapping out a partition with an empty version of the table. This allows "truncate" style performance for deleting parts of a table - and that is significant. Brutally significant, if you ever experience the long time it may take to delete dozens of gigabytes of data.
There are also insert scenarios where a partition can help.
But for queries - no, the partition will not be better than proper indices. In fact, it will be slower - as there the work is more complex (management of which partitions to access).
Context
StackExchange Database Administrators Q#141175, answer score: 19
Revisions (0)
No revisions yet.