patternsqlMinor
Expanding a partitioned table
Viewed 0 times
tablepartitionedexpanding
Problem
I have a table that holds four years of data, where for each month I have a partition that is based on a filegroup with two files. Unexpectedly, I received eight more years of data and prepared the database files and file groups for these years.
The current scheme in place covers 200901 - 201512 and I now need to add 200001 - 200812. What is the best way to bring this online for the partitioned table?
I'm asking because the current table has 18 billion rows and I want to do this the efficient way :)
The current scheme in place covers 200901 - 201512 and I now need to add 200001 - 200812. What is the best way to bring this online for the partitioned table?
- Create a new partition scheme and partition function and change the clustered index to this new partition scheme; or
- Can I somehow alter the existing partition scheme and partition function?
I'm asking because the current table has 18 billion rows and I want to do this the efficient way :)
Solution
Here are two best practices for partitioning that pertain to the question:
of the partition range to ensure that the partitions split when
loading in new data, and merge, after unloading old data, do not
cause data movement.
because this can cause severe locking and explosive log growth.
http://www.informit.com/articles/article.aspx?p=1946159&seqNum=5
If the leftmost end of your partition is empty, use ALTER PARTITION FUNCTION SPLIT RANGE to add new ranges to the partition function.
To check if the leftmost partition is empty, use a query like the following:
If the first partition is not empty, the best practices recommend that you create a new function with all values, create a new table on that function, then insert the data to the new table.
Also, if the left partition just has a few records, a split may be fine. Not sure on that as I've never tried it.
Whatever you do, make sure to leave some empty partitions at the leftmost and rightmost partition when you're finished. I might even go so far as to creating partition ranges for 0 and 1, then add a check constraint to prevent the first partition from getting data in it. Do the same thing for the end.
- Keep an empty staging partition at the leftmost and rightmost ends
of the partition range to ensure that the partitions split when
loading in new data, and merge, after unloading old data, do not
cause data movement.
- Do not split or merge a partition already populated with data
because this can cause severe locking and explosive log growth.
http://www.informit.com/articles/article.aspx?p=1946159&seqNum=5
If the leftmost end of your partition is empty, use ALTER PARTITION FUNCTION SPLIT RANGE to add new ranges to the partition function.
To check if the leftmost partition is empty, use a query like the following:
DECLARE @PartitionFunctionName sysname = 'YourPartitionFunctionNameHere';
SELECT
p.partition_number, SUM(pst.row_count) RowCountInPartition, pf.name PartitionFunction, ps.name PartitionScheme
FROM sys.dm_db_partition_stats pst
INNER JOIN sys.partitions p ON pst.partition_id = p.partition_id
INNER JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id
INNER JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id
INNER JOIN sys.partition_functions pf ON ps.function_id = pf.function_id
WHERE pf.name = @PartitionFunctionName
GROUP BY p.partition_number, pf.name, ps.name;If the first partition is not empty, the best practices recommend that you create a new function with all values, create a new table on that function, then insert the data to the new table.
Also, if the left partition just has a few records, a split may be fine. Not sure on that as I've never tried it.
Whatever you do, make sure to leave some empty partitions at the leftmost and rightmost partition when you're finished. I might even go so far as to creating partition ranges for 0 and 1, then add a check constraint to prevent the first partition from getting data in it. Do the same thing for the end.
Code Snippets
DECLARE @PartitionFunctionName sysname = 'YourPartitionFunctionNameHere';
SELECT
p.partition_number, SUM(pst.row_count) RowCountInPartition, pf.name PartitionFunction, ps.name PartitionScheme
FROM sys.dm_db_partition_stats pst
INNER JOIN sys.partitions p ON pst.partition_id = p.partition_id
INNER JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id
INNER JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id
INNER JOIN sys.partition_functions pf ON ps.function_id = pf.function_id
WHERE pf.name = @PartitionFunctionName
GROUP BY p.partition_number, pf.name, ps.name;Context
StackExchange Database Administrators Q#30398, answer score: 5
Revisions (0)
No revisions yet.