HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Changing the partitioning column in SQL Server

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thecolumnsqlserverchangingpartitioning

Problem

I created a partitioned table with 100+ million rows and accidentally specified the wrong datetime column as the partitioning key when I created the table.

Is there an easy way to change the partitioning column from one datetime column to another one. In my case, we have a column InsertedDate and a CompleteDate and I accidentally used the first one instead of the second one.

Solution

Changing the partitioning column on a table with hundreds of millions of rows is likely to be a bit painful, since all rows in the table, and perhaps any associated indexes, will need to be physically moved by the operation. You need to make sure you have enough storage space available to rebuild each partition - perhaps you'll want to move the table to a set of new filegroups, then drop the old unused filegroups afterwards.

If the table has a primary key or any unique indexes defined, they must be dropped and recreated so they include the new partitioning column in the index key.

If the table has a primary key, and that primary key is referenced in other tables, you'll need to break the referential integrity in order to change the partitioning key. Since you are currently using the InsertedDate column as the partitioning key, and want to move that to use the CompletedDate column, which are presumably both date data-types, I doubt this affects your situation - it's just worth being aware of.

For a table with a primary key, the actual task of moving a table from one partitioning key to another consists of dropping the primary key constraint, then rebuilding the table with the new partitioning key as part of the primary key.

I've put together a little test-bed with 10,000,000 rows spread over 24 partitions so we can see this in action.

We'll do the work in tempdb, so make sure you have enough room for some growth. First, we'll create the partition function, partition scheme, and finally the table itself:

USE tempdb;
GO
IF OBJECT_ID(N'dbo.Tab', N'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.Tab;
END
IF EXISTS (SELECT 1 FROM sys.partition_schemes ps WHERE ps.name = N'PartScheme')
BEGIN
    DROP PARTITION SCHEME PartScheme;
END
IF EXISTS (SELECT 1 FROM sys.partition_functions pf WHERE pf.name = N'PartFun')
BEGIN
    DROP PARTITION FUNCTION PartFun;
END

CREATE PARTITION FUNCTION PartFun (datetime)
AS RANGE LEFT
FOR VALUES (
      N'2012-01-01T00:00:00'
    , N'2012-04-01T00:00:00'
    , N'2012-07-01T00:00:00'
    , N'2012-10-01T00:00:00'
    , N'2013-01-01T00:00:00'
    , N'2013-04-01T00:00:00'
    , N'2013-07-01T00:00:00'
    , N'2013-10-01T00:00:00'
    , N'2014-01-01T00:00:00'
    , N'2014-04-01T00:00:00'
    , N'2014-07-01T00:00:00'
    , N'2014-10-01T00:00:00'
    , N'2015-01-01T00:00:00'
    , N'2015-04-01T00:00:00'
    , N'2015-07-01T00:00:00'
    , N'2015-10-01T00:00:00'
    , N'2016-01-01T00:00:00'
    , N'2016-04-01T00:00:00'
    , N'2016-07-01T00:00:00'
    , N'2016-10-01T00:00:00'
    , N'2017-01-01T00:00:00'
    , N'2017-04-01T00:00:00'
    , N'2017-07-01T00:00:00'
    , N'2017-10-01T00:00:00'
);

CREATE PARTITION SCHEME PartScheme
AS PARTITION PartFun
ALL TO ([PRIMARY]);

IF OBJECT_ID(N'dbo.Tab', N'U') IS NOT NULL
DROP TABLE dbo.Tab;
CREATE TABLE dbo.Tab
(
    TabID int NOT NULL
    , CreateDate datetime NOT NULL
    , UpdateDate datetime NOT NULL
    , Data1 varchar(100) NOT NULL
    , Data2 varchar(100) NOT NULL
    , Data3 varchar(100) NOT NULL
    , CONSTRAINT PK_Tab
        PRIMARY KEY CLUSTERED
        (CreateDate, TabID)
) ON [PartScheme](CreateDate);


This fills the table with 10,000,000 rows somewhat evenly spread over each partition:

;WITH Ten AS
(
    SELECT v.Num
    FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9))v(Num)
)
, TenMillion AS 
(
    SELECT Num = (t7.Num * POWER(10, 6))
        + (t6.Num * POWER(10, 5)) 
        + (t5.Num * POWER(10, 4)) 
        + (t4.Num * POWER(10, 3)) 
        + (t3.Num * POWER(10, 2)) 
        + (t2.Num * POWER(10, 1)) 
        + (t1.Num) 
    FROM Ten t1
        CROSS JOIN Ten t2
        CROSS JOIN Ten t3
        CROSS JOIN Ten t4
        CROSS JOIN Ten t5
        CROSS JOIN Ten t6
        CROSS JOIN Ten t7
)
INSERT INTO dbo.Tab (TabID, CreateDate, UpdateDate, Data1, Data2, Data3)
SELECT m.Num, DATEADD(DAY, m.Num % 1825, N'2012-01-01T00:00:00')
    , DATEADD(DAY, m.Num % 1825, N'2012-02-02T00:00:00')
    , CONVERT(varchar(100), CRYPT_GEN_RANDOM(100))
    , CONVERT(varchar(100), REPLICATE('A', 100))
    , CONVERT(varchar(100), REPLICATE('B', 100))
FROM TenMillion m;


The distribution of rows in the table:

SELECT ObjectName = s.name + '.' + o.name
    , p.partition_number
    , p.rows
FROM sys.partitions p
    INNER JOIN sys.objects o ON o.object_id = p.object_id
    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE s.name = 'dbo'
    AND o.name = 'Tab'
    AND p.index_id = 1
ORDER BY p.partition_number;


╔════════════╦══════════════════╦════════╗
║ ObjectName ║ partition_number ║ rows ║
╠════════════╬══════════════════╬════════╣
║ dbo.Tab ║ 1 ║ 5480 ║
║ dbo.Tab ║ 2 ║ 498680 ║
║ dbo.Tab ║ 3 ║ 498680 ║
║ dbo.Tab ║ 4 ║ 504160 ║
║ dbo.Tab ║ 5 ║ 504160 ║
║ dbo.Tab ║ 6 ║ 493200 ║
║ dbo.Tab ║ 7 ║ 498680 ║
║ dbo.Tab ║ 8 ║ 504160 ║
║ dbo.Tab ║ 9

Code Snippets

USE tempdb;
GO
IF OBJECT_ID(N'dbo.Tab', N'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.Tab;
END
IF EXISTS (SELECT 1 FROM sys.partition_schemes ps WHERE ps.name = N'PartScheme')
BEGIN
    DROP PARTITION SCHEME PartScheme;
END
IF EXISTS (SELECT 1 FROM sys.partition_functions pf WHERE pf.name = N'PartFun')
BEGIN
    DROP PARTITION FUNCTION PartFun;
END

CREATE PARTITION FUNCTION PartFun (datetime)
AS RANGE LEFT
FOR VALUES (
      N'2012-01-01T00:00:00'
    , N'2012-04-01T00:00:00'
    , N'2012-07-01T00:00:00'
    , N'2012-10-01T00:00:00'
    , N'2013-01-01T00:00:00'
    , N'2013-04-01T00:00:00'
    , N'2013-07-01T00:00:00'
    , N'2013-10-01T00:00:00'
    , N'2014-01-01T00:00:00'
    , N'2014-04-01T00:00:00'
    , N'2014-07-01T00:00:00'
    , N'2014-10-01T00:00:00'
    , N'2015-01-01T00:00:00'
    , N'2015-04-01T00:00:00'
    , N'2015-07-01T00:00:00'
    , N'2015-10-01T00:00:00'
    , N'2016-01-01T00:00:00'
    , N'2016-04-01T00:00:00'
    , N'2016-07-01T00:00:00'
    , N'2016-10-01T00:00:00'
    , N'2017-01-01T00:00:00'
    , N'2017-04-01T00:00:00'
    , N'2017-07-01T00:00:00'
    , N'2017-10-01T00:00:00'
);

CREATE PARTITION SCHEME PartScheme
AS PARTITION PartFun
ALL TO ([PRIMARY]);

IF OBJECT_ID(N'dbo.Tab', N'U') IS NOT NULL
DROP TABLE dbo.Tab;
CREATE TABLE dbo.Tab
(
    TabID int NOT NULL
    , CreateDate datetime NOT NULL
    , UpdateDate datetime NOT NULL
    , Data1 varchar(100) NOT NULL
    , Data2 varchar(100) NOT NULL
    , Data3 varchar(100) NOT NULL
    , CONSTRAINT PK_Tab
        PRIMARY KEY CLUSTERED
        (CreateDate, TabID)
) ON [PartScheme](CreateDate);
;WITH Ten AS
(
    SELECT v.Num
    FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9))v(Num)
)
, TenMillion AS 
(
    SELECT Num = (t7.Num * POWER(10, 6))
        + (t6.Num * POWER(10, 5)) 
        + (t5.Num * POWER(10, 4)) 
        + (t4.Num * POWER(10, 3)) 
        + (t3.Num * POWER(10, 2)) 
        + (t2.Num * POWER(10, 1)) 
        + (t1.Num) 
    FROM Ten t1
        CROSS JOIN Ten t2
        CROSS JOIN Ten t3
        CROSS JOIN Ten t4
        CROSS JOIN Ten t5
        CROSS JOIN Ten t6
        CROSS JOIN Ten t7
)
INSERT INTO dbo.Tab (TabID, CreateDate, UpdateDate, Data1, Data2, Data3)
SELECT m.Num, DATEADD(DAY, m.Num % 1825, N'2012-01-01T00:00:00')
    , DATEADD(DAY, m.Num % 1825, N'2012-02-02T00:00:00')
    , CONVERT(varchar(100), CRYPT_GEN_RANDOM(100))
    , CONVERT(varchar(100), REPLICATE('A', 100))
    , CONVERT(varchar(100), REPLICATE('B', 100))
FROM TenMillion m;
SELECT ObjectName = s.name + '.' + o.name
    , p.partition_number
    , p.rows
FROM sys.partitions p
    INNER JOIN sys.objects o ON o.object_id = p.object_id
    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE s.name = 'dbo'
    AND o.name = 'Tab'
    AND p.index_id = 1
ORDER BY p.partition_number;
BEGIN TRANSACTION;

BEGIN TRY
    ALTER TABLE dbo.Tab
    DROP CONSTRAINT PK_Tab
    WITH (MOVE TO PartScheme(UpdateDate));

    ALTER TABLE dbo.Tab
    ADD CONSTRAINT PK_Tab
    PRIMARY KEY CLUSTERED
    (UpdateDate, TabID)
    ON PartScheme(UpdateDate);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION
END CATCH
SELECT ObjectName = s.name + '.' + o.name
    , p.partition_number
    , p.rows
FROM sys.partitions p
    INNER JOIN sys.objects o ON o.object_id = p.object_id
    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE s.name = 'dbo'
    AND o.name = 'Tab'
    AND p.index_id = 1
ORDER BY p.partition_number;

Context

StackExchange Database Administrators Q#184483, answer score: 5

Revisions (0)

No revisions yet.