HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Change Column Data Type from numeric(18, 2) to numeric(22, 6) on a BIG Table

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
columnnumerictypebigfromdatachangetable

Problem

We have a very big POSTGRES table containing more than 8 BILLION rows and growing at a very high rate (30 million rows per day).

Our Table is Partitioned on date. Each partition contains 6 months data except the first partition which contains data for almost 18 months.

Postgres version: PostgreSQL 12.11

Database specs: db.r6g.16xlarge (vCPU:64, memory:512,
EBSBandwidth(Mbps):19000)

Table schema:

create table table_1
(
    id       bigint default nextval('table_1_id_seq'::regclass) not null,
    user_id  integer                                            not null,
    amount   numeric(18, 2)                                     not null,
    date     timestamp                                          not null,
    column_1 varchar                                            not null,
    column_2 varchar,
    .
    .
    primary key (id, date)
) partition by RANGE (date);


This table also contains multiple indexes (including partial indexes)

create index index_1
    on table_1 (column_1);

create index index_2
    on table_1 (date, column_2, column_3);

create index index_3
    on table_1 (column_4);

create index index_4
    on table_1 (user_id, column_3);

create index index_5
    on table_1 (user_id asc, date desc)
    where (column_3 = ANY (ARRAY [1, 2, 3, 6, 7, 8, 10]));

create index index_6
    on table_1 (user_id, column_2, column_3, column_5);


Initially we created amount column with data type numeric(18, 2). Now we need to support higher precision upto 6. So we need to change our type to numeric(22, 6).

Now running an Alter command like below one is taking a lot of time (hours):

NOTE: we ran this command on the table without dropping any indexes

ALTER TABLE table_1 ALTER COLUMN amount type numeric(22, 6);


Few of the different strategies that we are exploring:

-
Drop all the indexes, triggers and foreign keys while the update runs and recreate them at the end.

-
Create new table:

a. Create a new partitioned table with e

Solution

What I would do is the following:

  • add the new column with the desired data type



  • change application logic so that it writes into both the old and new columns - this way new rows will have the new column filled with data



  • copy values from the old column to the new one. Do this gradually, so that the system can still operate normally (probably somewhat slower than usual). There is a good chance you want to do a VACUUM ANALYZE after a certain number of rows were updated.



  • once all rows are updated, you can do a pair of ALTER TABLE for renaming both columns in the same transaction (the new one to amount, the old one to something else). This will normally be a very fast operation. Check beforehand that the application handles the data type change nicely on all read paths.



  • eventually drop the old column, after removing writing to it on the application side



In my experience this is the way you have to plan with the shortest downtime possible. The total time for the change will be (much) more than with a simple data type change. On the other hand, you can revert to the old column until the very end, in case some problems appear. Obviously, test the process in a test environment beforehand.

Note 1: as mentioned in comments, there is a chance you cannot swap in the new column and change the application code at the same time. In that case, you'll need three ALTER TABLE statements inside the transaction:

amount -> amount_old
amount_new -> amount
amount_old -> amount_new (looks confusing first, but the application will be happy)


Then change the application code to use the higher precision for amount, and the original one for amount_new.

Note 2: If you happen to use an older Postgres version (up to 10), add the new column without a default, otherwise the whole table will be rewritten. In later versions, this happens only with volatile default values - check if your use case matches any of these two cases.

Code Snippets

amount -> amount_old
amount_new -> amount
amount_old -> amount_new (looks confusing first, but the application will be happy)

Context

StackExchange Database Administrators Q#317239, answer score: 8

Revisions (0)

No revisions yet.