patternsqlMajor
Best practices for schema changes and data migrations to a live database without downtime?
Viewed 0 times
withoutdowntimedatabasemigrationslivepracticesforchangesanddata
Problem
How do you make schema changes to a live database without downtime?
For instance, lets say I have a PostgreSQL database with a table including various user data like email addresses etc, all associated with specific users. If I wanted to move the email addresses to a new dedicated table, I'd have to change the schema and then migrate the email data across to the new table. How could this be done without stopping writes to the original table? Surely while data is written over from the old table to the new one, new data would continue to be written to the old table and be missed, right?
I guess this problem comes up pretty frequently but I can't find any standard solution for dealing with it.
This article deals with the problem but I didn't really understand step 3. He says to write to both tables, then migrate old data from the first table to the new one. How do you make sure you're only migrating old data?
(I use PostgreSQL on Heroku.)
For instance, lets say I have a PostgreSQL database with a table including various user data like email addresses etc, all associated with specific users. If I wanted to move the email addresses to a new dedicated table, I'd have to change the schema and then migrate the email data across to the new table. How could this be done without stopping writes to the original table? Surely while data is written over from the old table to the new one, new data would continue to be written to the old table and be missed, right?
I guess this problem comes up pretty frequently but I can't find any standard solution for dealing with it.
This article deals with the problem but I didn't really understand step 3. He says to write to both tables, then migrate old data from the first table to the new one. How do you make sure you're only migrating old data?
(I use PostgreSQL on Heroku.)
Solution
You almost have your answer already:
As for step 3, use something like this (in one transaction):
Insert what is not there yet:
Update what has changed in the meantime:
New data will not be touched, because it is identical in both places.
- Create the new structure in parallel
- Start writing to both structures
- Migrate old data to the new structure
- Only write and read new structure
- Delete old columns
As for step 3, use something like this (in one transaction):
Insert what is not there yet:
INSERT INTO new_tbl (old_id, data)
SELECT old_id, data
FROM old_tbl
WHERE NOT EXISTS (SELECT * FROM new_tbl WHERE new_tbl.old_id = old_tbl.old_id);Update what has changed in the meantime:
UPDATE new_tbl
SET data = old.data
USING old_tbl
WHERE new_tbl.old_id = old_tbl.old_id
AND new_tbl.data IS DISTINCT FROM old_tbl.data;New data will not be touched, because it is identical in both places.
Code Snippets
INSERT INTO new_tbl (old_id, data)
SELECT old_id, data
FROM old_tbl
WHERE NOT EXISTS (SELECT * FROM new_tbl WHERE new_tbl.old_id = old_tbl.old_id);UPDATE new_tbl
SET data = old.data
USING old_tbl
WHERE new_tbl.old_id = old_tbl.old_id
AND new_tbl.data IS DISTINCT FROM old_tbl.data;Context
StackExchange Database Administrators Q#10913, answer score: 33
Revisions (0)
No revisions yet.