patternsqlMinor
Postgres cost of large volume of inserts in many tables in a single transaction
Viewed 0 times
tablespostgresinsertslargesingletransactionmanycostvolume
Problem
What is the additional cost incurred by inserting large volume or rows (millions), in many tables in a single transaction ?
Can something be done (tuning parameters) so that the cost of inserting in large volume in a single transaction approaches the cost of doing it in autocommit ?
Can something be done (tuning parameters) so that the cost of inserting in large volume in a single transaction approaches the cost of doing it in autocommit ?
Solution
I'm not sure what your limit is or what your concern is but millions of rows is not a problem. Billions of rows isn't a really a problem either. The larger the transaction the better for performance. Transactions have overhead.
Here on my old x230 I
Here is the code and results.
So you can see, you can do a million rows in a second or a billion in 35 minutes.
If you're asking why the bigger batch was slower, I think that's the overhead of WAL which would eventually show to be even greater if I did them in smaller batches (I think).
The maximum transaction size is like 2-4 billion, but just to not be excessive I would cut it off at a 2 billion rows per transaction.
Here on my old x230 I
- create a table with a million rows.
- add a million rows.
- add a BILLION rows. Damn. That's a lot of rowzzz.
Here is the code and results.
test=# CREATE TABLE foo AS SELECT id::bigint FROM generate_series(1,1e6) AS gs(id);
SELECT 1000000
Time: 722.075 ms
test=# INSERT INTO foo SELECT id FROM generate_series(1,1e6) AS gs(id);
INSERT 0 1000000
Time: 1285.631 ms
test=# INSERT INTO foo SELECT id FROM generate_series(1,1e9) AS gs(id);
INSERT 0 1000000000
Time: 2142933.903 msSo you can see, you can do a million rows in a second or a billion in 35 minutes.
If you're asking why the bigger batch was slower, I think that's the overhead of WAL which would eventually show to be even greater if I did them in smaller batches (I think).
The maximum transaction size is like 2-4 billion, but just to not be excessive I would cut it off at a 2 billion rows per transaction.
Code Snippets
test=# CREATE TABLE foo AS SELECT id::bigint FROM generate_series(1,1e6) AS gs(id);
SELECT 1000000
Time: 722.075 ms
test=# INSERT INTO foo SELECT id FROM generate_series(1,1e6) AS gs(id);
INSERT 0 1000000
Time: 1285.631 ms
test=# INSERT INTO foo SELECT id FROM generate_series(1,1e9) AS gs(id);
INSERT 0 1000000000
Time: 2142933.903 msContext
StackExchange Database Administrators Q#185320, answer score: 9
Revisions (0)
No revisions yet.