patternsqlMinor
System configuration to address poor PostgreSQL INSERT performance
Viewed 0 times
postgresqladdressinsertsystempoorperformanceconfiguration
Problem
Problem Symptom
High CPU usage by
Background
Trying to identify system/ database configuration to alleviate aforementioned poor INSERT performance. I'm using a multi-threaded R script to process data and INSERT results back into a PostgreSQL database. I've profiled the R script to isolate performance bottleneck to the
System / environment:
-
-
```
MemTotal:
High CPU usage by
postmaster child process associated with client connections attempting to INSERT low-volume rows (results in rows INSERTed 25x slower than using COPY ... FROM STDIN for identical rows).Background
Trying to identify system/ database configuration to alleviate aforementioned poor INSERT performance. I'm using a multi-threaded R script to process data and INSERT results back into a PostgreSQL database. I've profiled the R script to isolate performance bottleneck to the
DBI::dbBind() calls, while using top to monitor the postmaster child process associated to the connections opened by the child R threads (see code below). During the INSERTs, R child processes run mostly idle (presumably waiting for return of DBI::dbBind() call) whereas the postmaster child processes consume 95-100% CPU on the cores on which they run for approximately 2-3 minutes.System / environment:
- postgresql version 10.3 (Fedora package 10.3-5.fc27)
uname -a:Linux localhost 4.16.6-202.fc27.x86-64 #1 SMP Wed May 2 00:09:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
/proc/cpuinfo: 16 processors (Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz)
-
ulimit -a :core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 515220
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 515220
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited-
/proc/meminfo (at arbitrary time when postmaster processing INSERT):```
MemTotal:
Solution
A change to the prepared SQL statement resulted in a six-fold improvement.
SQL changed from:
to:
In other words, I collapsed the multiple INSERT statements into a single statement, which reduced the time from 390 sec to 66 sec, or 56 rows/sec. While still on the slow side, will need to re-profile the execution to see if the remaining latency is application- or database-related.
SQL changed from:
INSERT INTO (,...,) VALUES (?,...,?);
...
INSERT INTO (,...,) VALUES (?,...,?);to:
INSERT INTO (,...,) VALUES
(?,...,?),
...
(?,...,?);In other words, I collapsed the multiple INSERT statements into a single statement, which reduced the time from 390 sec to 66 sec, or 56 rows/sec. While still on the slow side, will need to re-profile the execution to see if the remaining latency is application- or database-related.
Code Snippets
INSERT INTO <table> (<col_1>,...,<col_n>) VALUES (?,...,?);
...
INSERT INTO <table> (<col_1>,...,<col_n>) VALUES (?,...,?);INSERT INTO <table> (<col_1>,...,<col_n>) VALUES
(?,...,?),
...
(?,...,?);Context
StackExchange Database Administrators Q#223604, answer score: 2
Revisions (0)
No revisions yet.