HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

System configuration to address poor PostgreSQL INSERT performance

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqladdressinsertsystempoorperformanceconfiguration

Problem

Problem Symptom

High CPU usage by postmaster child process associated with client connections attempting to INSERT low-volume rows (results in rows INSERTed 25x slower than using COPY ... FROM STDIN for identical rows).

Background

Trying to identify system/ database configuration to alleviate aforementioned poor INSERT performance. I'm using a multi-threaded R script to process data and INSERT results back into a PostgreSQL database. I've profiled the R script to isolate performance bottleneck to the DBI::dbBind() calls, while using top to monitor the postmaster child process associated to the connections opened by the child R threads (see code below). During the INSERTs, R child processes run mostly idle (presumably waiting for return of DBI::dbBind() call) whereas the postmaster child processes consume 95-100% CPU on the cores on which they run for approximately 2-3 minutes.

System / environment:

  • postgresql version 10.3 (Fedora package 10.3-5.fc27)



  • uname -a : Linux localhost 4.16.6-202.fc27.x86-64 #1 SMP Wed May 2 00:09:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux



  • /proc/cpuinfo : 16 processors (Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz)



-
ulimit -a :

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515220
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 515220
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited


-
/proc/meminfo (at arbitrary time when postmaster processing INSERT):

```
MemTotal:

Solution

A change to the prepared SQL statement resulted in a six-fold improvement.

SQL changed from:

INSERT INTO  (,...,) VALUES (?,...,?);
    ...
    INSERT INTO  (,...,) VALUES (?,...,?);


to:

INSERT INTO  (,...,) VALUES
    (?,...,?),
    ...
    (?,...,?);


In other words, I collapsed the multiple INSERT statements into a single statement, which reduced the time from 390 sec to 66 sec, or 56 rows/sec. While still on the slow side, will need to re-profile the execution to see if the remaining latency is application- or database-related.

Code Snippets

INSERT INTO <table> (<col_1>,...,<col_n>) VALUES (?,...,?);
    ...
    INSERT INTO <table> (<col_1>,...,<col_n>) VALUES (?,...,?);
INSERT INTO <table> (<col_1>,...,<col_n>) VALUES
    (?,...,?),
    ...
    (?,...,?);

Context

StackExchange Database Administrators Q#223604, answer score: 2

Revisions (0)

No revisions yet.