patternMinor
Postgresql large transaction OOM killer
Viewed 0 times
postgresqlkilleroomlargetransaction
Problem
We have a
The server has 16GB of ram.
We are running a long transaction which adds and updates 10s of millions of rows in multiple tables. Besides this transaction no query is running on the DB in this stage but it can be run during other queries so it must be transactioned.
It works fine until a point where the OOM killer decides it's enough and kills the
Here are the relevant postgres configurations:
```
#-------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#-------------------------------------------------------------------------
# - Memory -
shared_buffers = 128MB # min 128kB
# (change requires restart)
#huge_pages = try # on, off, or try
# (change requires restart)
#temp_buffers = 8MB # min 800kB
#max_prepared_transactions = 0 # zero disables the feature
# (change requires restart)
# Caution: it is not advisable to set max_prepared_transactions nonzero unless
# you actively intend to use prepared transactions.
#work_mem = 4MB # min 64kB
#maintenance_work_mem = 64MB # min 1MB
#autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
#max_stack_depth = 2MB # min 100kB
dynamic_shared_memory_type = posix # the default is the first option
# supported by the operating system:
# posix
# sysv
# windows
Postgresql 9.5 running on CentOS release 6.5 (Final)The server has 16GB of ram.
We are running a long transaction which adds and updates 10s of millions of rows in multiple tables. Besides this transaction no query is running on the DB in this stage but it can be run during other queries so it must be transactioned.
It works fine until a point where the OOM killer decides it's enough and kills the
postmaster process:Out of memory: Kill process 1766 (postmaster) score 890 or sacrifice child
Killed process 1766, UID 26, (postmaster) total-vm:24384508kB, anon-rss:14376288kB, file-rss:138616kBHere are the relevant postgres configurations:
```
#-------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#-------------------------------------------------------------------------
# - Memory -
shared_buffers = 128MB # min 128kB
# (change requires restart)
#huge_pages = try # on, off, or try
# (change requires restart)
#temp_buffers = 8MB # min 800kB
#max_prepared_transactions = 0 # zero disables the feature
# (change requires restart)
# Caution: it is not advisable to set max_prepared_transactions nonzero unless
# you actively intend to use prepared transactions.
#work_mem = 4MB # min 64kB
#maintenance_work_mem = 64MB # min 1MB
#autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
#max_stack_depth = 2MB # min 100kB
dynamic_shared_memory_type = posix # the default is the first option
# supported by the operating system:
# posix
# sysv
# windows
Solution
Ok,
So after trying a lot of things, I found the solution.
The issue was hiding in our use of the JDBC library.
We had a long connection with auto commit set to false:
During that time we were doing a lot of small queries and a few queries with a cursor:
In JDBC you create a
Now, every one of these objects needs to be closed, but if you close statement, the entry set is closed, and if you close the connection all the statements are closed and their result sets.
We were used to short living queries with connections of their own so we never closed statements assuming the connection will handle the things once it is closed.
The problem was now with this long transaction (~24 hours) which never closed the connection. The statements were never closed.
Apparently, the statement object holds resources both on the server that runs the code and on the PostgreSQL database.
My best guess to what resources are left in the DB is the things related to the cursor. The statements that used the cursor were never closed, so the result set they returned never closed as well. This meant the database didn't free the relevant cursor resources in the DB, and since it was over a huge table it took a lot of RAM.
The only thing I don't understand is why PostgreSQL didn't save some of the resources on the DISK (not swap) when it was failing to allocate memory.
Hope it will help some one in the future with similar problem.
So after trying a lot of things, I found the solution.
The issue was hiding in our use of the JDBC library.
We had a long connection with auto commit set to false:
connection.setAutoCommit(false)During that time we were doing a lot of small queries and a few queries with a cursor:
statement.setFetchSize(SOME_FETCH_SIZE)In JDBC you create a
connection object, and from that connection you create statements. When you execute the statments you get a result set.Now, every one of these objects needs to be closed, but if you close statement, the entry set is closed, and if you close the connection all the statements are closed and their result sets.
We were used to short living queries with connections of their own so we never closed statements assuming the connection will handle the things once it is closed.
The problem was now with this long transaction (~24 hours) which never closed the connection. The statements were never closed.
Apparently, the statement object holds resources both on the server that runs the code and on the PostgreSQL database.
My best guess to what resources are left in the DB is the things related to the cursor. The statements that used the cursor were never closed, so the result set they returned never closed as well. This meant the database didn't free the relevant cursor resources in the DB, and since it was over a huge table it took a lot of RAM.
The only thing I don't understand is why PostgreSQL didn't save some of the resources on the DISK (not swap) when it was failing to allocate memory.
Hope it will help some one in the future with similar problem.
Code Snippets
connection.setAutoCommit(false)statement.setFetchSize(SOME_FETCH_SIZE)Context
StackExchange Database Administrators Q#206448, answer score: 2
Revisions (0)
No revisions yet.