FATAL 53300: Remaining connection slots are reserved for non-replication superuser connections
Problem
I have a PostgreSQL 12.1 database system (I'll refer to it as PGSQL) running on a remotely hosted VM server (Windows Server 2019). We upgraded the server OS and PGSQL a couple of months ago. Everything has been running more-or-less normally since then, until this morning when I started receiving the above-mentioned database error in pretty much every one of our in-house applications that connect to this PGSQL instance.
To check the connections, I ran SELECT * FROM pg_stat_activity;, which returned 103 rows. My postgresql.conf file has max_connections = 100, so that makes sense, but what doesn't make sense is that, of those 103 connections, 90+ of them are listed as idle with query text of DISCARD ALL. All of these show as being executed by the same non-superuser account from the server's own internal address, yet several of the connections show a query_start value from a month or more ago.

Now, many of the applications we have in place are unfortunately built with hard-coded credentials (I have a lot of "clean-up" work to do on the code for these applications that I inherited) and are generally executed from shortcuts pointing to an "Application" share on the server that hosts the PGSQL database, so none of this looks particularly "suspicious". I tried to simply kill the processes by running SELECT pg_cancel_backend(pid); with one of the pid values from the previous query, but re-querying pg_stat_activity still shows the same record in the result set (all of the values appear to be exactly the same, from what I can tell).

Perhaps I'm not using the correct function to terminate these "hung" processes, but I could not figure out how to clear out these connections individually. Because I needed to get our production environment back to a usable state, I ended up just stopping and restarting the PGSQL service on the server, which did clear out all of those old DISCARD ALL connections, but I'm curious whether there's something I could have done differently.

Solution
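Before anything else, it helps to see who is actually holding the slots. A sketch against pg_stat_activity (PostgreSQL 12 column names; the grouping is illustrative, not part of the original answer). As a side note, DISCARD ALL as the last query text is typically left behind by a connection pooler resetting sessions:

```sql
-- Summarize connections by role, client address, and state,
-- so the slot hogs stand out (here: 90+ idle DISCARD ALL sessions).
SELECT usename,
       client_addr,
       state,
       count(*)          AS conns,
       min(state_change) AS idle_since   -- oldest transition into this state
FROM pg_stat_activity
GROUP BY usename, client_addr, state
ORDER BY conns DESC;
```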
Regardless of whatever other issues you might have: pg_cancel_backend cancels the backend's current query. But an idle backend doesn't have a current query, which is what makes it idle. You want pg_terminate_backend.

More fundamentally, it seems like you have an application problem or a connection pooler problem, but you haven't described them enough for us to give advice.
If you just want to kick the can down the road a bit further, you can increase max_connections. Having a large number of idle connections (not "idle in transaction"!) is not desirable, but is also not a huge problem.
prior to upgrading our PGSQL server to v12.1, we ran v9.4 for a number of years and never once encountered this issue.
So 0 vs 1? That doesn't sound like a sample size on which you can draw a conclusion. Maybe 9.4 just got rebooted for other reasons often enough to keep the leak under control. Or maybe someone increased the max_connections to a large enough value that it was never reached between reboots, but that increase didn't survive the upgrade.
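Putting the advice above together, the stuck sessions could have been cleared without a service restart; a sketch (the state and query filters are illustrative, matching the symptoms described in the question). Note that terminating other roles' backends generally requires superuser rights or membership in the pg_signal_backend role (available since PostgreSQL 9.6):

```sql
-- Terminate every idle backend whose last statement was DISCARD ALL,
-- sparing the session running this query.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND query = 'DISCARD ALL'
  AND pid <> pg_backend_pid();
```

Each call returns true if the signal was sent; the freed slots become available as soon as the backends exit.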
Context
StackExchange Database Administrators Q#264921, answer score: 4