patternsqlModerate
Replication Lag on Postgres AWS RDS Service
Viewed 0 times
postgresreplicationlagserviceawsrds
Problem
We have a single master/single streaming replica Postgres 9.3 db on AWS. The load is not terribly high - this is a development/staging environment. (The production shows similar metrics). Point is the "ReplicaLag" shown in Cloudwatch oscillates wildly during the day between 0 and 200 seconds. I've changed max_wal_senders from 5 to 10 with no change.
Any suggestions for diagnosing this?
(This is a t2.small master and t2.small replica, however the production instance is large, and exhibits the same issue. CPU is < 2%, connections < 60, iops only about 5-15 count/second).
Any suggestions for diagnosing this?
(This is a t2.small master and t2.small replica, however the production instance is large, and exhibits the same issue. CPU is < 2%, connections < 60, iops only about 5-15 count/second).
Solution
I had the same issue with our staging RDS. After quite a while I found a blog post that says:
In a very busy database, with many writes per second, this number will remain fairly accurate. However, in a system where there are few writes, the "replication_delay" will continually grow because the last replayed transaction timestamp isn't increasing.
To test this I wrote a simply script which updates a random record every 0.5s and the lag went to 0.
From the AWS docs under Read Replica Limitations with PostgreSQL -
A PostgreSQL Read Replica reports a replication lag of up to five minutes if there are no user transactions occurring on the source DB instance
In a very busy database, with many writes per second, this number will remain fairly accurate. However, in a system where there are few writes, the "replication_delay" will continually grow because the last replayed transaction timestamp isn't increasing.
To test this I wrote a simply script which updates a random record every 0.5s and the lag went to 0.
From the AWS docs under Read Replica Limitations with PostgreSQL -
A PostgreSQL Read Replica reports a replication lag of up to five minutes if there are no user transactions occurring on the source DB instance
Context
StackExchange Database Administrators Q#97210, answer score: 12
Revisions (0)
No revisions yet.