MongoDB hangs on shutdown

Submitted by: @import:stackexchange-dba
shutdown, hangs, mongodb

Problem

I'm having issues with a MongoDB replica set: when I try to shut down a MongoDB instance, it hangs and ends up being killed by systemd. When it starts again, it finds itself far behind the primary, even though it was lagging by less than 30 seconds before the restart, and so it requires a full resync to recover.

The log looks like this:

```
Jul 11 21:38:19 mongo-rs0-1 systemd[1]: Stopping mongo_rs0.service...
Jul 11 21:38:19 mongo-rs0-1 numactl[26076]: MongoDB shell version v3.4.5
Jul 11 21:38:19 mongo-rs0-1 numactl[26076]: connecting to: mongodb://127.0.0.1:27017/admin
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [thread1] connection accepted from 127.0.0.1:51524 #11301 (32 connections now open)
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [conn11301] received client metadata from 127.0.0.1:51524 conn11301: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.5" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
Jul 11 21:38:19 mongo-rs0-1 numactl[26076]: MongoDB server version: 3.4.5
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [conn11301] terminating, shutdown command received
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [conn11301] shutdown: going to close listening sockets...
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [conn11301] closing listening socket: 7
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [conn11301] closing listening socket: 8
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [conn11301] removing socket file: /tmp/mongodb-27017.sock
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [conn11301] shutdown: going to flush diaglog...
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [conn11301] shutting down replication subsystems
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [conn11301] Stopping replication reporter thread
Jul 11 21:38:19 mongo-rs0-1 mongod.27017[10705]: [SyncSourceFeedback] SyncSourceFeedback error sending update to X.X.X.X:27017: CallbackCancele
```

Solution

Upgrading to MongoDB 3.4.6 resolved the issue.

Quick summary:

  • MongoDB treated the realtime clock as if it were monotonic, which it is not by design (see ticket WT-3327).
  • As a result, when the realtime clock jumped backwards far enough, MongoDB could end up in an infinite sleep.
  • Normally the realtime clock is accurate enough that this did not cause issues on most systems.
  • However, the realtime clock is very inaccurate on Azure VMs because of side effects of Hyper-V virtualization.
  • This bug could cause other operational issues too, in particular broken rotation of journal files (see the related bugs linked from ticket WT-3327).
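
To illustrate the class of bug summarized above, here is a minimal Python sketch. It is not the WiredTiger code and does not reproduce the exact failure; it only shows how a deadline computed from a wall clock stretches when that clock steps backwards, while a deadline on a monotonic clock does not. `FakeClock`, `time_left`, and `wait_for` are hypothetical names made up for this example, and the one-hour backward step is an arbitrary assumption.

```python
import time


class FakeClock:
    """Hypothetical hand-driven clock, used only to make the demo deterministic."""

    def __init__(self, start: float) -> None:
        self.now = start

    def __call__(self) -> float:
        return self.now


def time_left(deadline: float, clock) -> float:
    """Seconds a waiter would still sleep, as measured by `clock`."""
    return max(0.0, deadline - clock())


# Both clocks start at the same reading to keep the arithmetic simple.
wall = FakeClock(1000.0)   # stands in for a realtime (wall) clock
mono = FakeClock(1000.0)   # stands in for a monotonic clock

timeout = 5.0
wall_deadline = wall() + timeout
mono_deadline = mono() + timeout

# One second of real time passes, but the wall clock is also stepped
# back by an hour (an assumed value, e.g. a time-sync correction).
mono.now += 1.0
wall.now += 1.0 - 3600.0

wall_wait = time_left(wall_deadline, wall)   # the 5 s timeout has grown to about an hour
mono_wait = time_left(mono_deadline, mono)   # unaffected by the backward step

print(f"wall-clock waiter still sleeps {wall_wait:.0f} s")
print(f"monotonic waiter still sleeps  {mono_wait:.0f} s")


def wait_for(predicate, timeout_s: float, poll_s: float = 0.01) -> bool:
    """Poll `predicate` until it is true or `timeout_s` elapses on the monotonic clock."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_s)
    return predicate()
```

In application code, the usual way to avoid this class of problem is to compute timeouts from `time.monotonic()` (as `wait_for` does) rather than `time.time()`, since the monotonic clock is not affected by system clock adjustments.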

Context

StackExchange Database Administrators Q#179616, answer score: 3