How to fix a "heartbeat failure" in Docker Swarm?
Problem
My cluster is currently located in a single data center. I've been trying to change that by adding a single worker node from another data center, but so far it hasn't worked.
I'm able to make this node join the swarm and get listed by the managers, but it is always shown as "Down". Here is what "docker inspect" shows me about this node:
"Status": {
    "State": "down",
    "Message": "heartbeat failure",
    "Addr": "xxx.xxx.xxx.xxx"
}
I've opened the following ports on both sides:
2377 tcp
7946 tcp+udp
4789 udp
How do I troubleshoot and fix this?
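As a first troubleshooting step, the TCP ports listed above can be probed from the opposite data center. The following is a minimal Python sketch, not a definitive diagnostic: the function names `tcp_port_open` and `check_swarm_tcp_ports` are hypothetical, and it only covers the TCP ports, since a plain connect cannot confirm that the UDP ports (7946/udp, 4789/udp) pass through a firewall.

```python
import socket
import sys

# TCP ports from the list above. Note that 2377 is only served by manager
# nodes, so probe it from the worker towards a manager.
SWARM_TCP_PORTS = (2377, 7946)


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_swarm_tcp_ports(host: str, timeout: float = 3.0) -> dict:
    """Map each swarm TCP port to whether it is reachable on host."""
    return {port: tcp_port_open(host, port, timeout) for port in SWARM_TCP_PORTS}


if __name__ == "__main__" and len(sys.argv) > 1:
    target = sys.argv[1]  # address of the node in the other data center
    for port, ok in check_swarm_tcp_ports(target).items():
        print(f"{target}:{port}/tcp {'reachable' if ok else 'unreachable'}")
```

Run it once from the new worker against a manager address and once in the other direction for 7946. A port reported as unreachable points at a firewall or routing problem between the data centers rather than at Docker itself.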
Solution
This might not be the answer for your specific cross-data-center setup.
I occasionally run into one or more swarm nodes showing Status: Down and Availability: Active, with Status.Message: "heartbeat failure". This can happen after a reboot.
What helped was to stop the Docker daemon, remove /var/lib/docker/swarm/worker/tasks.db, and start the Docker daemon again.
Source: https://github.com/moby/moby/issues/34827#issuecomment-457678500
Sometimes it fixes itself: https://stackoverflow.com/a/54126180/2087704
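The stop, remove, start sequence described above could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: it assumes the affected worker manages Docker through systemd (`systemctl stop docker` / `systemctl start docker`), and the function name `reset_worker_tasks_db` is hypothetical. Only the `tasks.db` path comes from the answer.

```python
import subprocess
import sys
from pathlib import Path

# Path taken from the answer above.
TASKS_DB = Path("/var/lib/docker/swarm/worker/tasks.db")


def reset_worker_tasks_db(tasks_db: Path = TASKS_DB, run=subprocess.run) -> bool:
    """Stop Docker, delete tasks.db if present, then start Docker again.

    Returns True if the file existed and was removed.
    Assumption: the daemon is controlled via systemd.
    """
    run(["systemctl", "stop", "docker"], check=True)
    try:
        removed = tasks_db.exists()
        if removed:
            tasks_db.unlink()
    finally:
        # Always try to bring the daemon back, even if the removal failed.
        run(["systemctl", "start", "docker"], check=True)
    return removed


# Requires root on the affected worker; only acts when --apply is passed.
if __name__ == "__main__" and "--apply" in sys.argv:
    if reset_worker_tasks_db():
        print(f"removed {TASKS_DB}")
    else:
        print(f"{TASKS_DB} not found, daemon restarted only")
```

The `run` parameter exists only so the command runner can be swapped out for a dry run. After the daemon is back, `docker node ls` on a manager shows whether the node has returned to Ready.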
Context
StackExchange DevOps Q#5144, answer score: 2