How do I make NFS volumes reliable in Docker?
Problem
I host various docker containers on my Ubuntu 18 machine. A few of them require storing their data on my Synology NAS. At first, I was using the host machine's /etc/fstab to control NFS mounts, which I then mounted in the containers (as a mount, not a volume).
However, I figured it would be a better idea to have the containers map NFS to mount points in the containers. The host really has no use for the mounts, so it didn't make sense to maintain them there.
At the moment, I am configuring my NFS volumes like this (using Docker Compose v3 format):
volumes:
  data:
    driver_opts:
      type: nfs
      o: addr=192.168.1.51,nolock,soft,rw
      device: :/volume2/nextcloud
This works great when the NAS is booted and working normally. However, I had a power outage and noticed all sorts of problems. Also, the timing of which machines boot first (NAS vs Ubuntu box) affects reliability of my docker container volumes. In my last situation, the NAS was not powered on. So when the container was started, it failed:
ERROR: for app Cannot start service app: error while mounting volume '/var/lib/docker/volumes/nextcloud_data/_data': error while mounting volume with options: type='nfs' device=':/volume2/nextcloud' o='addr=192.168.1.51,nolock,soft,rw': no route to host
What would be nice is if docker would keep retrying to mount the volume until the NAS was powered on again. That would make this hands-free and prevent any timing issues (on which devices boot first across the network) from causing permanent failures like this.
I'm also not sure what happens if a volume is created, and the NAS is powered off at any point. Does the volume stay available? Does docker keep trying to reconnect the NFS mount? I feel like there is very little control here.
Note I just use Docker Compose. I don't use Swarm for technical reasons I won't go into here. Can someone recommend a way to resolve these reliability issues? Are NFS volumes in Docker the way to go? Should I go ba
Solution
Docker has options that will retry starting a container: restart in version 2 of the compose file format and restart_policy (under deploy) in version 3. With docker-compose, the version 3 syntax only takes effect when compatibility mode is enabled. However, I believe these options only apply when the application inside the container fails, not when creating the container itself fails, as you see with a failing volume mount (or as would happen if the image couldn't be retrieved from a registry).
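As a rough sketch of the version 3 syntax mentioned above, the following writes a small compose file with a restart policy. The file name docker-compose.restart.yml, the service name app, and the busybox image are placeholders chosen for illustration; they are not taken from the question.

```shell
# Sketch only: write a hypothetical v3 compose file with a restart policy.
# File name, service name, and image are illustrative assumptions.
cat > docker-compose.restart.yml <<'EOF'
version: '3'
services:
  app:
    image: busybox
    command: tail -f /dev/null
    deploy:
      restart_policy:
        condition: on-failure   # restart when the container exits non-zero
        delay: 5s               # wait between restart attempts
EOF

# Show the policy section that was written.
grep -A 2 'restart_policy' docker-compose.restart.yml
```

With plain docker-compose, the deploy section is only honored when the file is started in compatibility mode, for example docker-compose --compatibility -f docker-compose.restart.yml up -d, whereas docker stack deploy reads it directly. As noted above, under docker-compose this is not expected to help when the container cannot be created in the first place.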
To handle the failing volume mount, I believe swarm mode is your best option despite your objections. You can create a single node cluster with docker swarm init and deploy your compose file with docker stack deploy -c docker-compose.yml stack_name, making an easy transition from docker-compose. Swarm mode looks at the overall state of the service and continuously tries to make the current state match your target state (as defined in the compose file), which will handle a failing volume mount that eventually corrects itself. I don't have an NFS server to test on right now, but here's a scenario with a bind mount to a missing folder:
$ cat docker-compose.vol-bind.yml
version: '3'
volumes:
  bind-test:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /home/bmitch/data/docker/test/missing
services:
  bind-test:
    image: busybox
    command: tail -f /dev/null
    volumes:
      - bind-test:/bind-test
$ docker stack deploy -c docker-compose.vol-bind.yml voltest
Creating network voltest_default
Creating service voltest_bind-test
$ docker service ls
ID            NAME               MODE        REPLICAS  IMAGE           PORTS
omzaeo7mrour  voltest_bind-test  replicated  0/1       busybox:latest
$ docker service ps omzaeo7mrour
ID            NAME                     IMAGE           NODE              DESIRED STATE  CURRENT STATE          ERROR                              PORTS
kpz0l79eucaw  voltest_bind-test.1      busybox:latest  bmitch-asusr556l  Ready          Ready 2 seconds ago
j6fylzhvcv60   \_ voltest_bind-test.1  busybox:latest  bmitch-asusr556l  Shutdown       Failed 5 seconds ago   "starting container failed: er…"
61o6raohp0xl   \_ voltest_bind-test.1  busybox:latest  bmitch-asusr556l  Shutdown       Failed 12 seconds ago  "starting container failed: er…"
$ docker inspect kpz0l79eucaw
[
    {
        "ID": "kpz0l79eucaw1856obmwvcak1",
        "Version": {
            "Index": 445
        },
        "CreatedAt": "2019-04-29T12:57:25.925788528Z",
        "UpdatedAt": "2019-04-29T12:57:34.3467203Z",
        "Labels": {},
        "Spec": {
            "ContainerSpec": {
                "Image": "busybox:latest",
                "Labels": {
                    "com.docker.stack.namespace": "voltest"
                },
...
        "Status": {
            "Timestamp": "2019-04-29T12:57:33.936048295Z",
            "State": "failed",
            "Message": "starting",
            "Err": "starting container failed: error while mounting volume '/home/var-docker/volumes/voltest_bind-test/_data': failed to mount local volume: mount /home/bmitch/data/docker/test/missing:/home/var-docker/volumes/voltest_bind-test/_data, flags: 0x1000: no such file or directory",
            "ContainerStatus": {
                "ContainerID": "4c75c851bd43b5ef57b4785a4611f78501279933822eab90c693e1108f53ee82",
                "PID": 0,
                "ExitCode": 128
            },
            "PortStatus": {}
        },
...
$ mkdir missing
$ docker service ps omzaeo7mrour
ID            NAME                     IMAGE           NODE              DESIRED STATE  CURRENT STATE          ERROR                              PORTS
okylli7smu1c  voltest_bind-test.1      busybox:latest  bmitch-asusr556l  Running        Running 4 seconds ago
9tpm188ysu2k   \_ voltest_bind-test.1  busybox:latest  bmitch-asusr556l  Shutdown       Failed 14 seconds ago  "starting container failed: er…"
kpz0l79eucaw   \_ voltest_bind-test.1  busybox:latest  bmitch-asusr556l  Shutdown       Failed 21 seconds ago  "starting container failed: er…"
From the above, you can see that as soon as the missing directory was created, the bind mount succeeded, and the container that had failed to be created was retried and started successfully.
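Applied to the NFS volume from the question, the same approach might look like the sketch below. It has not been tested against a real NFS server. The file name docker-compose.nfs.yml, the busybox image, the /data mount point, and the stack name nextcloud are assumptions made for illustration; only the volume options come from the question.

```shell
# Sketch only, untested against a real NAS: the question's NFS volume in a
# stack file. Image, mount point, and file name are illustrative assumptions.
cat > docker-compose.nfs.yml <<'EOF'
version: '3'
volumes:
  data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.1.51,nolock,soft,rw
      device: ":/volume2/nextcloud"
services:
  app:
    image: busybox
    command: tail -f /dev/null
    volumes:
      - data:/data
EOF

# To deploy on a single-node swarm (requires Docker; not run here):
#   docker swarm init
#   docker stack deploy -c docker-compose.nfs.yml nextcloud
```

If this behaves like the bind-mount test, docker service ps on the resulting service would be expected to show failed tasks while the NAS is unreachable and a running task once the export can be mounted.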
Context
StackExchange DevOps Q#7970, answer score: 2