NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Problem
We have a cluster with 4 worker nodes and 1 master, with the flannel CNI installed. One kube-flannel-ds-xxxx pod runs on every node.
The nodes used to run fine, but one node suddenly entered the NotReady state and no longer recovers.
On that node, journalctl -u kubelet -f constantly emits "cni plugin not initialized".
Deleting the flannel pod makes a new one start up, but the plugin stays uninitialized.
What can we do or check to fix this?
Jul 25 14:44:05 ubdock09 kubelet[13076]: E0725 14:44:05.916280 13076 kubelet.go:2349] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Solution
I found the cause in the containerd journal.
The Ready nodes did not have /etc/cni/net.d/10-flannel.conf, so I removed the /etc/cni/net.d directory and the cni0 network device that the CNI plugin had created.
Then I restarted containerd and the flannel pod. Now the node is Ready and cni0 has been recreated.
Jul 25 15:10:36 ubdock09 containerd[23164]: time="2022-07-25T15:10:36.480398235+02:00" level=error msg="failed to reload cni configuration after receiving fs change event(\"/etc/cni/net.d/.10-flannel.conf.swp\": REMOVE)" error="cni config load failed: failed to load CNI config file /etc/cni/net.d/10-flannel.conf: error parsing configuration: missing 'type': invalid cni config: failed to load cni config"
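The containerd error above complains about a config file with a missing 'type'. A rough way to spot such a file is sketched below. The helper name check_cni_configs is hypothetical, the list of extensions (.conf, .conflist, .json) is my assumption about which files the CNI loader picks up, and the grep is only a crude text match, not a real JSON validation.

```shell
#!/bin/sh
# Hypothetical helper (not part of flannel or containerd): report CNI config
# files that are empty or contain no "type" key at all.
# Usage: check_cni_configs [DIR]   (DIR defaults to /etc/cni/net.d)
check_cni_configs() {
    dir="${1:-/etc/cni/net.d}"
    found=0
    # Assumption: these are the extensions the CNI loader reads.
    # Dotfiles such as editor swap files are not matched by these globs.
    for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
        # Skip glob patterns that matched nothing.
        [ -f "$f" ] || continue
        # Crude text check; an empty file also fails it.
        if ! grep -q '"type"' "$f"; then
            echo "suspect: $f"
            found=1
        fi
    done
    # Return 1 if at least one suspect file was printed, 0 otherwise.
    return "$found"
}
```

Running check_cni_configs on the NotReady node and on a Ready node, then comparing the results, should point at a stray file like the 10-flannel.conf in the log, assuming the stray file is empty or truncated.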
id@machine:/# ip -4 addr show
6: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 10.244.11.1/24 brd 10.244.11.255 scope global cni0
       valid_lft forever preferred_lft forever
id@machine:/# ip link delete cni0
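For reference, the whole recovery could be scripted roughly as follows. This is a sketch, not the exact commands I ran: the function names run and reset_cni are hypothetical, the pod name and namespace are placeholders you pass in, and it moves /etc/cni/net.d aside rather than deleting it, which is a more cautious variation on what I did. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print each command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the recovery steps for the affected node.
# $1: name of the flannel pod on that node (placeholder, e.g. kube-flannel-ds-xxxx)
# $2: namespace of that pod (placeholder, depends on how flannel was installed)
reset_cni() {
    pod="$1"
    namespace="$2"
    # Variation: keep a backup instead of removing the directory outright.
    run mv /etc/cni/net.d /etc/cni/net.d.bak
    # Remove the bridge device so it gets recreated cleanly.
    run ip link delete cni0
    run systemctl restart containerd
    # Assumes kubectl is configured on this machine; otherwise run this
    # step from wherever you normally manage the cluster.
    run kubectl delete pod "$pod" -n "$namespace"
}
```

Calling reset_cni kube-flannel-ds-xxxx kube-system (both values are placeholders) prints the four commands so they can be reviewed first. Setting DRY_RUN=0 and running as root on the affected node would execute them.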
Code Snippets
Jul 25 15:10:36 ubdock09 containerd[23164]: time="2022-07-25T15:10:36.480398235+02:00" level=error msg="failed to reload cni configuration after receiving fs change event(\"/etc/cni/net.d/.10-flannel.conf.swp\": REMOVE)" error="cni config load failed: failed to load CNI config file /etc/cni/net.d/10-flannel.conf: error parsing configuration: missing 'type': invalid cni config: failed to load cni config"
id@machine:/# ip -4 addr show
6: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 10.244.11.1/24 brd 10.244.11.255 scope global cni0
       valid_lft forever preferred_lft forever
ip link delete cni0
Context
StackExchange DevOps Q#16344, answer score: 6