debugkubernetesMinor
service-to-pod communication is broken in kubernetes
Viewed 0 times
kubernetescommunicationbrokenservicepod
Problem
I have an in house 5 node cluster running on bare-metal and I am using Calico. The cluster was working for 22 days but suddenly it stopped working. After investigating the problem I found out that the service to pod communication is broken while all the components are up and kubectl is working without a problem.
From within the cluster (component A) if I try to curl another component (
ns lookup for the service is also working (it resolve to the service IP):
But the service to pod communication is broken and when I curl to the service name most of the times (60-70%) it stuck:
When I check the endpoints of the that service I can see that the IP of that pod is there:
But as I said the curl (and any other method) that uses the service name is not working. And this is the service description:
```
$ kubectl describe svc -n 170 bridge
Name: bridge
Namespace: 170
Labels: io.kompose.service=bridge
Annotations: Process: bridge
Selector: io.kompose.service=bridge
Type: ClusterIP
IP: 1
From within the cluster (component A) if I try to curl another component (
bridge) with its IP it works:$ curl -vvv http://10.4.130.184:9998
* Rebuilt URL to: http://10.4.130.184:9998/
* Trying 10.4.130.184...
* TCP_NODELAY set
* Connected to 10.4.130.184 (10.4.130.184) port 9998 (#0)
> GET / HTTP/1.1
> Host: 10.4.130.184:9998
> User-Agent: curl/7.58.0
> Accept: */*
>
Bridge
Bridge
* Connection #0 to host 10.4.130.184 left intactns lookup for the service is also working (it resolve to the service IP):
$ nslookup bridge
Server: 10.5.0.10
Address 1: 10.5.0.10 kube-dns.kube-system.svc.k8s.local
Name: bridge
Address 1: 10.5.160.50 bridge.170.svc.k8s.localBut the service to pod communication is broken and when I curl to the service name most of the times (60-70%) it stuck:
$ curl -vvv http://bridge:9998
* Rebuilt URL to: http://bridge:9998/
* Could not resolve host: bridge
* Closing connection 0
curl: (6) Could not resolve host: bridgeWhen I check the endpoints of the that service I can see that the IP of that pod is there:
$ kubectl get ep -n 170 bridge
NAME ENDPOINTS AGE
bridge 10.4.130.184:9226,10.4.130.184:9998,10.4.130.184:9226 11dBut as I said the curl (and any other method) that uses the service name is not working. And this is the service description:
```
$ kubectl describe svc -n 170 bridge
Name: bridge
Namespace: 170
Labels: io.kompose.service=bridge
Annotations: Process: bridge
Selector: io.kompose.service=bridge
Type: ClusterIP
IP: 1
Solution
Sounds like the problem has been resolved by deleting all of the
Best I can offer you is a collection of links about diagnosing Kubernetes problems:
Not a great answer I'm afraid, but we may have missed the opportunity to diagnose the issue.
kube-proxy pods, which is an aggressive solution but if it works then good work!Best I can offer you is a collection of links about diagnosing Kubernetes problems:
- Troubleshoot Applications
- Troubleshooting Kubernetes Networking Issues
- Troubleshooting GKE (GKE Specific but still useful)
- Deploy Kubespy
Not a great answer I'm afraid, but we may have missed the opportunity to diagnose the issue.
Context
StackExchange DevOps Q#9535, answer score: 1
Revisions (0)
No revisions yet.