Prometheus alert CPUThrottlingHigh raised but monitoring does not show it
Problem
I have installed Prometheus to monitor my installation and it is frequently raising alerts about CPU throttling.
The Prometheus alert rule that raises this alert is:

    alert: CPUThrottlingHigh
    expr: 100
      * sum by(container_name, pod_name, namespace) (increase(container_cpu_cfs_throttled_periods_total{container_name!=""}[5m]))
      / sum by(container_name, pod_name, namespace) (increase(container_cpu_cfs_periods_total[5m]))
      > 25
    for: 15m

If I look at one of the pods identified by this alert, it does not seem to have any reason to be throttled:

    $ kubectl top pod -n monitoring my-pod
    NAME     CPU(cores)   MEMORY(bytes)
    my-pod   0m           6Mi

This pod has one container with the following resources configured:

    Limits:
      cpu:     100m
      memory:  128Mi
    Requests:
      cpu:     25m
      memory:  64Mi

The node hosting this pod is not under heavy CPU load:

    $ kubectl -n monitoring top node aks-agentpool-node-1
    NAME                   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
    aks-agentpool-node-1   853m         21%    11668Mi         101%

In Grafana, the chart for this pod never goes above 0.000022 of CPU usage.

Why is it throttling?
Solution
CPUThrottlingHigh is an alert created by the kubernetes-mixin project. There is an open issue (#108) discussing this alert, and I suggest reading all of its comments to better understand the problem.

In short, the problem is this: with low CPU limits, a spiky workload can show a low average CPU usage and still be throttled, because the limit is enforced per CFS period (100 ms by default) rather than on the average.
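To make the "low average, still throttled" effect concrete, here is a minimal sketch in Python. It is a toy model, not the kernel's actual algorithm, and it rests on assumptions that are not in the question: a single-threaded container that wakes up once in a 5-minute window, needs 50 ms of CPU time, and is otherwise idle. It also assumes that only periods in which the container had work to run are counted, which is what lets the throttled-period ratio be high while the average is tiny. The function name simulate_burst and the burst size are hypothetical.

```python
# Toy model of CFS bandwidth control; NOT the kernel's actual algorithm.
PERIOD_MS = 100  # default CFS period (cpu.cfs_period_us = 100000)


def simulate_burst(limit_millicores, burst_cpu_ms, window_s):
    """Run one CPU burst inside an otherwise idle window.

    Returns (average_cores, throttled_percent).
    Assumption: only periods with runnable work are counted.
    """
    quota_ms = limit_millicores * PERIOD_MS / 1000  # CPU time allowed per period
    remaining = burst_cpu_ms
    active_periods = 0
    throttled_periods = 0
    while remaining > 0:
        active_periods += 1
        if remaining > quota_ms:
            # Quota runs out while work is still pending: a throttled period.
            throttled_periods += 1
        remaining -= min(remaining, quota_ms)
    average_cores = burst_cpu_ms / (window_s * 1000)
    throttled_percent = 100 * throttled_periods / active_periods
    return average_cores, throttled_percent


# Hypothetical workload: 50 ms of CPU once per 5 minutes, limit of 100m.
avg, pct = simulate_burst(limit_millicores=100, burst_cpu_ms=50, window_s=300)
print(f"average usage: {avg:.6f} cores, throttled periods: {pct:.0f}%")
# prints: average usage: 0.000167 cores, throttled periods: 80%
```

In this model the 100m limit gives 10 ms of CPU per period, so the 50 ms burst is spread over five periods and is cut off in four of them. The ratio used by the alert expression would then be well above 25, even though the averaged usage looks close to zero.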
Also, take a look at issue #67577 in the Kubernetes project, which addresses a kernel bug in CFS quotas that may cause unnecessary CPU throttling. The discussion is still open, and the Kubernetes project is even considering disabling CFS quotas for pods in the Guaranteed QoS class (see #70585 for reference).

Consider the following options:
- Increase (or even remove) your container CPU limits
- Disable Kubernetes CFS quotas entirely (kubelet's flag --cpu-cfs-quota=false)
- Use a kernel version that contains the fix for this bug (torvalds/linux 512ac99)
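
For the first option, it can help to see what a CPU limit means in terms of CFS quota. The sketch below is a rough aid under stated assumptions, not a sizing rule: it assumes the default 100 ms CFS period, a single-threaded burst, and that the quota is the limit in millicores scaled to the period. The helper names cfs_quota_us and min_limit_for_burst are hypothetical, and the model ignores the kernel bug mentioned above.

```python
import math

PERIOD_US = 100_000  # default cpu.cfs_period_us (100 ms)


def cfs_quota_us(limit_millicores, period_us=PERIOD_US):
    """CPU time in microseconds the container may use per CFS period."""
    return limit_millicores * period_us // 1000


def min_limit_for_burst(burst_cpu_ms, period_us=PERIOD_US):
    """Smallest limit (millicores) that lets a single-threaded burst of
    burst_cpu_ms finish without exhausting the quota in any period.

    Assumption: one thread can use at most one full core, so the
    result is capped at 1000m.
    """
    period_ms = period_us / 1000
    return min(1000, math.ceil(burst_cpu_ms * 1000 / period_ms))


print(cfs_quota_us(100))        # quota for the 100m limit in the question
print(min_limit_for_burst(50))  # limit needed for a hypothetical 50 ms burst
```

With the question's limit of 100m, the container gets 10000 µs (10 ms) of CPU per 100 ms period. Under this model, a burst that needs 50 ms of CPU at once would need a limit of roughly 500m to avoid being cut off.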
Context
StackExchange DevOps Q#6494, answer score: 15