Autoscaling containers using local request counters

Submitted by: @import:stackexchange-devops · Mar 10, 2026

Problem

I recently read about different approaches used to scale our webapp, one of which was scaling using local request counters. The drawbacks listed for this approach included the following:

Each instance would reach the threshold almost at the same time and
hence each would demand a new instance, leading to a large number of
instances even though the number should have increased by one only

I was curious to know whether there is a solution to this problem, or any workaround?

Solution

The problem can easily be solved using the following components:

- On the instances serving your webapp, keep monitoring the number of incoming requests, along with anything else you see fit.

- Publish the number of incoming requests to a monitoring system. If you do not have one yet, this step improves your monitoring capabilities and helps you track the load on each host as well as how evenly that load is balanced across hosts.

- From the incoming requests, derive an estimate of the number of webapp servers needed to serve that workload, as well as the difference between the actual number and the estimated number. In the case you describe, the estimate seems to be just a scalar function of the total number of incoming requests over a recent period of time. On other systems, or after some time, more subtle strategies can be implemented. Monitoring these quantities and their difference makes the auto-scaling strategy easier to trace and lets you observe how responsive it is.

- Finally, implement the auto-scaling itself. At this point, it is really just reading a number from your monitoring system and writing it to your scaling system.
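
As a rough illustration of why deciding in one place avoids the over-scaling described in the question, here is a minimal Python sketch. The function names, the per-instance capacity of 100 requests per window, and the sample counts are all assumptions made up for this example; they do not come from any particular monitoring or scaling system.

```python
import math


def local_scale_out_votes(per_instance_requests, threshold):
    """Naive local scheme: every instance above its own threshold
    independently asks for one more instance."""
    return sum(1 for r in per_instance_requests if r > threshold)


def desired_instances(per_instance_requests, capacity_per_instance, min_instances=1):
    """Central scheme: size the fleet from the total load.

    capacity_per_instance is an assumed number of requests one
    instance can serve per measurement window.
    """
    total = sum(per_instance_requests)
    return max(min_instances, math.ceil(total / capacity_per_instance))


def scaling_delta(per_instance_requests, capacity_per_instance):
    """Difference between the estimated needed count and the actual count."""
    current = len(per_instance_requests)
    return desired_instances(per_instance_requests, capacity_per_instance) - current


# Four evenly balanced instances, each just over 100 requests per window.
counts = [105, 104, 106, 105]

print(local_scale_out_votes(counts, threshold=100))          # 4 separate scale-out demands
print(desired_instances(counts, capacity_per_instance=100))  # 5 instances needed in total
print(scaling_delta(counts, capacity_per_instance=100))      # 1 instance to add
```

With local counters, all four instances cross the threshold together and four new instances are requested. Computing the target from the aggregate yields a single number, so the fleet grows by one. In a real setup the `counts` list would be read from the monitoring system, and the result of `desired_instances` would be written to whatever scaling system you use.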

Context

StackExchange DevOps Q#830, answer score: 3
