How to calculate server capacity?

Submitted by: @import:stackexchange-devops · Mar 10, 2026

Problem

I am trying to answer a question:
Do we need more worker machines to handle the current load?

We have a variable number of jobs coming in, and each job takes a different amount of time. Here is a snapshot of our system over a 12-hour period.

The number of jobs in the system also changes with the day and time.

How can I calculate throughput of our system and build monitoring around max load that our system can handle?

Solution

(N.B. this answer is, in the spirit of the question, high-level and not very concrete...)

How can I calculate throughput of our system

If by "calculate" you mean taking a few data points, plugging them into some formulas, and getting a close approximation of your throughput, that is pretty much impossible unless you have a lot more information. Some of that information may well be fairly random, which makes the system hard to model theoretically.

If you mean "measure", then it could be as simple as checking your log files and counting lines per time unit.
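As a rough illustration of the "count lines per time unit" idea, here is a minimal Python sketch. It assumes, purely for illustration, that each completed job writes one log line starting with an ISO-8601 timestamp; the function name `jobs_per_minute` and the sample lines are hypothetical, so adapt the parsing to whatever your logs actually look like.

```python
from collections import Counter
from datetime import datetime


def jobs_per_minute(log_lines):
    """Count log lines per minute, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in log_lines:
        # Assumed format: ISO timestamp, one space, then the message.
        stamp = line.split(" ", 1)[0]
        try:
            ts = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        counts[ts.strftime("%Y-%m-%d %H:%M")] += 1
    return dict(counts)


# Made-up sample lines, only to show the shape of the input.
sample = [
    "2026-03-10T12:00:05 job 1 finished",
    "2026-03-10T12:00:40 job 2 finished",
    "2026-03-10T12:01:10 job 3 finished",
    "not a log line",
]
per_minute = jobs_per_minute(sample)
# {'2026-03-10 12:00': 2, '2026-03-10 12:01': 1}
peak = max(per_minute.values())  # busiest minute seen so far
```

The peak value over a long enough window gives you a first measured number to build monitoring around, though it only tells you what the system has handled so far, not what it could handle.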

and build monitoring around max load that our system can handle?

Unless you're NASA, that's probably a case for trial and error. Run your system(s) and see how much throughput you get. Increase the number of worker nodes and see what happens.

If you already know the behaviour of your overall system, i.e., whether it scales linearly and whether there are bottlenecks such as databases or lock contention, then you can take shortcuts or make good guesses.
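One way to make the trial-and-error runs a bit more systematic is to compare throughput per worker across runs with different worker counts. The sketch below is only an illustration: the function name `scaling_efficiency` is hypothetical, and the measurements are made-up placeholder numbers, not results from any real system.

```python
def scaling_efficiency(measurements):
    """Map each worker count to its per-worker throughput relative to the smallest run.

    measurements: iterable of (workers, throughput) pairs from separate test runs.
    A value near 1.0 suggests roughly linear scaling; a value well below 1.0
    hints at a shared bottleneck (database, lock contention, and so on).
    """
    runs = sorted(measurements)
    base_workers, base_throughput = runs[0]
    base_per_worker = base_throughput / base_workers
    return {w: (t / w) / base_per_worker for w, t in runs}


# Placeholder numbers: (worker count, jobs per minute) from three test runs.
runs = [(2, 100.0), (4, 190.0), (8, 240.0)]
efficiency = scaling_efficiency(runs)
```

With these placeholder inputs the efficiency is 1.0 for 2 workers, about 0.95 for 4, and about 0.6 for 8, which would suggest that something other than the worker count becomes the limit between 4 and 8 workers.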

This need for experimentation is one of the reasons for the container-based scaling we do these days: you can throw a few more workers at the problem relatively easily.

How you actually (technically) do that depends on the platform you are using. AWS, Kubernetes/OpenShift, etc. come with techniques or settings to do it automatically for you. I assume those work mainly with a "free workers" metric: they try to keep the "free:busy worker" ratio within some target corridor so that every new request is very likely to hit a free worker at any time, giving (theoretically) constant time per request.
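The "target corridor" idea can be sketched in a few lines of Python. This is not how any particular platform's autoscaler is implemented; the function name `scaling_decision` and the thresholds `low` and `high` are assumptions chosen for illustration. It also uses the share of free workers (free / total) rather than the raw free:busy ratio, simply to avoid dividing by zero when no worker is busy.

```python
def scaling_decision(free, busy, low=0.2, high=0.5):
    """Return +1 to add a worker, -1 to remove one, 0 to leave things as they are.

    low and high bound the acceptable share of free workers (assumed values).
    """
    total = free + busy
    if total == 0:
        return 1  # no workers at all, so start one
    free_share = free / total
    if free_share < low:
        return 1   # too few free workers: new requests are likely to wait
    if free_share > high:
        return -1  # many idle workers: capacity can be reduced
    return 0       # inside the corridor


decisions = [
    scaling_decision(free=1, busy=9),  # 10% free -> +1
    scaling_decision(free=3, busy=7),  # 30% free -> 0
    scaling_decision(free=8, busy=2),  # 80% free -> -1
]
```

Alerting when the decision stays at +1 for a sustained period is one simple way to monitor for "we are at or near max load".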

Context

StackExchange DevOps Q#4831, answer score: 1
