
RED method for microservices: Rate, Errors, Duration

Submitted by: @seed
Tags: RED method, rate errors duration, Tom Wilkie, microservice metrics, histogram_quantile, request oriented, USE method

Problem

Of the Four Golden Signals (latency, traffic, errors, saturation), saturation is harder to instrument than the others for stateless microservices that autoscale. Teams need a simpler model focused specifically on request-oriented services.

Solution

Apply the RED method (coined by Tom Wilkie at Weaveworks) for each microservice:

  • Rate — requests per second the service is receiving
  • Errors — the rate of those requests that are failing
  • Duration — the distribution of time those requests take



These three metrics answer the key question: is my service performing correctly from the perspective of its callers?
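
To make the three definitions concrete, here is a minimal Python sketch that derives them from a batch of request records collected over one time window. The `Request` record and the `red_summary` helper are hypothetical names used only for illustration; a real service would normally export counters and histograms rather than keep raw records.

```python
from dataclasses import dataclass
import math


@dataclass
class Request:
    """One observed request (hypothetical record shape, for illustration)."""
    status_code: int
    duration_s: float


def red_summary(requests, window_s, quantile=0.99):
    """Return (rate, error_ratio, duration_quantile) for one time window."""
    total = len(requests)
    if total == 0:
        return 0.0, 0.0, None
    # Rate: requests per second over the window.
    rate = total / window_s
    # Errors: fraction of requests that failed (here: any 5xx status).
    errors = sum(1 for r in requests if r.status_code >= 500)
    error_ratio = errors / total
    # Duration: nearest-rank percentile over the raw durations.
    ordered = sorted(r.duration_s for r in requests)
    rank = max(1, math.ceil(quantile * total))
    return rate, error_ratio, ordered[rank - 1]


# 100 requests in a 10-second window: 95 fast successes, 5 slow failures.
sample = [Request(200, 0.05)] * 95 + [Request(503, 2.0)] * 5
rate, error_ratio, p99 = red_summary(sample, window_s=10)
print(rate, error_ratio, p99)
```

In this sample the service handles 10 requests per second, 5% of them fail, and the 99th-percentile duration is 2 seconds even though most requests finish in 50 ms.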

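# Rate: requests per second, summed across all series
sum(rate(http_requests_total[5m]))

# Errors: fraction of requests returning 5xx
# (aggregate both sides first; otherwise each 5xx series is divided by itself)
sum(rate(http_requests_total{status_code=~"5.."}[5m]))
  / sum(rate(http_requests_total[5m]))

# Duration: p99 latency in seconds, aggregated across instances
histogram_quantile(0.99,
  sum by (le) (rate(http_request_duration_seconds_bucket[5m]))
)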
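
For intuition about what the p99 query returns, the following is a simplified Python sketch of the linear interpolation that `histogram_quantile` applies to classic cumulative buckets. The `estimate_quantile` function and the bucket values are illustrative assumptions, not Prometheus code, and edge cases such as native histograms are ignored.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Assumes `buckets` is sorted by upper bound, has at least one finite
    bucket, and ends with (math.inf, total_count).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls past the last finite bucket,
                # so the best available answer is that bucket's upper bound.
                return buckets[-2][0]
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Made-up latency buckets in seconds: 50 requests <= 0.1s, 90 <= 0.5s, ...
buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
p95 = estimate_quantile(0.95, buckets)
p99 = estimate_quantile(0.99, buckets)
print(p95, p99)
```

With these buckets the p99 estimate is capped at 1.0, the upper bound of the last finite bucket, which is one reason to choose bucket boundaries that bracket the latencies you care about.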


Pair RED per-service with USE (Utilization, Saturation, Errors) per-resource (CPU, memory, network) for full coverage.

Why

RED maps directly to user experience: users care if the service is slow (Duration), failing (Errors), or overwhelmed (Rate). These three metrics are sufficient to trigger most incident investigations.

Gotchas

  • Duration should be a histogram, not an average (averages hide tail latency) or a summary (summary quantiles are precomputed per instance and cannot be aggregated across replicas)
  • Measure Duration at the service boundary, not inside a single function — include network and serialization time
  • Rate alone is not an alert signal — it is context for understanding other signals
  • Errors should be both explicit (status codes) and implicit (timeouts, circuit breaker trips)
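
As a rough illustration of the last point, the sketch below counts both explicit and implicit failures when computing the error ratio. The `Outcome` record and its fields are hypothetical and not taken from any particular client library; adapt them to whatever your HTTP client or circuit breaker actually reports.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    """Result of one call (hypothetical shape, for illustration)."""
    status_code: Optional[int] = None  # None when no response arrived
    timed_out: bool = False
    circuit_open: bool = False         # rejected by an open circuit breaker


def is_error(outcome: Outcome) -> bool:
    """Treat timeouts, breaker rejections, missing responses, and 5xx as errors."""
    if outcome.timed_out or outcome.circuit_open:
        return True
    if outcome.status_code is None:
        return True
    return outcome.status_code >= 500


outcomes = (
    [Outcome(status_code=200)] * 90
    + [Outcome(status_code=500)] * 2
    + [Outcome(timed_out=True)] * 5
    + [Outcome(circuit_open=True)] * 3
)

# Counting only status codes misses the calls that never got a response.
explicit_only = sum(
    1 for o in outcomes if o.status_code is not None and o.status_code >= 500
) / len(outcomes)
explicit_and_implicit = sum(1 for o in outcomes if is_error(o)) / len(outcomes)
print(explicit_only, explicit_and_implicit)
```

Here the status-code-only view reports a 2% error ratio, while counting timeouts and breaker rejections gives 10%, which is closer to what callers experience.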

Context

Deciding which metrics to instrument for a new microservice
