Four Golden Signals: the minimum viable observability for any service
Problem
Teams instrument dozens of metrics but still miss user-impacting issues because the most important signals are not prominently monitored. Conversely, teams starting on a new service often don't know where to begin with observability.
Solution
Instrument and dashboard the four golden signals defined in the Google SRE book. These four signals cover the most important aspects of any service's health:
- Latency: time to serve a request. Track successful and failed requests separately, since a fast error can make overall latency look healthy. Use p50/p95/p99, not the average.
- Traffic: demand on the system, such as HTTP requests per second or queries per second.
- Errors: rate of requests that fail, whether explicitly (5xx), implicitly (a 200 response with the wrong content), or by policy (a response slower than the agreed limit).
- Saturation: how full the service is. CPU, memory, queue depth, connection pool utilization.
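As a rough illustration of the definitions above, the sketch below derives all four signals from one window of request records using plain JavaScript. The record shape (`status`, `durationMs`, `appError`), the helper names, and the `inUse`/`capacity` inputs are assumptions made for this example, not part of any library.

```javascript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical record shape: { status, durationMs, appError }.
// A request fails explicitly (5xx) or implicitly (application error behind a 200).
function isFailure(record) {
  return record.status >= 500 || record.appError === true;
}

// Summarize one observation window into the four signals.
function summarizeWindow(records, windowSeconds, inUse, capacity) {
  const failed = records.filter(isFailure);
  const ok = records.filter((r) => !isFailure(r));
  return {
    // Latency, split by outcome so fast failures do not mask slow successes
    latencyOkP99Ms: percentile(ok.map((r) => r.durationMs), 99),
    latencyFailedP99Ms: percentile(failed.map((r) => r.durationMs), 99),
    // Traffic: requests per second over the window
    trafficRps: records.length / windowSeconds,
    // Errors: failed requests as a fraction of all requests
    errorRate: records.length === 0 ? 0 : failed.length / records.length,
    // Saturation: fraction of a bounded resource in use (e.g. pool connections)
    saturation: inUse / capacity,
  };
}

const sample = [
  { status: 200, durationMs: 40, appError: false },
  { status: 200, durationMs: 60, appError: false },
  { status: 200, durationMs: 55, appError: true }, // implicit error
  { status: 503, durationMs: 5, appError: false }, // explicit, fast failure
];

// 2-second window, 8 of 10 pool connections in use
const summary = summarizeWindow(sample, 2, 8, 10);
console.log(summary);
```

In a real service these values would normally come from the metrics backend rather than in-process arrays; the point here is only how each signal maps to a simple calculation.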
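// Minimal metric set for the four signals (OpenTelemetry metrics API, Prometheus-style names; 'my-service' is a placeholder)
const { metrics } = require('@opentelemetry/api');
const meter = metrics.getMeter('my-service');
// Traffic and errors: pass { method, status } as attributes on each add() call
const requestsTotal = meter.createCounter('requests_total');
// Latency: histogram bucket boundaries in seconds
const requestDuration = meter.createHistogram('request_duration_seconds', { advice: { explicitBucketBoundaries: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5] } });
// Saturation: observable gauges report their values through addCallback()
const queueDepth = meter.createObservableGauge('queue_depth');
const cpuUtilization = meter.createObservableGauge('cpu_utilization_ratio');
Why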
These four signals are the minimal set needed to answer the question: is my service working? They apply to any online service, regardless of technology stack.
Gotchas
- Average latency hides long-tail problems: always use percentiles (p95, p99) in alerts and SLOs
- Saturation is often the leading indicator: it climbs before errors and latency spike
- Traffic metrics enable capacity planning: track both peak and baseline
- Errors should include both failing HTTP status codes and application-level errors returned with 200 OK
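To make the first gotcha concrete, here is a small sketch comparing the mean with nearest-rank percentiles on a synthetic latency sample. The numbers are invented for illustration, and the `mean` and `percentile` helpers are hypothetical names rather than library functions.

```javascript
// Arithmetic mean of a non-empty array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Synthetic sample: 98 requests served in 50 ms, 2 requests stuck for 5000 ms.
const latenciesMs = [...Array(98).fill(50), ...Array(2).fill(5000)];

const meanMs = mean(latenciesMs);
const p50Ms = percentile(latenciesMs, 50);
const p99Ms = percentile(latenciesMs, 99);

console.log({ meanMs, p50Ms, p99Ms });
```

With this sample the mean is 149 ms and p50 is 50 ms, so an alert on either would probably stay quiet, while p99 is 5000 ms and exposes the two requests that took five seconds.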
Context
Starting observability instrumentation for a new service or auditing existing coverage