Principle: Logs are not observability
Tags: observability, three pillars, logs, metrics, traces, SLO
Problem
Teams rely solely on logs for debugging production issues, missing the bigger picture that metrics and traces provide.
Solution
The three pillars of observability serve different purposes:
Logs: What happened (individual events)
- Use for: Detailed debugging, audit trails, error context
- Not for: Alerting (too noisy), performance analysis
- Example: 'User alice failed login: invalid password'
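As a rough illustration of the log example above, here is a minimal sketch that emits the event as one structured (JSON) line using Python's standard logging module. The logger name `auth`, the helper names, and the `user` and `reason` fields are assumptions made for this example, not part of any particular system.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so individual fields stay searchable."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Hypothetical context fields, attached by the caller via `extra=`.
            "user": getattr(record, "user", None),
            "reason": getattr(record, "reason", None),
        }
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("auth")  # assumed logger name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def log_failed_login(logger: logging.Logger, user: str, reason: str) -> None:
    """Record a single event: who failed to log in, and why."""
    logger.warning(
        "User %s failed login: %s", user, reason,
        extra={"user": user, "reason": reason},
    )


if __name__ == "__main__":
    log_failed_login(build_logger(sys.stderr), "alice", "invalid password")
```

Each call produces one record per event, which is what makes logs good for error context and poor for alerting: answering "how many?" requires scanning every line.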
Metrics: How much is happening (aggregated measurements)
- Use for: Alerting, dashboards, capacity planning, SLOs
- Not for: Debugging individual requests
- Example: login_failures_total{reason='invalid_password'} = 42
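The metric example above can be sketched with a toy in-process counter. This is not a real metrics client; `LabeledCounter` is a hypothetical class written only to show that a metric stores one number per label combination rather than one record per event. A production system would normally use an established metrics client library instead.

```python
from collections import Counter


class LabeledCounter:
    """Toy counter keyed by label values (illustrative, not a real client library)."""

    def __init__(self, name: str, label_names: tuple):
        self.name = name
        self.label_names = label_names
        self._values = Counter()

    def _key(self, labels: dict) -> tuple:
        # Order label values consistently so the same label set maps to the same key.
        return tuple(labels[n] for n in self.label_names)

    def inc(self, **labels) -> None:
        """Add one to the series identified by the given label values."""
        self._values[self._key(labels)] += 1

    def value(self, **labels) -> int:
        """Return the current count for one label set (0 if never incremented)."""
        return self._values[self._key(labels)]

    def render(self) -> list:
        """Return one 'name{label="value"} count' line per series, sorted by label values."""
        lines = []
        for key, count in sorted(self._values.items()):
            labels = ",".join(f'{n}="{v}"' for n, v in zip(self.label_names, key))
            lines.append(f"{self.name}{{{labels}}} {count}")
        return lines


if __name__ == "__main__":
    failures = LabeledCounter("login_failures_total", ("reason",))
    for _ in range(42):
        failures.inc(reason="invalid_password")
    print("\n".join(failures.render()))
```

Forty-two failed logins collapse into a single number, which is cheap to store and easy to alert on, but the identity of each user is gone. That is why metrics are the wrong tool for debugging an individual request.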
Traces: How requests flow (distributed context)
- Use for: Latency analysis, dependency mapping, bottleneck identification
- Not for: Long-term trending, alerting
- Example: Request took 2.3s: auth(50ms) -> api(100ms) -> db(2.1s)
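To make the trace example above concrete, the sketch below models the three steps as spans and picks out the bottleneck. The `Span` type and helper functions are hypothetical, and the spans are assumed to run sequentially without overlap; real tracing systems also carry trace IDs and parent/child relationships, which are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One timed step of a request (simplified: no IDs, no parent links)."""
    name: str
    duration_ms: float


def slowest_span(spans: list) -> Span:
    """Return the span with the longest duration, i.e. the likely bottleneck."""
    return max(spans, key=lambda s: s.duration_ms)


def share_of_total(span: Span, spans: list) -> float:
    """Fraction of the summed span time spent in the given span."""
    total = sum(s.duration_ms for s in spans)
    return span.duration_ms / total


# Durations taken from the example above: auth(50ms) -> api(100ms) -> db(2.1s).
trace = [Span("auth", 50), Span("api", 100), Span("db", 2100)]

if __name__ == "__main__":
    worst = slowest_span(trace)
    print(f"{worst.name}: {share_of_total(worst, trace):.0%} of request time")
```

The per-step breakdown points straight at the database call, something neither a log line nor an aggregate latency metric shows on its own.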
Decision framework:
- 'Is the system healthy?' -> Metrics (dashboards, alerts)
- 'Why is this request slow?' -> Traces (distributed tracing)
- 'What exactly happened?' -> Logs (detailed context)
- 'Is our SLO at risk?' -> Metrics (error rate, latency percentiles)
Don't try to solve every problem with logs. Use the right tool for the question you're asking.
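The 'Is our SLO at risk?' question can be answered from two counters alone. The sketch below shows one common way to express it, as the fraction of the error budget already consumed. The function name, the 99.9% target, and the request counts are all assumptions chosen for illustration.

```python
def error_budget_consumed(total_requests: int, failed_requests: int, slo_target: float) -> float:
    """Return the fraction of the error budget used so far.

    A result above 1.0 means more requests have failed than the SLO allows.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # The budget is the number of failures the SLO tolerates over this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    return failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 420 failures, 99.9% availability target.
    used = error_budget_consumed(1_000_000, 420, 0.999)
    print(f"Error budget consumed: {used:.0%}")
```

Both inputs are plain counters, so the check runs in constant time regardless of traffic volume. Deriving the same answer from logs would mean counting every matching line in the window.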
Why
Logs at scale are expensive to store, slow to search, and hard to aggregate. Metrics are already aggregated and traces already carry per-request context, so together they answer most operational questions faster and at lower cost.
Context
Building an observability strategy for production systems.