Distributed tracing with OpenTelemetry
opentelemetry · distributed tracing · spans · trace context · observability
Problem
Need to trace requests across multiple microservices to diagnose latency and errors in distributed systems.
Solution
OpenTelemetry tracing setup:
# Python setup
from flask import Flask
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.requests import RequestsInstrumentor
from opentelemetry.instrumentation.flask import FlaskInstrumentor

app = Flask(__name__)

# Initialize the tracer provider with an OTLP exporter pointing at the collector
provider = TracerProvider()
processor = BatchSpanProcessor(OTLPSpanExporter(
    endpoint='http://collector:4317'
))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

# Auto-instrument libraries
RequestsInstrumentor().instrument()
FlaskInstrumentor().instrument_app(app)

# Manual spans
tracer = trace.get_tracer(__name__)

def process_order(order_id):
    with tracer.start_as_current_span('process_order') as span:
        span.set_attribute('order.id', order_id)
        with tracer.start_as_current_span('validate'):
            validate(order_id)
        with tracer.start_as_current_span('charge'):
            charge(order_id)
        span.add_event('order_completed')

What to trace:
- HTTP requests (incoming and outgoing)
- Database queries
- Message queue publish/consume
- Cache operations
- External API calls
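For the operations above, OpenTelemetry defines semantic-convention attribute names so backends can group and query spans consistently. A sketch of what a database span and an outgoing HTTP span might carry (the values and the query are illustrative, not from the original):

```python
# Illustrative span attributes following OpenTelemetry semantic conventions.
# In real code these are set via span.set_attribute(); plain dicts here
# just show the naming scheme.
db_span_attributes = {
    'db.system': 'postgresql',                         # database engine
    'db.statement': 'SELECT * FROM orders WHERE id = %s',  # sanitized query
    'db.name': 'shop',                                 # logical database name
}

http_span_attributes = {
    'http.method': 'GET',                              # outgoing request method
    'http.url': 'https://api.example.com/inventory',   # hypothetical endpoint
    'http.status_code': 200,                           # response status
}
```

Auto-instrumentation libraries (like the requests and Flask instrumentors above) set these attributes for you; manual spans should follow the same names.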
Key concepts:
- Trace: End-to-end request journey
- Span: Single operation within a trace
- Context propagation: Pass trace_id across services via headers
- Baggage: Key-value pairs propagated with context
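Context propagation in practice: OpenTelemetry's default propagator sends the W3C Trace Context `traceparent` header between services. The real work is done by `opentelemetry.propagate.inject`/`extract`; the minimal parser below is only a sketch of the wire format:

```python
# Parse a W3C Trace Context `traceparent` header:
#   version "-" trace-id "-" parent-span-id "-" trace-flags
def parse_traceparent(header: str) -> dict:
    version, trace_id, span_id, flags = header.split('-')
    return {
        'version': version,          # '00' for the current spec version
        'trace_id': trace_id,        # 32 hex chars, shared by all spans in the trace
        'parent_span_id': span_id,   # 16 hex chars, the caller's span
        'sampled': flags == '01',    # sampling decision travels with the context
    }

# Example header (hypothetical IDs)
ctx = parse_traceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01')
```

Because every service parses out the same trace_id and starts its spans under the caller's span id, the backend can stitch the spans from all services into one trace.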
Why
In microservices, a single user request can touch 10+ services. Without distributed tracing, diagnosing where time is spent or where errors occur is nearly impossible.
Context
Microservice architectures needing request-level observability