How to Set Up and Run Jaeger with Docker and Docker Compose

Run Jaeger locally with Docker or Docker Compose: all-in-one image, OTLP ports, OpenTelemetry Python example, and Jaeger UI at port 16686.

Jaeger is an open-source distributed tracing system that helps you monitor and troubleshoot requests as they flow through a microservices architecture. In simple terms, Jaeger shows you where time is spent in a request and which services were involved.

Distributed tracing is crucial in microservices environments because it allows you to:

  • Identify performance bottlenecks
  • Debug and troubleshoot issues across services
  • Understand the flow of requests through your system
  • Optimize your application’s overall performance

Jaeger’s compatibility with OpenTelemetry — a collection of tools, APIs, and SDKs for instrumenting, generating, collecting, and exporting telemetry data — makes it an even more powerful choice for developers. This compatibility ensures that you can easily integrate Jaeger with a wide range of applications and services.

While Jaeger does add some overhead to your application, it’s designed to be lightweight. The impact can be minimized through proper sampling strategies and configuration. In most cases, the performance impact is negligible compared to the benefits of distributed tracing.

With Jaeger you can:

  • Monitor and troubleshoot distributed workflows
  • Identify performance bottlenecks
  • Track down root causes
  • Analyze service dependencies

Jaeger vs Zipkin

Jaeger and Zipkin are both open-source distributed tracing systems. Jaeger offers adaptive sampling, a scalable architecture, and strong OpenTelemetry support, which makes it a common choice for new projects.

This guide walks you through running Jaeger with Docker or Docker Compose, instrumenting an app with OpenTelemetry, and using the Jaeger UI.

Prerequisites

Setting up Jaeger (all-in-one)

Jaeger provides an all-in-one container image (jaegertracing/all-in-one), which bundles the collector, the query service with its UI, and in-memory storage in a single process. It is well suited to local development and demos, but not to production.

Confirm Docker is installed and running:

docker --version

Pull the Jaeger all-in-one image:

docker pull jaegertracing/all-in-one:1.57

Run the container in the foreground (--rm removes it when it stops):

docker run --rm --name jaeger \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 14250:14250 \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.57

To run in the background, add -d before the image name.

Run with Docker Compose

Create docker-compose.yml:

version: "3.9"
services:
  jaeger:
    image: jaegertracing/all-in-one:1.57
    container_name: jaeger
    ports:
      - "16686:16686" # Jaeger UI
      - "14268:14268" # Collector HTTP (Jaeger Thrift)
      - "14250:14250" # Collector gRPC (Jaeger)
      - "4317:4317" # OpenTelemetry gRPC
      - "4318:4318" # OpenTelemetry HTTP
      - "6831:6831/udp" # Jaeger agent (compact thrift)
      - "6832:6832/udp" # Jaeger agent (binary thrift)
      - "9411:9411" # Zipkin compatibility
    restart: unless-stopped

Start Jaeger with Docker Compose v2 (docker compose) or v1 (docker-compose):

docker compose up -d

This runs the all-in-one configuration: collector, query service, and in-memory storage in one process. Data is not persisted across restarts; use a proper storage backend for production.

  • Open the Jaeger UI at http://localhost:16686 (or http://<server-ip>:16686 if remote). Allow port 16686 in your firewall if you need external access.
  • Send test spans using the OpenTelemetry example below or your own instrumented app.

Jaeger User Interface

Open http://localhost:16686 in your browser. The UI includes:

  • Search: A search form for finding traces by service, operation, tags, duration, and time range.
  • Trace view: A timeline of the spans in a single trace, showing how the spans nest and how long each operation took.
  • Span details: The tags, logs, and process information recorded for an individual span.
  • System Architecture: A visual representation of the services and the calls between them.

Tip: After you send a few spans, select your service in the dropdown and click Find Traces.

Ports exposed by the Jaeger all-in-one container

The all-in-one container exposes several ports. The most commonly used ones are:

  • 16686/tcp: Jaeger UI (Query service)
  • 4317/tcp: OTLP gRPC receiver (OpenTelemetry)
  • 4318/tcp: OTLP HTTP receiver (OpenTelemetry)
  • 14268/tcp: Jaeger collector HTTP (Thrift)
  • 14250/tcp: Jaeger collector gRPC
  • 6831/udp and 6832/udp: Jaeger agent receivers (legacy Jaeger clients)
  • 9411/tcp: Zipkin compatibility endpoint

If you’re using OpenTelemetry, you’ll typically send spans to 4317 or 4318 and view traces in the UI on 16686.
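
As a quick sanity check of the port mappings, the sketch below probes the TCP ports from the list above. It assumes Jaeger is running on localhost, and the helper names (is_port_open, check_ports) are made up for this example. The UDP ports cannot be checked this way, so they are left out.

```python
import socket

# TCP ports taken from the list above; UDP ports (6831, 6832) are skipped
# because a plain connect() cannot tell whether a UDP listener exists.
JAEGER_TCP_PORTS = {
    16686: "Jaeger UI",
    4317: "OTLP gRPC",
    4318: "OTLP HTTP",
}


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def check_ports(host: str, ports: dict) -> dict:
    """Map each port number to True (open) or False (closed)."""
    return {port: is_port_open(host, port) for port in ports}


if __name__ == "__main__":
    for port, is_open in check_ports("localhost", JAEGER_TCP_PORTS).items():
        status = "open" if is_open else "closed"
        print(f"{port:>5}  {JAEGER_TCP_PORTS[port]:<10} {status}")
```

An open port only shows that something is listening there. It does not prove the listener is Jaeger, so treat this as a first check rather than a full health check.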

Jaeger Architecture

Jaeger’s architecture consists of several key components:

  • Client Libraries: These libraries instrument your application code and create spans. Jaeger's own client libraries have been retired in favor of the OpenTelemetry SDKs, which can send spans directly to the Collector over OTLP.
  • Agent: A network daemon that listens for spans sent over UDP by legacy Jaeger clients, batches them, and forwards them to the Collector. It is deprecated and is not needed when you export over OTLP.
  • Collector: Receives traces from the Agent or directly from SDKs, runs them through a processing pipeline, and stores them in a storage backend.
  • Query: A service that retrieves traces from storage and hosts a UI to display them.
  • UI: A web interface for searching and analyzing traces.

In the legacy flow, data moves from your instrumented application through the client libraries to the Agent, then to the Collector, and finally to storage. With an OpenTelemetry SDK, the application exports spans straight to the Collector. In both cases the Query service retrieves the data from storage to display it in the UI.

Jaeger supports multiple storage options, including Cassandra, Elasticsearch, and in-memory storage (for development). The choice of storage depends on your scalability needs and existing infrastructure.

Sampling plays a crucial role in Jaeger’s architecture. It allows you to control the amount of tracing data you collect, which is essential for managing performance and storage costs in high-traffic systems.
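
To make the idea concrete, here is a minimal sketch of probabilistic sampling keyed on the trace ID. It illustrates the general technique and is not Jaeger's actual sampler code. The function name should_sample and the use of the low 64 bits of the trace ID are assumptions made for this example.

```python
import random

# Assumption for this sketch: only the low 64 bits of the trace ID are compared.
TRACE_ID_LIMIT = (1 << 64) - 1


def should_sample(trace_id: int, rate: float) -> bool:
    """Keep roughly `rate` of all traces, deterministically per trace ID.

    Every service that sees the same trace ID reaches the same decision,
    so a trace is either kept in full or dropped in full.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    bound = round(rate * (TRACE_ID_LIMIT + 1))
    return (trace_id & TRACE_ID_LIMIT) < bound


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs give the same result
    trace_ids = [rng.getrandbits(128) for _ in range(10_000)]
    kept = sum(should_sample(trace_id, 0.1) for trace_id in trace_ids)
    print(f"kept {kept} of {len(trace_ids)} traces at a 10% rate")
```

Deriving the decision from the trace ID instead of a fresh random number is what keeps traces complete across services. The trade-off is that rare but interesting traces can still be dropped.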

To instrument your application, prefer using the OpenTelemetry SDK and export spans to Jaeger via OTLP.

Example (Python) exporting spans to Jaeger over OTLP gRPC (localhost:4317):

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

resource = Resource.create({"service.name": "my-service"})
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("TestSpan") as span:
    span.set_attribute("hello", "world")
    # Your code here

Install the dependencies before running the example:

pip install opentelemetry-sdk opentelemetry-exporter-otlp-proto-grpc

After the script runs, my-service should appear in the Service dropdown of the Jaeger UI.

Advanced Instrumentation Techniques

Advanced techniques you can use with Jaeger:

  • Custom Samplers: Create samplers that make intelligent decisions about which traces to sample based on your specific needs.
  • Baggage: Use baggage to pass data along the entire trace, which can be useful for correlating information across services.
  • Multiple Spans: Create and manage multiple spans within a single trace to represent different operations or sub-operations.
  • Logging Integration: Integrate Jaeger with your logging system to enhance debugging capabilities.
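
To show what baggage looks like on the wire, the sketch below encodes and decodes a simplified form of the W3C baggage header (comma-separated key=value pairs with percent-encoded values). The helper names are hypothetical, and the sketch ignores size limits and optional properties. In a real application the OpenTelemetry SDK's propagators handle this for you.

```python
from urllib.parse import quote, unquote


def encode_baggage(items: dict) -> str:
    """Build a baggage header value from key/value pairs."""
    return ",".join(
        f"{key}={quote(value, safe='')}" for key, value in items.items()
    )


def decode_baggage(header: str) -> dict:
    """Parse a baggage header value back into a dict.

    Optional properties after ';' are dropped in this simplified sketch.
    """
    items = {}
    for member in header.split(","):
        key_value = member.split(";", 1)[0].strip()
        if "=" not in key_value:
            continue  # skip empty or malformed members
        key, value = key_value.split("=", 1)
        items[key.strip()] = unquote(value.strip())
    return items


if __name__ == "__main__":
    header = encode_baggage({"user.id": "42", "plan": "pro tier"})
    print("baggage:", header)
    print("decoded:", decode_baggage(header))
```

Every downstream call carries the baggage, so keep entries few and small, and avoid putting sensitive values in them.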

Jaeger in a Production Environment

When deploying Jaeger in production, consider the following:

  • Scalability: Each component (Agent, Collector, Query) can be scaled independently. Use load balancers to distribute traffic.
  • Storage: Implement a production-ready storage backend like Elasticsearch or Cassandra. Ensure proper sizing and configuration for your expected data volume.
  • Security: Set up secure communication between Jaeger components using TLS. Implement authentication and authorization for the Jaeger UI.
  • Sampling: Implement appropriate sampling strategies based on your traffic patterns and tracing needs. Dynamic sampling can help balance data collection and system performance.

Analyzing Traces with Jaeger UI

The Jaeger UI provides powerful tools for analyzing traces:

  • Use the search functionality to find relevant traces based on service, operation, tags, or duration.
  • Examine the trace timeline to understand the relationships between spans and identify long-running operations.
  • Inspect span details, including tags and logs, to gather context about each operation.
  • Use the comparison view to analyze multiple traces side by side and identify patterns or anomalies.
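
The same kind of analysis can be done offline on a trace exported as JSON. The sketch below ranks the spans of one trace by duration. It assumes the trace object has the shape Jaeger's JSON export uses (a spans list with operationName, duration in microseconds, and processID, plus a processes map with serviceName). The function name slowest_spans and the sample data are made up for illustration.

```python
def slowest_spans(trace: dict, top_n: int = 3) -> list:
    """Return (service, operation, duration_ms) for the longest spans."""
    processes = trace.get("processes", {})
    rows = []
    for span in trace.get("spans", []):
        process = processes.get(span.get("processID"), {})
        service = process.get("serviceName", "unknown")
        # Durations are assumed to be in microseconds; convert to ms.
        rows.append((service, span["operationName"], span["duration"] / 1000.0))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top_n]


if __name__ == "__main__":
    # Hand-written sample trace, not real Jaeger output.
    sample_trace = {
        "processes": {
            "p1": {"serviceName": "frontend"},
            "p2": {"serviceName": "payments"},
        },
        "spans": [
            {"operationName": "GET /checkout", "duration": 120000, "processID": "p1"},
            {"operationName": "charge", "duration": 80000, "processID": "p2"},
            {"operationName": "render", "duration": 5000, "processID": "p1"},
        ],
    }
    for service, operation, duration_ms in slowest_spans(sample_trace):
        print(f"{duration_ms:8.1f} ms  {service}  {operation}")
```

A parent span's duration includes the time spent in its children, so the longest span is usually the root. Look at the longest leaf spans to find where the time actually goes.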

Best Practices for Effective Jaeger Implementation

To get the most out of Jaeger:

  • Follow consistent naming conventions for services and operations to make searching and filtering easier.
  • Implement appropriate sampling strategies to balance data collection and system performance.
  • Ensure proper error handling and logging within instrumented code to provide context for issues.
  • Regularly review and optimize your tracing implementation to ensure it continues to meet your needs as your system evolves.

Service Performance Monitoring (SPM) with Jaeger

Service Performance Monitoring (SPM) is a feature in Jaeger that provides service-level and operation-level aggregation of key metrics like request rates, error rates, and durations (latencies). It helps identify interesting traces (e.g. high QPS, slow or erroneous requests) without needing to know the service or operation names up-front.

How does SPM work?

SPM works by aggregating span data from Jaeger to produce RED (Request, Error, Duration) metrics. The OpenTelemetry Collector's Span Metrics Processor (superseded by the Span Metrics Connector in newer Collector releases) generates these metrics from the incoming traces.

Key features of SPM

  • Service-level and operation-level aggregation of request rates, error rates, and latencies (P95, P75, P50)
  • “Impact” metric computed as the product of latency and request rate to highlight high-impact operations
  • Pre-populated Trace search with relevant service, operation and lookback period for interesting traces
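
The aggregation behind these numbers can be sketched in a few lines. The example below computes request rate, error rate, latency percentiles, and an impact score from a list of span records. The record fields (duration_ms, error) are hypothetical, and using P95 as the latency in the impact product is an assumption of this sketch, not something the source specifies.

```python
from statistics import quantiles


def red_metrics(spans: list, window_seconds: float) -> dict:
    """Aggregate span records into RED-style metrics for one operation."""
    if len(spans) < 2:
        raise ValueError("need at least two spans to compute percentiles")
    durations = [span["duration_ms"] for span in spans]
    error_count = sum(1 for span in spans if span["error"])
    # quantiles(n=100) returns 99 cut points; index i-1 is the i-th percentile.
    cuts = quantiles(durations, n=100, method="inclusive")
    request_rate = len(spans) / window_seconds
    p95 = cuts[94]
    return {
        "request_rate": request_rate,             # requests per second
        "error_rate": error_count / len(spans),   # fraction of failed requests
        "p50_ms": cuts[49],
        "p75_ms": cuts[74],
        "p95_ms": p95,
        "impact": p95 * request_rate,             # assumption: latency = P95
    }


if __name__ == "__main__":
    # Synthetic data: 20 spans over a 10-second window, two of them errors.
    spans = [{"duration_ms": 10.0 * i, "error": i % 10 == 0} for i in range(1, 21)]
    for name, value in red_metrics(spans, window_seconds=10).items():
        print(f"{name}: {value:.2f}")
```

In a real SPM setup this aggregation happens in the OpenTelemetry Collector, and the results live in a metrics backend. The sketch only shows how the numbers relate to each other.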

Accessing SPM in Jaeger

The SPM feature can be accessed from the “Monitor” tab in the Jaeger UI. It requires Jaeger 1.57 or later.

Limitations

SPM is still an experimental feature and may have further changes in the future.

It typically requires an OpenTelemetry Collector configured with the span metrics processor, and a metrics backend to store/query the generated metrics.

Verifying the setup

  • Container: docker ps (or docker compose ps) should show the Jaeger container with ports 16686, 4317, 4318, etc. mapped.
  • UI: Opening http://localhost:16686 should load the Jaeger UI; the Service dropdown may be empty until you send traces.
  • Traces: After running the Python example (or any OTLP sender), select your service name (e.g. my-service) and click Find Traces to see spans.
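
The UI and trace checks can also be scripted. The sketch below asks the query service for the list of known services over HTTP. It assumes the /api/services endpoint that the Jaeger UI itself calls. That endpoint is an internal API and may change between versions, so treat the path and the response shape (a JSON object with a data list) as assumptions.

```python
import json
import urllib.request


def parse_services(payload: str) -> list:
    """Extract the sorted service names from an /api/services response body."""
    body = json.loads(payload)
    # "data" may be null when no traces have been received yet.
    return sorted(body.get("data") or [])


def fetch_services(base_url: str = "http://localhost:16686", timeout: float = 2.0) -> list:
    """Query the Jaeger query service for the services it knows about."""
    with urllib.request.urlopen(f"{base_url}/api/services", timeout=timeout) as resp:
        return parse_services(resp.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        services = fetch_services()
    except (OSError, ValueError) as exc:
        # OSError covers connection failures; ValueError covers bad JSON.
        print(f"Jaeger UI not reachable: {exc}")
    else:
        print("services:", ", ".join(services) or "(none yet)")
```

If my-service shows up in the output after you run the Python example, spans are reaching Jaeger end to end.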

Summary

You ran Jaeger using the all-in-one Docker image (standalone docker run or Docker Compose), with the UI on port 16686 and OTLP receivers on 4317 (gRPC) and 4318 (HTTP). You instrumented a small Python app with the OpenTelemetry SDK and exported spans to Jaeger. For production, deploy the collector, query, and storage as separate components and use a persistent backend (e.g. Elasticsearch, Cassandra).

For more monitoring stack options, see

Citizix Ltd