
Tracing in the Apollo Router

Collect tracing information


The Apollo Router supports OpenTelemetry, with exporters for:

  • Jaeger
  • Zipkin
  • Datadog
  • OpenTelemetry Collector (via OTLP)

The Apollo Router generates spans that include the various phases of serving a request and associated dependencies. This is useful for showing how response time is affected by:

  • Sub-request response times
  • Query shape (sub-request dependencies)
  • Apollo Router post-processing

Span data is sent to a collector such as Jaeger, which can assemble spans into a Gantt chart for analysis.

To get the most out of distributed tracing, all components in your system should be instrumented.

Common configuration

Trace config

In your router's YAML config file, the trace_config section contains common configuration that's used by all exporters. This section is optional; if service_name is not set, it falls back to the values of the environment variables specified by the OpenTelemetry spec.

router.yaml
telemetry:
  tracing:
    trace_config:
      service_name: "router"
      service_namespace: "apollo"
      # Optional. Either a float between 0 and 1 or 'always_on' or 'always_off'
      sampler: 0.1
      # Optional. Use a parent-based sampler, which lets remote spans participate
      # in the decision of whether a span is sampled.
      # https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#parentbased
      parent_based_sampler: false
      # Optional limits
      max_attributes_per_event: 10
      max_attributes_per_link: 10
      max_attributes_per_span: 10
      max_events_per_span: 10
      max_links_per_span: 10
      # Attributes particular to an exporter that have not
      # been explicitly handled in Router configuration.
      attributes:
        some.config.attribute: "config value"

If service_name is set, then environment variables are not used. However, you can embed environment variables into your router config using Unix ${key:default} syntax.

If no environment variable is set and service_name is not present, then router is used as the default service name.
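
For example, a minimal sketch using the ${key:default} expansion syntax described above, with a hypothetical MY_SERVICE_NAME environment variable that falls back to "router" when unset:

router.yaml
telemetry:
  tracing:
    trace_config:
      # Expands to MY_SERVICE_NAME if set, otherwise "router".
      # MY_SERVICE_NAME is a hypothetical variable name for illustration.
      service_name: "${env.MY_SERVICE_NAME:router}"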

Propagation

The propagation section allows you to configure which propagators are active in addition to those automatically activated by using an exporter.

router.yaml
telemetry:
  tracing:
    propagation:
      # https://www.w3.org/TR/baggage/
      baggage: false
      # https://www.datadoghq.com/
      datadog: false
      # https://www.jaegertracing.io/ (compliant with opentracing)
      jaeger: false
      # https://www.w3.org/TR/trace-context/
      trace_context: false
      # https://zipkin.io/ (compliant with opentracing)
      zipkin: false
      # If you have your own way to generate a trace ID and want to pass it via a custom request header
      request:
        header_name: my-trace-id

Specifying explicit propagation is generally only required if you're using an exporter that supports multiple trace ID formats (e.g., OpenTelemetry Collector, Jaeger, or OpenTracing compatible exporters).
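
As an illustration, a minimal sketch that exports via OTLP while also accepting Jaeger-format trace headers from callers (one possible combination, not the only valid one):

router.yaml
telemetry:
  tracing:
    propagation:
      # Accept incoming Jaeger-format trace headers
      jaeger: true
    otlp:
      endpoint: default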

Trace ID

This is part of an experimental feature: until it's stabilized (and loses the experimental_ prefix), we might change the configuration shape or add/remove features. If you want to give feedback or participate in this feature's development, feel free to join the discussion on GitHub.

If you want to expose the trace ID in response headers (either the generated one or the one you provided via propagation headers), you can use this configuration:

router.yaml
telemetry:
  tracing:
    experimental_response_trace_id:
      enabled: true # default: false
      header_name: "my-trace-id" # default: "apollo-trace-id"

With this configuration, every response includes a header called my-trace-id containing the trace ID. This can help you debug a specific query: grep your logs for that trace ID to gather more context.

Batch Processor

All trace exporters (apollo|datadog|zipkin|jaeger|otlp) have batch span processor configuration. You will need to tune this if you see the following in your logs:

OpenTelemetry trace error occurred: cannot send span to the batch span processor because the channel is full

  • scheduled_delay The delay from receiving the first span to the batch being sent.
  • max_concurrent_exports The maximum number of overlapping export requests. For instance, if ingest is taking a long time to respond, there may be several overlapping export requests.
  • max_export_batch_size The number of spans to include in a batch. Your ingest may have maximum message size limits.
  • max_export_timeout The timeout for sending spans before dropping the data.
  • max_queue_size The maximum number of spans to be buffered before dropping span data.

router.yaml
telemetry:
  # Apollo tracing
  apollo:
    batch_processor:
      scheduled_delay: 100ms
      max_concurrent_exports: 1000
      max_export_batch_size: 10000
      max_export_timeout: 100s
      max_queue_size: 10000
  tracing:
    # Datadog
    datadog:
      batch_processor:
        scheduled_delay: 100ms
        max_concurrent_exports: 1000
        max_export_batch_size: 10000
        max_export_timeout: 100s
        max_queue_size: 10000
      endpoint: default
    # Jaeger
    # Otlp
    # Zipkin

You will need to experiment to find the settings that are appropriate for your use case.

Using Datadog

The Apollo Router can be configured to connect to either the default agent address or a URL.

router.yaml
telemetry:
  tracing:
    datadog:
      # Either 'default' or a URL
      endpoint: default
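
For example, a sketch pointing at a Datadog Agent on a non-default address. The host and port here are assumptions for illustration (8126 is the Agent's conventional APM port):

router.yaml
telemetry:
  tracing:
    datadog:
      # Hypothetical agent address; replace with your own
      endpoint: "http://my-datadog-agent:8126"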

Using Jaeger

The Apollo Router can be configured to export tracing data to Jaeger either via an agent or via an HTTP collector.

Agent config

router.yaml
telemetry:
  tracing:
    jaeger:
      agent:
        # Either 'default' or a URL
        endpoint: docker_jaeger:14268

Collector config

If you're using Kubernetes, you can inject your secrets into configuration via environment variables:

router.yaml
telemetry:
  tracing:
    jaeger:
      collector:
        endpoint: "http://my-jaeger-collector"
        username: "${env.JAEGER_USERNAME}"
        password: "${env.JAEGER_PASSWORD}"
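
A minimal sketch of the Kubernetes side of this, injecting those variables from a Secret (the Secret name jaeger-credentials and the container layout are assumptions):

# Fragment of a Kubernetes Deployment pod spec; names are illustrative
containers:
  - name: router
    env:
      - name: JAEGER_USERNAME
        valueFrom:
          secretKeyRef:
            name: jaeger-credentials # hypothetical Secret name
            key: username
      - name: JAEGER_PASSWORD
        valueFrom:
          secretKeyRef:
            name: jaeger-credentials
            key: password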

OpenTelemetry Collector via OTLP

OpenTelemetry Collector is a horizontally scalable collector that you can use to receive, process, and export your telemetry data in a pluggable way.

If you find that the built-in telemetry features of the Apollo Router are missing some desired functionality (e.g., exporting to Kafka), then it's worth considering this option.

router.yaml
telemetry:
  tracing:
    otlp:
      # Either 'default' or a URL
      endpoint: default
      # Optional protocol (defaults to grpc)
      protocol: grpc
      # Optional gRPC configuration
      grpc:
        domain_name: "my.domain"
        key: ""
        ca: ""
        cert: ""
        metadata:
          foo: bar
      # Optional HTTP configuration
      http:
        headers:
          foo: bar

Remember that the file. and env. prefixes can be used for expansion in the config YAML, e.g. ${file.ca.txt}.
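
For example, a sketch (assuming a PEM-encoded CA certificate saved in a local file named ca.txt) that inlines the certificate via file expansion:

router.yaml
telemetry:
  tracing:
    otlp:
      endpoint: default
      grpc:
        # Expands to the contents of the local file ca.txt
        ca: "${file.ca.txt}"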

Using Zipkin

The Apollo Router can be configured to export tracing data to either the default collector address or a URL:

router.yaml
telemetry:
  tracing:
    zipkin:
      # Either 'default' or a URL
      endpoint: http://my_zipkin_collector.dev