Router Instrumentation for Datadog

Configure Apollo Router telemetry to optimize Datadog APM views


This guide explains how to configure Apollo Router telemetry instrumentation for optimal integration with Datadog APM.

Quick start

For a full working configuration suitable for graphs with moderate traffic volumes, jump to the complete configuration example. For high-traffic graphs, first review the high cardinality warning to avoid an explosion in the number of metrics.

Understanding Datadog's attribute mapping

Datadog uses specific attributes to organize its APM views. Although resource.name and operation.name are not standard OpenTelemetry attributes, Datadog synthesizes them from other span attributes when they are not present. Setting them explicitly gives you control over how your traces appear in Datadog APM.

Basic instrumentation

Start with this minimal configuration to add Datadog-specific attributes to your router spans:

YAML
router.yaml
telemetry:
  instrumentation:
    spans:
      default_attribute_requirement_level: recommended

      router:
        attributes:
          otel.name: router
          operation.name: "router"
          resource.name:
            request_method: true

      supergraph:
        attributes:
          otel.name: supergraph
          operation.name: "supergraph"
          resource.name:
            operation_name: string

      subgraph:
        attributes:
          otel.name: subgraph
          operation.name: "subgraph"
          resource.name:
            subgraph_operation_name: string
caution
High Cardinality: The operation_name and subgraph_operation_name attributes are unbounded, and therefore high-cardinality, if your clients send many unique GraphQL operation names. This affects both APM views and trace metrics, because Datadog creates metrics for each unique resource.name value. If you run into cardinality issues in Datadog, consider a more bounded value such as the operation type (query, mutation, or subscription), or remove the resource.name attribute and let Datadog calculate the resource name on its own.

With these attributes configured, you can filter for operations in Datadog APM:

[Figure: Datadog APM showing operations set with the example attributes from router.yaml]

Error tracking

Configure the error.message attribute to surface GraphQL errors and properly reflect them in Datadog APM.

note
The error tracking configuration depends on the structure of your GraphQL error responses. The example below assumes errors are returned as an array with a message field. Adjust the JSONPath expression ($[0].message) to match your specific error response format.
YAML
router.yaml
telemetry:
  instrumentation:
    spans:
      supergraph:
        attributes:
          # Mark span as error when GraphQL errors occur
          otel.status_code:
            static: ERROR
            condition:
              eq:
                - true
                - on_graphql_error: true
          # Capture the error message from the first error in the array
          # Adjust the JSONPath to match your error response structure
          error.message:
            response_errors: $[0].message

      subgraph:
        attributes:
          otel.status_code:
            static: ERROR
            condition:
              eq:
                - true
                - subgraph_on_graphql_error: true
          error.message:
            subgraph_response_errors: $[0].message
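
As an illustration of adjusting the JSONPath expression, the sketch below captures an error code alongside the message. It assumes your errors carry a code under extensions.code, which is a common convention but not guaranteed for every server. The attribute name graphql.error.code is a hypothetical label chosen for this example, not a name Datadog requires.

```yaml
telemetry:
  instrumentation:
    spans:
      supergraph:
        attributes:
          # Message of the first error, as in the configuration above
          error.message:
            response_errors: $[0].message
          # Hypothetical attribute name; assumes the first error
          # has an extensions.code field in your response format
          graphql.error.code:
            response_errors: $[0].extensions.code
```

If your errors do not include extensions.code, the selector most likely yields no value, so check a sample error response before relying on this attribute.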

Metrics instrumentation

Add GraphQL error tracking to your metrics to gain more insight into your GraphQL errors and to correlate them with your supergraph and subgraph spans.

YAML
router.yaml
telemetry:
  instrumentation:
    instruments:
      router:
        http.server.request.duration:
          attributes:
            # Track GraphQL errors in metrics
            graphql.errors:
              on_graphql_error: true

      subgraph:
        http.client.request.duration:
          attributes:
            subgraph.name: true
            graphql.errors:
              subgraph_on_graphql_error: true

Complete configuration example

Here's a comprehensive router configuration optimized for Datadog:

YAML
router.yaml
telemetry:
  instrumentation:
    spans:
      default_attribute_requirement_level: recommended

      router:
        attributes:
          otel.name: router
          operation.name: "router"
          resource.name:
            request_method: true

      supergraph:
        attributes:
          otel.name: supergraph
          operation.name: "supergraph"
          resource.name:
            operation_name: string
          # Error tracking
          otel.status_code:
            static: ERROR
            condition:
              eq:
                - true
                - on_graphql_error: true
          error.message:
            response_errors: $[0].message

      subgraph:
        attributes:
          otel.name: subgraph
          operation.name: "subgraph"
          resource.name:
            subgraph_operation_name: string
          otel.status_code:
            static: ERROR
            condition:
              eq:
                - true
                - subgraph_on_graphql_error: true
          error.message:
            subgraph_response_errors: $[0].message

    instruments:
      default_requirement_level: required

      router:
        http.server.request.duration:
          attributes:
            graphql.errors:
              on_graphql_error: true

      subgraph:
        http.client.request.duration:
          attributes:
            subgraph.name: true
            graphql.errors:
              subgraph_on_graphql_error: true

Best practices

Resource naming

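Choose resource.name attributes that provide meaningful grouping without high cardinality. Values with a small, fixed set of possibilities, such as the request method or the operation type (query, mutation, or subscription), are safer than unbounded values such as operation names.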
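
One way to keep resource.name bounded is to group by operation kind instead of operation name. The following is a minimal sketch, assuming the operation_kind and subgraph_operation_kind selectors are available in your router version. Verify the selector names against the router's selector reference before using it.

```yaml
telemetry:
  instrumentation:
    spans:
      supergraph:
        attributes:
          otel.name: supergraph
          operation.name: "supergraph"
          # Bounded alternative: one resource per operation kind
          # (query, mutation, subscription) rather than per operation name
          resource.name:
            operation_kind: string

      subgraph:
        attributes:
          otel.name: subgraph
          operation.name: "subgraph"
          # Assumed subgraph-level counterpart of operation_kind
          resource.name:
            subgraph_operation_kind: string
```

The trade-off is coarser grouping in Datadog APM: you can no longer filter by individual operation, but the number of distinct resource.name values stays small regardless of how many operations clients send.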

Operation naming

Keep operation.name consistent and low-cardinality:

  • Use static values like "router", "supergraph", "subgraph"

  • Don't include dynamic data in operation names

  • Use resource.name to provide the detailed grouping

Error handling

Ensure errors are properly tracked:

  • Set otel.status_code to ERROR for GraphQL errors

  • Include error.message with the actual error text

  • Track errors in both spans and metrics for correlation

Troubleshooting

Spans not appearing in Datadog

  • Verify service.name is set in your exporter configuration

  • Check that operation.name is being set correctly

  • Ensure the OpenTelemetry integration is installed in Datadog
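
For the first check, the sketch below shows where the service name might be set for tracing exporters. It assumes traces are sent over OTLP to a Datadog Agent on the same host with OTLP gRPC ingestion enabled. The service name "router" and the endpoint are placeholders to adapt to your deployment.

```yaml
telemetry:
  exporters:
    tracing:
      common:
        # Placeholder: the service name Datadog groups these spans under
        service_name: "router"
      otlp:
        enabled: true
        # Assumed endpoint: a local Datadog Agent accepting OTLP over gRPC
        endpoint: "http://localhost:4317"
        protocol: grpc
```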

High cardinality warnings

  • Review resource.name attributes for high-cardinality values

  • Avoid using unique identifiers in span attributes

  • Consider grouping operations more broadly
