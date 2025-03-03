Response Cache Observability
Monitor with telemetry and debug with the cache debugger
Response caching provides comprehensive observability through metrics, traces, and logs. You can also use the cache debugger in Apollo Sandbox to understand cache behavior during development.
Metrics
Instruments
The router provides the telemetry.instrumentation.instruments.cache instrument to enable cache metrics:
telemetry:
  instrumentation:
    instruments:
      cache: # Cache instruments configuration
        apollo.router.operations.response.cache: # A counter which counts the number of cache hit and miss for subgraph requests
          attributes:
            graphql.type.name: true # Include the entity type name. default: false
            subgraph.name: # Custom attributes to include the subgraph name in the metric
              subgraph_name: true
            supergraph.operation.name: # Add custom attribute to display the supergraph operation name
              supergraph_operation_name: string
            # You can add more custom attributes using subgraph selectors
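Once the counter is exported, a cache hit ratio per subgraph is often the first number worth deriving from it. The following Python sketch shows one way to do that from already-collected data points. It is an illustration, not part of the router: the dictionary shape of a data point and the `cache.hit` boolean attribute are assumptions made for this example, so check the attributes your metrics backend actually receives. The `subgraph.name` attribute is the one configured above.

```python
from collections import defaultdict


def hit_ratio_by_subgraph(points):
    """Return {subgraph: hits / (hits + misses)} for counter data points.

    Each point is assumed to look like:
        {"attributes": {"subgraph.name": str, "cache.hit": bool}, "value": int}
    The "cache.hit" attribute name is hypothetical.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for point in points:
        attrs = point["attributes"]
        subgraph = attrs.get("subgraph.name", "unknown")
        total[subgraph] += point["value"]
        if attrs.get("cache.hit"):
            hits[subgraph] += point["value"]
    # Skip subgraphs without any recorded lookups to avoid dividing by zero.
    return {name: hits[name] / total[name] for name in total if total[name] > 0}


sample = [
    {"attributes": {"subgraph.name": "inventory", "cache.hit": True}, "value": 30},
    {"attributes": {"subgraph.name": "inventory", "cache.hit": False}, "value": 10},
    {"attributes": {"subgraph.name": "accounts", "cache.hit": False}, "value": 5},
]
print(hit_ratio_by_subgraph(sample))  # {'inventory': 0.75, 'accounts': 0.0}
```

In practice this aggregation usually lives in the metrics backend as a query; the sketch only makes the arithmetic explicit.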
You can use custom instruments to create metrics for the subgraph service. The following example creates a custom instrument to generate a histogram that measures the subgraph request duration when there's at least one cache hit for the "inventory" subgraph:
telemetry:
  instrumentation:
    instruments:
      subgraph:
        only_cache_hit_on_subgraph_inventory:
          type: histogram
          value: duration
          unit: hit
          description: histogram of subgraph request duration when we have cache hit on subgraph inventory
          condition:
            all:
              - eq:
                  - subgraph_name: true # subgraph selector
                  - inventory
              - gt: # If the number of cache hit is greater than 0
                  - response_cache: hit
                    # entity_type: Product # Here you could also only check for the entity type Product, it's `all` by default if we don't specify this config.
                  - 0
Fetch/insert
Name | Description | Unit
apollo.router.operations.response_cache.fetch.error | Errors when fetching data from cache | {error}
apollo.router.operations.response_cache.fetch | Time to fetch data from cache | s
apollo.router.operations.response_cache.fetch.entity | Number of entities per subgraph fetch node | {entity}
apollo.router.operations.response_cache.insert.error | Errors when inserting data in cache | {error}
apollo.router.operations.response_cache.insert | Time to insert new data in cache | s
Invalidation
Name | Description | Unit
apollo.router.operations.response_cache.invalidation.event | Response cache received a batch of invalidation requests | {request}
apollo.router.operations.response_cache.invalidation.error | Errors when invalidating data in cache | {error}
apollo.router.operations.response_cache.invalidation.entry | Response cache counter for invalidated entries | {entry}
apollo.router.operations.response_cache.invalidation.request.entry | Number of invalidated entries per invalidation request | {entry}
apollo.router.operations.response_cache.invalidation.duration | Duration of the invalidation event execution | s
Internal
Name | Description | Unit
apollo.router.response_cache.reconnection | Number of reconnections to the cache storage | {retry}
apollo.router.response_cache.private_queries.lru.size | LRU cache size for private queries fetched | {query}
Redis
The latency metrics are marked as experimental because Apollo might change them if there is an upstream change in one of our dependencies.
Connection and performance metrics
apollo.router.cache.redis.connections: Number of active Redis connections
apollo.router.cache.redis.command_queue_length: Commands waiting to be sent to Redis
apollo.router.cache.redis.commands_executed: Total number of Redis commands executed
apollo.router.cache.redis.redelivery_count: Commands retried due to connection issues
apollo.router.cache.redis.errors: Redis errors by type (auth, timeout, io, etc.)
Experimental Redis performance metrics
experimental.apollo.router.cache.redis.network_latency_avg: Average network latency to Redis
experimental.apollo.router.cache.redis.latency_avg: Average Redis command execution time
experimental.apollo.router.cache.redis.request_size_avg: Average request payload size
experimental.apollo.router.cache.redis.response_size_avg: Average response payload size
Traces
In a trace, response caching appears as the following spans:
The response_cache.lookup span shows how much time was spent fetching data from the cache.
The response_cache.store span shows how much time was spent inserting data into the cache.
For invalidation, look for the invalidation_endpoint span.
Available attributes on response_cache.lookup:
kind: root or entity. Indicates whether the cache lookup is for a root field or an entity.
subgraph.name: The subgraph name
graphql.type: The type (or parent type for root fields)
debug: Boolean indicating whether debug mode is enabled
private: Boolean indicating whether the data is private
contains_private_id: Boolean indicating whether a private ID was found in the context
cache.key: The primary cache key
cache.status: hit | partial_hit | miss
Available attributes on response_cache.store:
kind: root or entity. Indicates whether the data is for root fields or an entity.
subgraph.name: The subgraph name
ttl: The TTL of this cache entry
batch.size: The size of the batch when inserting entities (entities are often batched)
Logs
The router supports a response_cache selector in telemetry for the subgraph service. For a subgraph request, the selector returns either the number of cache hits or misses per entity, or the cache status (hit | partial_hit | miss).
For example, the following configuration logs every response from the posts subgraph that contains no cache hits:
telemetry:
  instrumentation:
    events:
      subgraph:
        response:
          level: info
          condition:
            all:
              - eq: # Only for subgraph posts
                  - subgraph_name: true
                  - static: posts
              - eq: # If there's no cache hit in this subgraph response
                  - response_cache: hit
                  - 0
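The response_cache selector can report either counts or a status, and it can help to see how the two views might relate. The Python sketch below derives a status from hit and miss counts. The mapping is an assumption made for illustration (hit when every entity came from the cache, miss when none did, partial_hit otherwise); the source does not define the thresholds, and `cache_status` is a hypothetical helper, not a router API.

```python
def cache_status(hits: int, misses: int) -> str:
    """Map per-request hit and miss counts to a cache status string.

    Assumed semantics (not confirmed by the router documentation):
      - "hit":         at least one hit and no misses
      - "partial_hit": at least one hit and at least one miss
      - "miss":        no hits at all
    """
    if hits < 0 or misses < 0:
        raise ValueError("counts must be non-negative")
    if hits > 0 and misses == 0:
        return "hit"
    if hits > 0:
        return "partial_hit"
    return "miss"


# The event condition above (response_cache: hit equal to 0) corresponds
# to the "miss" case under these assumptions.
print(cache_status(hits=0, misses=3))  # miss
print(cache_status(hits=2, misses=1))  # partial_hit
print(cache_status(hits=4, misses=0))  # hit
```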
Cache debugger
The cache debugger in Apollo Sandbox helps you understand cache behavior during development.
To use it, run the router with the following minimal configuration:
supergraph:
  introspection: true
  path: /
  listen: 0.0.0.0:4000
homepage:
  enabled: false
sandbox: # Enabled sandbox
  enabled: true
# Enable response caching globally
preview_response_cache:
  enabled: true
  debug: true # Enable debugging data for the cache debugger. Don't enable this in production.
  invalidation:
    listen: 0.0.0.0:4000
    path: /invalidation
  subgraph:
    all:
      enabled: true
      # Configure Redis for all subgraphs
      redis:
        urls: ["redis://localhost:6379"]
      invalidation:
        enabled: true
        shared_key: ${env.INVALIDATION_SHARED_KEY} # Use environment variable INVALIDATION_SHARED_KEY
Go to your router instance at its root URL to see Apollo Sandbox.
In Sandbox, enable the cache debugger. Click the settings button in the top left and scroll to the bottom to enable it.
In the right panel, open the dropdown at the top of the response data panel and select Cache debugger.
A list of cached or potentially cached entries appears. This list helps you understand the cache status of your data:
If the Created at column contains data, the value has been stored in the cache.
If the source column is products, the data for this call was fetched from the products subgraph, even if it is now cached.
If the Created at column is empty, the entry hasn't been cached. This might happen for multiple reasons (see Troubleshoot). In this example, the accounts subgraph entry isn't cached because it contains private, uncacheable data.
Click any entry to see details about it, including:
The Cache-Control header value returned by the subgraph
Response data from this entry
The entity key
Corresponding cache tags for invalidation
View the request pane for details about the original request sent to the subgraph to get the data, including the query and variables.
Generate a curl command to invalidate specific data by clicking the Invalidate button, which opens a modal showing what you want to invalidate.
Troubleshoot
Common reasons for cache misses
Your origin doesn't return a Cache-Control header or returns it with the no-store directive.
Your origin returns an Age header with a value greater than the max-age in the Cache-Control header, or the default TTL from your router configuration if max-age isn't set.
Your origin returns a Cache-Control header with the private directive, but you haven't configured a private_id in your response cache configuration. Private data requires a private_id to differentiate cache entries between users.
You disabled response caching in the router configuration for a specific subgraph or all subgraphs.
Redis is unavailable or times out. Use the metrics, traces, and logs described earlier in this page to measure errors.
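The header-related reasons above can be checked by hand against a subgraph response. The following Python sketch models those three checks in a simplified way. It is not the router's implementation: `cache_miss_reason`, its parameters, and the returned strings are hypothetical names chosen for this example, and real Cache-Control handling covers more directives than shown here.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip()] = arg.strip().strip('"') or None
    return directives


def cache_miss_reason(headers, default_ttl=None, private_id_configured=False):
    """Return why a response would not be cached, or None if no check fails.

    headers: response headers from the origin (case-insensitive names).
    default_ttl: stand-in for the default TTL in the router configuration.
    private_id_configured: whether a private_id is set for this subgraph.
    """
    headers = {name.lower(): value for name, value in headers.items()}
    cache_control = headers.get("cache-control")
    if cache_control is None:
        return "no Cache-Control header"
    directives = parse_cache_control(cache_control)
    if "no-store" in directives:
        return "no-store directive"
    if "private" in directives and not private_id_configured:
        return "private directive without private_id"
    # Prefer max-age; fall back to the configured default TTL.
    max_age = directives.get("max-age")
    ttl = int(max_age) if max_age is not None else default_ttl
    age = int(headers.get("age", 0))
    if ttl is not None and age > ttl:
        return "Age exceeds TTL"
    return None


print(cache_miss_reason({"Cache-Control": "max-age=60", "Age": "120"}))
# Age exceeds TTL
print(cache_miss_reason({"Cache-Control": "private, max-age=60"}))
# private directive without private_id
```

Running a captured response through a helper like this narrows the problem to the origin's headers before you look at router configuration or Redis availability.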