Query Plan Caching
Configure in-memory and distributed caching for query plans
Whenever your router receives an incoming GraphQL operation, it generates a query plan to determine which subgraphs it needs to query to resolve that operation.
By caching previously generated query plans, your router can skip generating them again if a client later sends the exact same operation—improving your router's responsiveness.
Performance improvements vs. stability
The router is a highly scalable and low-latency runtime. Even with all caching disabled, the time to process operations and query plans is minimal (nanoseconds to milliseconds) when compared to the overall supergraph request, except in edge cases of extremely large operations and supergraphs.
Caching offers stability for those running a large graph: it keeps the overhead for a given operation consistent rather than dramatically improving it. To validate the performance wins of operation caching, use the router's traces and metrics to take measurements before and after.
In extremely large edge cases, though, the cache can cut query plan creation time by a factor of 2-10, which is still a small part of the overall request.
In-memory caching
GraphOS Router enables query plan caching by default using an in-memory LRU cache. In your router's YAML config file, you can configure the maximum number of query plan entries in the cache:
```yaml
supergraph:
  query_planning:
    cache:
      in_memory:
        limit: 512 # This is the default value.
```

Cache warm-up
When a new schema is loaded, the query plans for some operations might change, so their cached query plans cannot be reused.
To prevent increased latency upon query plan cache invalidation, the router precomputes query plans for the most used queries from the cache when a new schema is loaded.
Precomputed plans are cached before the router switches traffic over to the new schema.
You can also use the Apollo-Expose-Query-Plan: dry-run header to generate query plans at runtime, which lets you warm up your cache instances with a custom-defined operation list (see Cache warm-up with headers below).

By default, the router warms up the cache with 30% of the queries already in the cache, but you can configure it as follows:
```yaml
supergraph:
  query_planning:
    # Pre-plan the 100 most used operations when the supergraph changes
    warmed_up_queries: 100
```

In addition, the router can use the contents of the persisted query list to prewarm the cache. By default, it does this when loading a new schema but not on startup; you can configure it to change either of these defaults.
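As a sketch of that configuration, assuming the experimental_prewarm_query_plan_cache option names from recent router releases, flipping both defaults would look like this:

```yaml
# Assumed option names; verify against your router version's docs.
persisted_queries:
  enabled: true
  experimental_prewarm_query_plan_cache:
    on_startup: true # also prewarm from the persisted query list at startup
    on_reload: false # skip prewarming when a new schema is loaded
```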
Cache warm-up with headers
With router v1.61.0+ and v2.x+, if you have enabled exposing query plans via --dev mode or plugins.experimental.expose_query_plan: true, you can pass the Apollo-Expose-Query-Plan header to return query plans in the GraphQL response extensions. You must set the header to one of the following values:
- true: Returns a human-readable string and JSON blob of the query plan while still executing the query to fetch data.
- dry-run: Generates the query plan and aborts without executing the query.
After using dry-run, query plans are saved to your configured cache locations. Using real, mirrored, or similar-to-production operations is a great way to warm up the caches before transitioning traffic to new router instances.
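For example, enabling the plugin outside of --dev mode uses the setting named above in your router's YAML config:

```yaml
plugins:
  experimental.expose_query_plan: true
```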
Monitoring cache performance
To get more information on the planning and warm-up process, use the following metrics (where <storage> is redis for the distributed cache or memory for the in-memory cache):
Counters
- apollo.router.cache.hit.time.count{kind="query planner", storage="<storage>"}
- apollo.router.cache.miss.time.count{kind="query planner", storage="<storage>"}
Histograms
- apollo.router.query_planning.plan.duration: time spent planning queries, with the following attributes:
  - planner: the query planner implementation used (rust or js)
  - outcome: the outcome of the query planning process (success, timeout, cancelled, error)
- apollo.router.schema.load.duration: time spent loading a schema
- apollo.router.cache.hit.time{kind="query planner", storage="<storage>"}: time to get a value from the cache
- apollo.router.cache.miss.time{kind="query planner", storage="<storage>"}
Gauges
- apollo.router.cache.size{kind="query planner", storage="memory"}: current size of the cache (only for in-memory cache)
- apollo.router.cache.storage.estimated_size{kind="query planner", storage="memory"}: estimated storage size of the cache (only for the in-memory query planner cache)
To define the right size of the in-memory cache, monitor apollo.router.cache.size and the cache hit rate. Then examine apollo.router.schema.load.duration and apollo.router.query_planning.plan.duration to decide how much time to spend warming up queries.
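To scrape these metrics, enable a metrics exporter. Here's a minimal sketch, assuming the Prometheus exporter layout used by recent router versions (the listen address is a placeholder):

```yaml
telemetry:
  exporters:
    metrics:
      prometheus:
        enabled: true
        listen: 127.0.0.1:9090 # placeholder; adjust for your environment
        path: /metrics
```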
Distributed caching with Redis
If you have multiple GraphOS Router instances, those instances can share a Redis-backed cache for their query plans. This means that if any of your router instances caches a particular value, all of your instances can look up that value to significantly improve responsiveness.
Prerequisites
To use distributed caching:
- You must have a Redis cluster (or single instance) that your router instances can communicate with.
- You must have a GraphOS Enterprise plan and connect your router to GraphOS.
How it works
Whenever a router instance requires a query plan to resolve a client operation:

1. The router instance checks its own in-memory cache for the required value and uses it if found.
2. If not found, the router instance then checks the distributed Redis cache for the required value and uses it if found. It also replicates the found value in its own in-memory cache.
3. If not found, the router instance generates the required query plan.
4. The router instance stores the generated query plan in both the distributed cache and its in-memory cache.
Redis URL configuration
The distributed caching configuration must contain one or more URLs using different schemes depending on the expected deployment:
- redis: a TCP connection to a centralized server.
- rediss: a TLS connection to a centralized server.
- redis-cluster: a TCP connection to a cluster.
- rediss-cluster: a TLS connection to a cluster.
- redis-sentinel: a TCP connection to a centralized server behind a sentinel layer.
- rediss-sentinel: a TLS connection to a centralized server behind a sentinel layer.
The URLs must have the following format:
One node
```
redis|rediss :// [[username:]password@] host [:port][/database]
```

Example: redis://localhost:6379
Clustered
```
redis|rediss[-cluster] :// [[username:]password@] host [:port][?[node=host1:port1][&node=host2:port2][&node=hostN:portN]]
```

or, if configured with multiple URLs:

```
[
  "redis|rediss[-cluster] :// [[username:]password@] host [:port]",
  "redis|rediss[-cluster] :// [[username:]password@] host1 [:port1]",
  "redis|rediss[-cluster] :// [[username:]password@] host2 [:port2]"
]
```
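For example, a router configuration pointing at a hypothetical three-node cluster (hostnames are placeholders):

```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: # placeholder hostnames; replace with your cluster nodes
          - "redis-cluster://redis-1.example.com:6379"
          - "redis-cluster://redis-2.example.com:6379"
          - "redis-cluster://redis-3.example.com:6379"
```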
Sentinel

```
redis|rediss[-sentinel] :// [[username1:]password1@] host [:port][/database][?[node=host1:port1][&node=host2:port2][&node=hostN:portN]
  [&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]
```

or, if configured with multiple URLs:
```
[
  "redis|rediss[-sentinel] :// [[username:]password@] host [:port][/database][?[&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]",
  "redis|rediss[-sentinel] :// [[username1:]password1@] host [:port][/database][?[&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]"
]
```
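For example, a hypothetical deployment with two sentinels fronting a service named myservice (hostnames are placeholders):

```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: # placeholder hostnames; sentinelServiceName must match your deployment
          - "redis-sentinel://sentinel-1.example.com:26379/0?sentinelServiceName=myservice"
          - "redis-sentinel://sentinel-2.example.com:26379/0?sentinelServiceName=myservice"
```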
Router configuration

To enable distributed caching of query plans, add the following to your router's YAML config file:
```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["redis://..."]
```

The value of urls is a list of URLs for all Redis instances in your cluster.
All query plan cache entries will be prefixed with plan. within the distributed cache.
Redis configuration options
```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["redis://..."]
        username: admin/123 # Optional; can also be set in the URL. Mainly useful if the value contains a character like '/' that doesn't work in a URL. Takes precedence over the username in the URL.
        password: admin # Optional; can also be set in the URL. Mainly useful if the value contains a character like '/' that doesn't work in a URL. Takes precedence over the password in the URL.
        timeout: 2s # Optional, by default: 500ms
        ttl: 24h # Optional
        namespace: "prefix" # Optional
        #tls:
        required_to_start: false # Optional, defaults to false
        reset_ttl: true # Optional, defaults to true
        pool_size: 4 # Optional, defaults to 1
```

Timeout
Connecting and sending commands to Redis have a timeout of 500ms by default, which you can override.
TTL
The ttl option defines the default global expiration for Redis entries. For query plan caching, the default expiration is set to 30 days.
When enabling distributed caching, consider how frequently you publish new schemas and configure the TTL accordingly. When new schemas are published, the router pre-warms the in-memory and distributed caches but doesn't invalidate existing cached query plans in the distributed cache, creating an additive effect on cache utilization.
To prevent cache overflow, consider decreasing the TTL to 24 hours or twice the median publish interval, whichever is less, and monitor cache utilization in your environment, especially during schema publish events.
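For example, with a median publish interval of 6 hours, twice that interval is 12 hours, which is less than 24 hours, so a 12-hour TTL applies; with a median interval of a day or more, the 24-hour cap applies instead.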
Also note that when cache warm-up is enabled, each router instance will warm the distributed cache with query plans from its own in-memory cache. In the worst case, a schema publish will increase the number of query plans in the distributed cache by the number of router instances multiplied by the number of warmed-up queries per instance, which may noticeably increase the total cache utilization.
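As a worst-case illustration, 10 router instances each warming up 100 queries could add up to 10 × 100 = 1,000 query plan entries to the distributed cache on a single schema publish.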
Namespace
When using the same Redis instance for multiple purposes, the namespace option defines a prefix for all the keys defined by the router.
TLS
For Redis TLS connections, you can set up a client certificate or override the root certificate authority by configuring tls in your router's YAML config file. For example:
```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["rediss://redis.example.com:6379"]
        tls:
          certificate_authorities: ${file./path/to/ca.crt}
          client_authentication:
            certificate_chain: ${file./path/to/certificate_chain.pem}
            key: ${file./path/to/key.pem}
```

Required to start
When active, the required_to_start option prevents the router from starting if it cannot connect to Redis. By default, the router still starts without a connection to Redis, in which case it uses only the in-memory cache for query planning.
Reset TTL
When this option is active, accessing a cache entry in Redis will reset its expiration.
Pool size
The pool_size option defines the number of connections to Redis that the router opens. By default, the router opens a single connection. If there is a lot of traffic between the router and Redis, or if those requests have noticeable latency, increasing the pool size can reduce that latency.
Cache warm-up with distributed caching
If the router uses distributed caching for query plans, the warm-up phase also stores the new query plans in Redis. Because all router instances are likely to hold similar distributions of queries in their in-memory caches, the list of queries is shuffled before warm-up, so each instance plans queries in a different order and shares its results through the cache.