Query Plan Caching

Configure in-memory and distributed caching for query plans


Whenever your router receives an incoming GraphQL operation, it generates a query plan to determine which subgraphs it needs to query to resolve that operation.

By caching previously generated query plans, your router can skip generating them again if a client later sends the exact same operation—improving your router's responsiveness.

Performance improvements vs. stability

The router is a highly scalable and low-latency runtime. Even with all caching disabled, the time to process operations and query plans is minimal (nanoseconds to milliseconds) when compared to the overall supergraph request, except in edge cases of extremely large operations and supergraphs.

Caching primarily offers stability for those running a large graph: the overhead for a given operation stays consistent rather than dramatically improving. To validate the performance wins of operation caching, use the router's traces and metrics to take measurements before and after enabling it.

In those extreme edge cases, though, the cache can make query plan creation 2-10x faster, which is still a small part of the overall request.

In-memory caching

GraphOS Router enables query plan caching by default using an in-memory LRU cache. In your router's YAML config file, you can configure the maximum number of query plan entries in the cache:

YAML
router.yaml
supergraph:
  query_planning:
    cache:
      in_memory:
        limit: 512 # This is the default value.

Cache warm-up

When a new schema is loaded, the query plans for some operations might change, so their cached plans cannot be reused.

To prevent increased latency upon query plan cache invalidation, the router precomputes query plans for the most used queries from the cache when a new schema is loaded.

Precomputed plans are cached before the router switches traffic over to the new schema.

tip
You can also send the header Apollo-Expose-Query-Plan: dry-run to generate query plans at runtime, which you can use to warm up your cache instances with a custom-defined operation list.

By default, the router warms up the cache with 30% of the queries already in the cache, but you can configure it as follows:

YAML
router.yaml
supergraph:
  query_planning:
    # Pre-plan the 100 most used operations when the supergraph changes
    warmed_up_queries: 100

In addition, the router can use the contents of the persisted query list to prewarm the cache. By default, it does this when loading a new schema but not on startup; you can configure it to change either of these defaults.
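
For example, a minimal sketch of that configuration (the experimental_prewarm_query_plan_cache option comes from the router's persisted queries configuration and is experimental, so the exact names may vary by version):

YAML
router.yaml
persisted_queries:
  enabled: true
  experimental_prewarm_query_plan_cache:
    on_startup: true # also prewarm from the persisted query list at startup (off by default)
    on_reload: true # prewarm when a new schema is loaded (the default)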

Cache warm-up with headers

Requires ≥ Router v1.61.0

With router v1.61.0 and later (including v2.x), if you have enabled exposing query plans via --dev mode or plugins.experimental.expose_query_plan: true, you can pass the Apollo-Expose-Query-Plan header to return query plans in the GraphQL response extensions. You must set the header to one of the following values:

  • true: Returns a human-readable string and JSON blob of the query plan while still executing the query to fetch data.

  • dry-run: Generates the query plan and aborts without executing the query.

After using dry-run, query plans are saved to your configured cache locations. Using real, mirrored, or similar-to-production operations is a great way to warm up the caches before transitioning traffic to new router instances.
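
For example, a warm-up request might look like the following, assuming a router listening on the default localhost:4000 and a hypothetical TopProducts operation:

Text
curl http://localhost:4000/ \
  --header 'content-type: application/json' \
  --header 'Apollo-Expose-Query-Plan: dry-run' \
  --data '{"query":"query TopProducts { topProducts { name } }"}'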

Monitoring cache performance

To get more information on the planning and warm-up process, use the following metrics (where <storage> can be redis for distributed cache or memory):

Counters

  • apollo.router.cache.hit.time.count{kind="query planner", storage="<storage>"}

  • apollo.router.cache.miss.time.count{kind="query planner", storage="<storage>"}

Histograms

  • apollo.router.query_planning.plan.duration: time spent planning queries

    • planner: The query planner implementation used (rust or js)

    • outcome: The outcome of the query planning process (success, timeout, cancelled, error)

  • apollo.router.schema.load.duration: time spent loading a schema

  • apollo.router.cache.hit.time{kind="query planner", storage="<storage>"}: time to get a value from the cache

  • apollo.router.cache.miss.time{kind="query planner", storage="<storage>"}

Gauges

  • apollo.router.cache.size{kind="query planner", storage="memory"}: current size of the cache (only for in-memory cache)

  • apollo.router.cache.storage.estimated_size{kind="query planner", storage="memory"}: estimated storage size of the cache (only for in-memory query planner cache)

To choose the right size for the in-memory cache, monitor apollo.router.cache.size and the cache hit rate, computed as hit.time.count / (hit.time.count + miss.time.count). Then examine apollo.router.schema.load.duration and apollo.router.query_planning.plan.duration to decide how much time to spend warming up queries.
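
These metrics are available through the router's telemetry exporters. A minimal sketch enabling the Prometheus endpoint (option paths follow recent router versions; consult your version's telemetry docs):

YAML
router.yaml
telemetry:
  exporters:
    metrics:
      prometheus:
        enabled: true
        listen: 127.0.0.1:9090 # metrics are then served at http://127.0.0.1:9090/metrics
        path: /metrics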

Distributed caching with Redis

PLAN REQUIRED
This feature is available on the following GraphOS plans: Free, Developer, Standard, Enterprise.
Rate limits apply on the Free plan. Performance pricing applies on Developer and Standard plans. Developer and Standard plans require Router v2.6.0 or later.

If you have multiple GraphOS Router instances, those instances can share a Redis-backed cache for their query plans. This means that if any of your router instances caches a particular value, all of your instances can look up that value to significantly improve responsiveness.

Prerequisites

To use distributed caching, you need a Redis instance or cluster that all of your router instances can reach over the network.

How it works

Whenever a router instance requires a query plan to resolve a client operation:

  1. The router instance checks its own in-memory cache for the required value and uses it if found.

  2. If not found, the router instance then checks the distributed Redis cache for the required value and uses it if found. It then also replicates the value into its own in-memory cache.

  3. If not found, the router instance generates the required query plan.

  4. The router instance stores the obtained value in both the distributed cache and its in-memory cache.

Redis URL configuration

The distributed caching configuration must contain one or more URLs using different schemes depending on the expected deployment:

  • redis — TCP connected to a centralized server.

  • rediss — TLS connected to a centralized server.

  • redis-cluster — TCP connected to a cluster.

  • rediss-cluster — TLS connected to a cluster.

  • redis-sentinel — TCP connected to a centralized server behind a sentinel layer.

  • rediss-sentinel — TLS connected to a centralized server behind a sentinel layer.

The URLs must have the following format:

One node

Text
redis|rediss :// [[username:]password@] host [:port][/database]

Example: redis://localhost:6379
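
A fuller example with placeholder credentials and a database index: redis://myuser:mypassword@redis.example.com:6379/2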

Clustered

Text
redis|rediss[-cluster] :// [[username:]password@] host [:port][?[node=host1:port1][&node=host2:port2][&node=hostN:portN]]

or, if configured with multiple URLs:

Text
[
  "redis|rediss[-cluster] :// [[username:]password@] host [:port]",
  "redis|rediss[-cluster] :// [[username:]password@] host1 [:port1]",
  "redis|rediss[-cluster] :// [[username:]password@] host2 [:port2]"
]
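
For example, a hypothetical three-node cluster (placeholder hostnames) could be configured as:

YAML
router.yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls:
          - "redis-cluster://redis-1.example.com:6379"
          - "redis-cluster://redis-2.example.com:6379"
          - "redis-cluster://redis-3.example.com:6379"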

Sentinel

Text
redis|rediss[-sentinel] :// [[username1:]password1@] host [:port][/database][?[node=host1:port1][&node=host2:port2][&node=hostN:portN]
                            [&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]

or, if configured with multiple URLs:

Text
[
  "redis|rediss[-sentinel] :// [[username:]password@] host [:port][/database][?[&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]",
  "redis|rediss[-sentinel] :// [[username1:]password1@] host [:port][/database][?[&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]"
]
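
Example (placeholder hostnames and service name): redis-sentinel://sentinel-1.example.com:26379?node=sentinel-2.example.com:26379&sentinelServiceName=mymaster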

Router configuration

tip
In your router's YAML config file, you should specify your Redis URLs via environment variables and variable expansion. This prevents your Redis URLs from being committed to version control, which is especially dangerous if they include authentication information like a username and/or password.
caution
Cached query plans are not evicted on schema refresh, which can quickly lead to distributed cache overflow when combined with cache warm-up and frequent schema publishes. Test your cache configuration with expected queries and consider decreasing the TTL to prevent cache overflow.

To enable distributed caching of query plans, add the following to your router's YAML config file:

YAML
router.yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["redis://..."]

The value of urls is a list of URLs for all Redis instances in your cluster.
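
Following the tip above, a hypothetical QUERY_PLAN_REDIS_URL environment variable keeps credentials out of the config file via variable expansion:

YAML
router.yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["${env.QUERY_PLAN_REDIS_URL}"] # e.g. rediss://user:pass@redis.example.com:6379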

All query plan cache entries will be prefixed with plan. within the distributed cache.

Redis configuration options

YAML
router.yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["redis://..."]
        username: admin/123 # Optional; can also be set directly in the URLs. Mainly useful if your username contains special characters like '/' that don't work in a URL. This field takes precedence over the username in the URL.
        password: admin # Optional; can also be set directly in the URLs. Mainly useful if your password contains special characters like '/' that don't work in a URL. This field takes precedence over the password in the URL.
        timeout: 2s # Optional, by default: 500ms
        ttl: 24h # Optional
        namespace: "prefix" # Optional
        #tls:
        required_to_start: false # Optional, defaults to false
        reset_ttl: true # Optional, defaults to true
        pool_size: 4 # Optional, defaults to 1

Timeout

Connecting to Redis and sending commands both have a default timeout of 500ms, which you can override with the timeout option.

TTL

The ttl option defines the default global expiration for Redis entries. For query plan caching, the default expiration is set to 30 days.

When enabling distributed caching, consider how frequently you publish new schemas and configure the TTL accordingly. When new schemas are published, the router pre-warms the in-memory and distributed caches but doesn't invalidate existing cached query plans in the distributed cache, creating an additive effect on cache utilization.

To prevent cache overflow, consider decreasing the TTL to 24 hours or twice the median schema publish interval, whichever is shorter, and monitor cache utilization in your environment, especially during schema publish events.

Also note that when cache warm-up is enabled, each router instance warms the distributed cache with query plans from its own in-memory cache. In the worst case, a schema publish increases the number of query plans in the distributed cache by the number of router instances multiplied by the number of warmed-up queries per instance; for example, 10 instances each warming up 100 queries could add as many as 1,000 new entries per publish.

tip
Be sure to test your configuration with expected queries and during schema publish events to understand the impact of distributed caching on cache utilization.

Namespace

When using the same Redis instance for multiple purposes, the namespace option defines a prefix for all the keys defined by the router.

TLS

For Redis TLS connections, you can set up a client certificate or override the root certificate authority by configuring tls in your router's YAML config file. For example:

YAML
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["rediss://redis.example.com:6379"]
        tls:
          certificate_authorities: ${file./path/to/ca.crt}
          client_authentication:
            certificate_chain: ${file./path/to/certificate_chain.pem}
            key: ${file./path/to/key.pem}

Required to start

When enabled, the required_to_start option prevents the router from starting if it cannot connect to Redis. By default, the router starts even without a Redis connection, in which case it uses only its in-memory cache for query planning.

Reset TTL

When this option is enabled (the default), reading a cache entry from Redis resets its expiration.

Pool size

The pool_size option defines the number of connections the router opens to Redis. By default, the router opens a single connection. If there is heavy traffic between the router and Redis, or those requests have noticeable latency, increasing the pool size can reduce that latency.

Cache warm-up with distributed caching

If the router uses distributed caching for query plans, the warm-up phase also stores the new query plans in Redis. Because all router instances are likely to hold similar distributions of queries in their in-memory caches, the list of queries is shuffled before warm-up; each instance then plans queries in a different order and shares its results through the distributed cache.
