Subgraph Entity Caching for the GraphOS Router
Configure Redis-backed caching for entities
This feature is only available with a GraphOS Enterprise plan.
You can test it out by signing up for a free Enterprise trial.
This feature is in preview. Your questions and feedback are highly valued—don't hesitate to get in touch with your Apollo contact.
Learn how the GraphOS Router can cache subgraph query responses using Redis to improve your query latency for entities in the supergraph.
Overview
An entity gets its fields from one or more subgraphs. To respond to a client request for an entity, the GraphOS Router must make multiple subgraph requests. Different clients requesting the same entity can make redundant, identical subgraph requests.
Entity caching enables the router to respond to identical subgraph queries with cached subgraph responses. The router uses Redis to cache data from subgraph query responses. Because cached data is keyed per subgraph and entity, different clients making the same client query—with the same or different query arguments—hit the same cache entries of subgraph response data.
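To make the idea of keying "per subgraph and entity" concrete, here is a minimal sketch in Python. The key layout, the `entity_cache_key` helper, and the example query string are assumptions made for illustration; the router's actual Redis key format is not described here and may differ.

```python
import hashlib
import json


def entity_cache_key(subgraph: str, typename: str, key_fields: dict, subgraph_query: str) -> str:
    """Build an illustrative cache key; the router's real key format may differ."""
    # Serialize the entity's @key fields deterministically so field order doesn't matter.
    entity_repr = json.dumps(key_fields, sort_keys=True, separators=(",", ":"))
    # Hash the subgraph query together with the entity representation.
    digest = hashlib.sha256(f"{subgraph_query}\n{entity_repr}".encode("utf-8")).hexdigest()
    return f"subgraph:{subgraph}:type:{typename}:hash:{digest}"


# Hypothetical subgraph query for a Product's name.
query = "{ _entities(representations: $representations) { ... on Product { name } } }"

# Two client requests that need the same entity from the same subgraph share one entry.
key_a = entity_cache_key("products", "Product", {"id": "1"}, query)
key_b = entity_cache_key("products", "Product", {"id": "1"}, query)

# The same entity fetched from another subgraph gets its own entry.
key_c = entity_cache_key("inventory", "Product", {"id": "1"}, query)
```

Because nothing about the client operation enters the key in this sketch, any client operation that leads to the same subgraph query for the same entity reuses the cached entry.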
Benefits of entity caching
Compared to caching entire client responses, entity caching supports finer control over:
- the time to live (TTL) of cached data
- the amount of data being cached
When caching an entire client response, the router must store it with a shorter TTL because application data can change often. Real-time data needs more frequent updates.
A client-response cache might not be shareable between users, because the application data might contain personal and private information. A client-response cache might also duplicate a lot of data between client responses.
For example, consider the Products and Inventory subgraphs from the Entities guide:
# Products subgraph
type Product @key(fields: "id") {
  id: ID!
  name: String!
  price: Int
}

# Inventory subgraph
type Product @key(fields: "id") {
  id: ID!
  inStock: Boolean!
}
Assume the client for a shopping cart application requests the following for each product in the cart:
- The product's name and price from the Products subgraph.
- The product's availability in inventory from the Inventory subgraph.
Caching the entire client response would require a short TTL, because the cart data can change often and the real-time inventory has to be up to date. A client-response cache couldn't be shared between users, because each cart is personal. It might also duplicate data, because the same products might appear in multiple carts.
With entity caching enabled for this example, the router can:
- Store each product's name and price separately with a long TTL.
- Minimize the number of subgraph requests made for each client request, with some client requests fetching all product data from the cache and requiring no subgraph requests.
- Share the product cache between all users.
- Cache the cart per user, with a small amount of data.
- Cache inventory data with a short TTL or not cache it at all.
Use entity caching
Follow this guide to enable and configure entity caching in the GraphOS Router.
Prerequisites
To use entity caching in the GraphOS Router, you must set up:
- A Redis instance or cluster that your router instances can communicate with
- A GraphOS Enterprise plan that connects your router to GraphOS
Configure router for entity caching
In router.yaml, configure preview_entity_cache:
- Enable entity caching globally.
- Configure Redis using the same conventions described in distributed caching.
- Configure entity caching per subgraph, with overrides per subgraph for disabling entity caching and TTL.
For example:
# Enable entity caching globally
preview_entity_cache:
  enabled: true
  subgraph:
    all:
      enabled: true
      # Configure Redis
      redis:
        urls: ["redis://..."]
        timeout: 2s # Optional, by default: 500ms
        ttl: 24h # Optional, by default no expiration
    # Configure entity caching per subgraph, overrides options from the "all" section
    subgraphs:
      products:
        ttl: 120s # overrides the global TTL
      inventory:
        enabled: false # disable for a specific subgraph
      accounts:
        private_id: "user_id"
Configure time to live (TTL)
To decide whether to cache an entity, the router honors the Cache-Control header returned with the subgraph response. Because Cache-Control might not contain a max-age or s-maxage directive, a default TTL must either be defined in the per-subgraph configuration or inherited from the global configuration.
The router also generates a Cache-Control header for the client response by aggregating the TTL information from all response parts. If a subgraph doesn't return the header, its response is assumed to be no-store.
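The TTL rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the router's implementation: the helper names are hypothetical, preferring s-maxage over max-age follows the usual shared-cache convention, and taking the minimum TTL across response parts is one plausible way to aggregate.

```python
from typing import Optional


def parse_cache_control(header: Optional[str]) -> dict:
    """Parse a Cache-Control header value into a dict of lowercase directives."""
    directives = {}
    if not header:
        return directives
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (for example "public") are stored as True.
        directives[name.strip().lower()] = value.strip().strip('"') or True
    return directives


def effective_ttl(header: Optional[str], default_ttl: int) -> Optional[int]:
    """Return the TTL in seconds, or None when the response should not be stored."""
    directives = parse_cache_control(header)
    # A missing header is treated like no-store.
    if not directives or "no-store" in directives:
        return None
    # Assumption: s-maxage takes precedence over max-age, as for shared caches.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name)
        if isinstance(value, str) and value.isdigit():
            return int(value)
    # No explicit lifetime: fall back to the configured default TTL.
    return default_ttl


def aggregate_ttl(ttls: list) -> Optional[int]:
    """Assumed aggregation: the shortest TTL wins; any uncacheable part makes the whole uncacheable."""
    if not ttls or any(ttl is None for ttl in ttls):
        return None
    return min(ttls)
```

For example, a subgraph response carrying only `Cache-Control: public` would use the configured default TTL, while a response without the header would not be stored at all.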
Customize Redis cache key
If you need to store data for a particular request in different cache entries, you can configure the cache key through the apollo_entity_cache::key context entry.
This entry contains an object with the all field, which affects all subgraph requests under one client request, and fields named after subgraph operation names, which affect individual subgraph queries. Each field's value can be any valid JSON value (object, string, etc.).
{
  "all": 1,
  "subgraph_operation1": "key1",
  "subgraph_operation2": {
    "data": "key2"
  }
}
Private information caching
A subgraph can return a response with the header Cache-Control: private, indicating that it contains user-personalized data. Although this usually forbids intermediate servers from storing data, the router may be able to recognize different users and store their data in different parts of the cache.
To set up private information caching, you can configure the private_id option. private_id is a string pointing at a field in the request context that contains data used to recognize users (for example, a user ID or the sub claim in a JWT).
As an example, if you are using the router's JWT authentication plugin, you can first configure the private_id option in the accounts subgraph to point to the user_id key in context, then use a Rhai script to set that key from the JWT's sub claim:
preview_entity_cache:
  enabled: true
  subgraph:
    all:
      enabled: true
      redis:
        urls: ["redis://..."]
    subgraphs:
      accounts:
        private_id: "user_id"
authentication:
  router:
    jwt:
      jwks:
        - url: https://auth-server/jwks.json
fn supergraph_service(service) {
  let request_callback = |request| {
    let claims = request.context[Router.APOLLO_AUTHENTICATION_JWT_CLAIMS];
    if claims != () {
      let private_id = claims["sub"];
      request.context["user_id"] = private_id;
    }
  };
  service.map_request(request_callback);
}
The router implements the following sequence to determine whether a particular query returns private data:
- Upon seeing a query for the first time, the router queries the cache as if the query were public-only.
- When the subgraph returns the response with private data, the router recognizes it and stores the data in a user-specific part of the cache.
- The router stores the query in a list of known queries with private data.
- When the router subsequently sees a known query:
  - If the private id isn't provided, the router skips the cache and returns the subgraph response directly.
  - If the private id is provided, the router queries the current user's part of the cache and falls back to the subgraph if no entry is found.
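The sequence above can be modeled with a small in-memory sketch in Python. It is a toy model under stated assumptions, not the router's code: the `PrivateAwareCache` class, its key layout, and the `call_subgraph` callback (which returns the data and whether the subgraph marked it private) are all hypothetical.

```python
from typing import Callable, Optional, Tuple


class PrivateAwareCache:
    """Toy in-memory model of the lookup sequence; not the router's implementation."""

    def __init__(self) -> None:
        self.store = {}               # cache key -> cached data
        self.private_queries = set()  # queries known to return private data

    def fetch(self, query: str, private_id: Optional[str],
              call_subgraph: Callable[[], Tuple[str, bool]]) -> Tuple[str, str]:
        """Return (data, source), where source is "cache" or "subgraph"."""
        known_private = query in self.private_queries
        if known_private and private_id is None:
            # Known private query without a user id: skip the cache entirely.
            data, _ = call_subgraph()
            return data, "subgraph"

        # First-time queries are looked up as if they were public-only.
        key = f"{query}|user:{private_id}" if known_private else query
        if key in self.store:
            return self.store[key], "cache"

        data, is_private = call_subgraph()
        if is_private:
            # Remember the query, and store the data per user when a user id exists.
            self.private_queries.add(query)
            if private_id is not None:
                self.store[f"{query}|user:{private_id}"] = data
        else:
            self.store[key] = data
        return data, "subgraph"
```

In this model, a second request from the same user is served from that user's part of the cache, while a request without a private id for a known private query always goes to the subgraph.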
Observability
The router supports a cache selector in telemetry for the subgraph service. The selector returns the number of cache hits or misses by an entity for a subgraph request.
Spans
You can add a new attribute on the subgraph span for the number of cache hits. For example:
telemetry:
  instrumentation:
    spans:
      subgraph:
        attributes:
          cache.hit:
            cache: hit
Metrics
The router provides the telemetry.instrumentation.instruments.cache instrument to enable cache metrics:
telemetry:
  instrumentation:
    instruments:
      cache: # Cache instruments configuration
        apollo.router.operations.entity.cache: # A counter which counts the number of cache hit and miss for subgraph requests
          attributes:
            entity.type: true # Include the entity type name. default: false
            subgraph.name: # Custom attributes to include the subgraph name in the metric
              subgraph_name: true
            supergraph.operation.name: # Add custom attribute to display the supergraph operation name
              supergraph_operation_name: string
            # You can add more custom attributes using subgraph selectors
You can use custom instruments to create metrics for the subgraph service. The following example creates a custom instrument to generate a histogram that measures the subgraph request duration when there's at least one cache hit for the "inventory" subgraph:
telemetry:
  instrumentation:
    instruments:
      subgraph:
        only_cache_hit_on_subgraph_inventory:
          type: histogram
          value: duration
          unit: hit
          description: histogram of subgraph request duration when we have cache hit on subgraph inventory
          condition:
            all:
              - eq:
                  - subgraph_name: true # subgraph selector
                  - inventory
              - gt: # If the number of cache hit is greater than 0
                  - cache: hit
                    # entity_type: Product # Here you could also only check for the entity type Product, it's `all` by default if we don't specify this config.
                  - 0
Implementation notes
Cache-Control header requirement
The router currently cannot know which types or fields should be cached, so it requires the subgraph to set a Cache-Control header in its response to indicate that the response should be stored.
Responses with errors not cached
To prevent transient errors from affecting the cache for a long duration, subgraph responses with errors are not cached.
Cached entities with unavailable subgraph
If some entities were obtained from the cache, but the subgraphs that provided them are unavailable, the router will return a response with the cached entities, and the other entities nullified (schema permitting), along with an error message for the nullified entities.
Authorization and entity caching
When used alongside the router's authorization directives, cache entries are separated by authorization context. If a query contains fields that need a specific scope, the requests providing that scope have different cache entries from those not providing the scope. This means that data requiring authorization can still be safely cached and even shared across users, without needing invalidation when a user's roles change because their requests are automatically directed to a different part of the cache.
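As a rough illustration of how cache entries could be separated by authorization context, the Python sketch below derives a cache partition from the scopes a query requires and the scopes a request actually holds. The `auth_partition` helper, the scope names, and the hashing scheme are assumptions for illustration; how the router encodes authorization context in its keys is not described here.

```python
import hashlib


def auth_partition(required_scopes: set, granted_scopes: set) -> str:
    """Derive a hypothetical cache partition from required and granted scopes."""
    # Only scopes the query requires matter; unrelated scopes don't split the cache.
    relevant = sorted(required_scopes & granted_scopes)
    return hashlib.sha256(",".join(relevant).encode("utf-8")).hexdigest()[:16]


required = {"read:inventory"}

# Alice and Bob both hold the required scope, so they share a partition.
alice = auth_partition(required, {"read:inventory", "read:profile"})
bob = auth_partition(required, {"read:inventory"})

# Carol lacks the required scope, so her requests use a different partition.
carol = auth_partition(required, {"read:profile"})
```

Under this scheme, a user who gains or loses a scope simply lands in a different partition on the next request, which is why no invalidation is needed when roles change.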
Schema updates and entity caching
On schema updates, the router ensures that queries unaffected by the changes keep their cache entries. Queries with affected fields need to be cached again to ensure the router doesn't serve invalid data from before the update.
Entity cache invalidation not supported
Cache invalidation is not yet supported and is planned for a future release.