Multi-Region Edge Cache Architecture

Deploy GraphOS Router with Redis as a globally distributed edge cache


This guide describes a reference architecture for deploying GraphOS Router with Redis as part of a globally distributed edge caching system. Use this pattern when you need low-latency GraphQL responses across multiple geographic regions with consistent cache invalidation.

When to use this architecture

Consider this architecture when you need:

  • Global low-latency responses: Serve users from the nearest region with cached data

  • High availability: Regional failures don't take down your entire GraphQL API

  • Shared cache across router instances: Multiple router replicas in a region share cached data

  • Consistent invalidation: Changes propagate to all regions quickly

This pattern is more complex than a single-region deployment. For simpler use cases, see the Response Caching Quickstart.

Architecture overview

The following diagram shows a multi-region deployment with tiered caching:

[Architecture diagram: a global load balancer and CDN at the edge, router fleets with L1 in-process and L2 regional Redis caches in each region, an optional L3 global store, and a pub/sub control plane that propagates invalidation events to every region.]

Architecture components

| Layer | Component | Purpose |
|---|---|---|
| Edge | Global Load Balancer + CDN | Route requests to nearest region, cache GET responses at edge |
| Edge | WAF (Web Application Firewall) | Protect against malicious requests |
| L1 | In-process cache | Query plan caching, hot data with microsecond latency |
| L2 | Regional Redis | Shared response cache across router replicas in a region |
| L3 | Global distributed store | Optional cross-region cache for expensive computations |
| Control | Pub/Sub | Broadcast invalidation events to all regions |
| Control | Change Data Capture | Trigger invalidations from database changes |

Cache tiers

L1: In-process cache

Each router instance maintains an in-process cache for:

  • Query plans: Avoid re-planning identical queries

  • APQ (Automatic Persisted Queries): Map query hashes to full query text

This cache is local to each router instance and provides microsecond-level latency. Configure query plan caching:

router.yaml

supergraph:
  query_planning:
    cache:
      in_memory:
        limit: 512 # Number of query plans to cache

For high-traffic deployments, you can also back the query plan cache with Redis for sharing across instances. See Query Plan Caching.
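A minimal sketch of that setup, reusing the same regional Redis; confirm the exact option names and supported fields against the Query Plan Caching reference:

router.yaml

supergraph:
  query_planning:
    cache:
      in_memory:
        limit: 512
      redis:
        urls: ["${env.REDIS_URL}"] # shared regional Redis
        ttl: 48h # optional TTL for cached plans (confirm supported options)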

L2: Regional Redis

Regional Redis provides a shared cache for all router instances in a region. This is where response caching stores cached subgraph responses.

router.yaml

response_cache:
  enabled: true
  subgraph:
    all:
      enabled: true
      ttl: 3600s # 1 hour default TTL
      redis:
        urls: ["redis://redis.us-east1.internal:6379"]
        pool_size: 10
        namespace: "router:response_cache"

For multi-region deployments, configure each region's routers to use the regional Redis instance:

router.yaml (us-east1)

response_cache:
  subgraph:
    all:
      redis:
        urls: ["redis://redis.us-east1.internal:6379"]

router.yaml (europe-west1)

response_cache:
  subgraph:
    all:
      redis:
        urls: ["redis://redis.europe-west1.internal:6379"]
Tip: Use environment variables to configure region-specific Redis URLs:

router.yaml

response_cache:
  subgraph:
    all:
      redis:
        urls: ["${env.REDIS_URL}"]

Redis high availability

For production deployments, use Redis with high availability:

  • Redis Cluster: Horizontal scaling with automatic sharding

  • Redis Sentinel: Automatic failover for single-primary setups

  • Managed Redis: Cloud provider managed services (AWS ElastiCache, GCP Memorystore, Azure Cache for Redis)

router.yaml (Redis Cluster)

response_cache:
  subgraph:
    all:
      redis:
        urls: ["redis-cluster://node1:6379?node=node2:6379&node=node3:6379"]

See Redis URL Configuration for connection string formats.

L3: Global distributed store (optional)

For data that's expensive to compute and rarely changes, you can add a global L3 cache tier using a distributed database like Cloud Bigtable, DynamoDB Global Tables, or CockroachDB.

The L3 tier is not a built-in router feature. You would implement it as:

  • A coprocessor that checks L3 before forwarding to subgraphs

  • Custom logic in your subgraphs that checks L3 before querying origin databases

This tier is only necessary for specific use cases where cross-region cache sharing provides significant cost savings.
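For the coprocessor approach, the router only needs to be pointed at your coprocessor and told which parts of each subgraph request to forward; the L3 lookup itself lives in the coprocessor service, which can return a cached body to short-circuit the subgraph call. A rough sketch, with a hypothetical coprocessor endpoint:

router.yaml

coprocessor:
  url: http://l3-coprocessor.internal:8081 # hypothetical internal service
  subgraph:
    all:
      request:
        service_name: true # which subgraph is being called
        body: true # used by the coprocessor to build an L3 cache key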

Subgraph caching

Subgraphs can maintain their own Redis cache, independent of the router's response cache. This is useful when:

  • Subgraphs have expensive data fetching operations

  • Multiple fields share the same underlying data

  • You want caching at the resolver level

The router's response cache and subgraph caches serve different purposes:

| Cache | What it caches | Invalidation |
|---|---|---|
| Router response cache | Subgraph HTTP responses (entity representations) | Via router invalidation API |
| Subgraph cache | Resolver-level data, database query results | Subgraph-specific logic |

Event-driven invalidation

In a multi-region deployment, cache invalidation must propagate to all regions. Use a pub/sub system to broadcast invalidation events.

Invalidation flow

A typical flow: a data change is detected (by the owning service or via change data capture), an event is published to a pub/sub topic, and an invalidation service in each region consumes the event and calls that region's router invalidation endpoint, which removes the affected entries from the regional Redis cache.

Router invalidation endpoint

Configure each router to expose an invalidation endpoint:

router.yaml

response_cache:
  enabled: true
  invalidation:
    listen: "0.0.0.0:4000"
    path: "/invalidation"
  subgraph:
    all:
      enabled: true
      redis:
        urls: ["${env.REDIS_URL}"]
      invalidation:
        enabled: true
        shared_key: "${env.INVALIDATION_SHARED_KEY}"
Caution: Only expose the invalidation endpoint to internal networks. Use network policies or a service mesh to restrict access.
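For example, on Kubernetes you could restrict ingress to the invalidation port with a NetworkPolicy along these lines; the namespace and pod labels are hypothetical, and the sketch assumes the invalidation listener runs on its own port, separate from the GraphQL listener:

# Hypothetical sketch: adjust namespace, labels, and ports to your deployment,
# and add a rule for your normal GraphQL ingress in the same policy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: router-invalidation-access
  namespace: router
spec:
  podSelector:
    matchLabels:
      app: router # hypothetical router pod label
  policyTypes:
    - Ingress
  ingress:
    # Invalidation port: reachable only from the invalidation service
    - from:
        - podSelector:
            matchLabels:
              app: invalidation-service
      ports:
        - protocol: TCP
          port: 4000 # the invalidation listen port from router.yaml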

Invalidation service

Create an invalidation service in each region that:

  1. Subscribes to the pub/sub topic

  2. Transforms events into router invalidation requests

  3. Calls the router's invalidation endpoint

Example invalidation request:

curl --request POST \
  --header "Authorization: ${INVALIDATION_SHARED_KEY}" \
  --header "Content-Type: application/json" \
  --url http://router:4000/invalidation \
  --data '[{
    "kind": "cache_tag",
    "subgraphs": ["products"],
    "cache_tag": "product-42"
  }]'

See Cache Invalidation for all invalidation methods.

Change data capture

Use change data capture (CDC) to automatically trigger invalidations when database records change:

  • Debezium: Open source CDC for various databases

  • Cloud-native CDC: AWS DMS, GCP Datastream, Azure Data Factory

CDC captures database changes and publishes them to your pub/sub system, which then triggers cache invalidation across all regions.
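As an illustration, registering a Debezium Postgres connector with Kafka Connect might look like the following; the hostnames, credentials, and table list are hypothetical, and the resulting change events still need your invalidation service (or a stream processor) to translate them into router invalidation requests:

# Hypothetical connector registration; adjust hosts, credentials, and tables
curl --request POST \
  --header "Content-Type: application/json" \
  --url http://kafka-connect.internal:8083/connectors \
  --data '{
    "name": "products-cdc",
    "config": {
      "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
      "database.hostname": "products-db.internal",
      "database.port": "5432",
      "database.user": "cdc_user",
      "database.password": "<password>",
      "database.dbname": "products",
      "topic.prefix": "products",
      "table.include.list": "public.products"
    }
  }'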

Edge layer integration

CDN caching

A CDN can cache GraphQL responses at the edge for read-heavy workloads. This works best for:

  • GET requests: Queries sent as GET requests with query parameters

  • Public data: Responses without user-specific content

  • High cache hit rates: Popular queries requested by many users

Configure your CDN to:

  1. Cache responses based on the full URL (including query parameters)

  2. Respect Cache-Control headers from the router

  3. Forward cache misses to the nearest router region

The router includes Cache-Control headers in responses based on the minimum TTL of cached entities.
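For example, if the shortest-lived entity in a cached response expires in 10 minutes, the response might carry a header along these lines (the exact directives depend on your TTLs and on whether the data is public or private):

Cache-Control: max-age=600, public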

APQ with GET requests

Automatic Persisted Queries (APQ) enable sending queries as GET requests, making them cacheable by CDNs:

router.yaml

apq:
  enabled: true
  router:
    cache:
      redis:
        urls: ["${env.REDIS_URL}"]

With APQ, clients send a query hash instead of the full query text. The CDN can cache responses by hash, and the router resolves hashes to full queries from Redis.
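For example, a client (or a quick curl test) can send a persisted query as a GET request using the standard APQ extensions parameter; the router URL and the hash placeholder below are illustrative:

# Hypothetical router URL; <sha256-of-query-text> is the SHA-256 hash of the query
curl --get 'https://router.example.com/' \
  --data-urlencode 'extensions={"persistedQuery":{"version":1,"sha256Hash":"<sha256-of-query-text>"}}'

On the first request for a given hash the router responds with a PersistedQueryNotFound error, and the client retries with the full query text so the hash can be registered.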

Multi-region deployment

Regional router configuration

Each region needs routers configured with:

  • Regional Redis URL

  • Regional subgraph endpoints (or cross-region if subgraphs aren't deployed locally)

Use environment variables or a configuration management system to manage region-specific settings.

Cross-region subgraph routing

In the architecture diagram, europe-west1 doesn't have local subgraphs—it routes to us-east1 subgraphs cross-region. This is a valid pattern when:

  • Some regions only need router + cache (read-heavy, latency-tolerant)

  • Subgraph deployment is expensive or complex

  • Data residency requirements allow it

Configure cross-region routing with longer timeouts to account for network latency:

router.yaml

traffic_shaping:
  all:
    timeout: 30s # Longer timeout for cross-region calls
  subgraphs:
    products:
      timeout: 45s # Even longer for slow subgraphs
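Beyond timeouts, a cache-only region has to point its subgraph requests at the remote region's endpoints. One way to do this is the router's override_subgraph_url option; the hostnames below are hypothetical:

router.yaml (europe-west1)

override_subgraph_url:
  # Hypothetical remote endpoint; repeat for each subgraph not deployed locally
  products: https://products.us-east1.internal/graphql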

Monitoring

Monitor cache effectiveness across regions:

router.yaml

telemetry:
  instrumentation:
    instruments:
      cache:
        apollo.router.operations.response_cache:
          attributes:
            subgraph.name:
              subgraph_name: true

Key metrics to track:

| Metric | What it tells you |
|---|---|
| apollo.router.operations.response_cache.hit | Cache hit rate by subgraph |
| apollo.router.operations.response_cache.miss | Requests hitting origin |
| apollo.router.cache.storage.estimated_size | Cache memory usage |
| Redis latency | Network overhead for cache operations |

See Response Cache Observability for detailed monitoring guidance.

Implementation checklist

Use this checklist when implementing the architecture:

  • Redis per region: Deploy Redis with high availability in each region

  • Router fleet: Deploy multiple router replicas per region behind a load balancer

  • Invalidation endpoint: Configure and secure the invalidation endpoint

  • Pub/Sub: Set up pub/sub topics for invalidation events

  • Invalidation services: Deploy subscribers in each region

  • CDC (optional): Configure change data capture for automatic invalidation

  • CDN: Configure CDN caching rules for GET requests

  • Monitoring: Set up dashboards for cache metrics across regions

  • Alerting: Alert on cache hit rate drops, Redis connectivity issues
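As a starting point for the alerting item, a Prometheus-style rule on the cache hit ratio might look like the following. It assumes router metrics are exported to Prometheus; the metric names shown use the common dot-to-underscore translation and should be checked against what your collector actually produces:

groups:
  - name: router-response-cache
    rules:
      - alert: ResponseCacheHitRateLow
        # Metric names are assumptions based on the instrument names above;
        # confirm the exported names in your Prometheus before relying on this.
        expr: |
          sum(rate(apollo_router_operations_response_cache_hit_total[10m]))
            /
          (sum(rate(apollo_router_operations_response_cache_hit_total[10m]))
            + sum(rate(apollo_router_operations_response_cache_miss_total[10m]))) < 0.5
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Response cache hit rate below 50% for 15 minutes"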
