Multi-Region Edge Cache Architecture
Deploy GraphOS Router with Redis as a globally distributed edge cache
This guide describes a reference architecture for deploying GraphOS Router with Redis as part of a globally distributed edge caching system. Use this pattern when you need low-latency GraphQL responses across multiple geographic regions with consistent cache invalidation.
When to use this architecture
Consider this architecture when you need:
Global low-latency responses: Serve users from the nearest region with cached data
High availability: Regional failures don't take down your entire GraphQL API
Shared cache across router instances: Multiple router replicas in a region share cached data
Consistent invalidation: Changes propagate to all regions quickly
This pattern is more complex than a single-region deployment. For simpler use cases, see the Response Caching Quickstart.
Architecture overview
The following diagram shows a multi-region deployment with tiered caching:
Architecture components
| Layer | Component | Purpose |
|---|---|---|
| Edge | Global Load Balancer + CDN | Route requests to nearest region, cache GET responses at edge |
| Edge | WAF (Web Application Firewall) | Protect against malicious requests |
| L1 | In-process cache | Query plan caching, hot data with microsecond latency |
| L2 | Regional Redis | Shared response cache across router replicas in a region |
| L3 | Global distributed store | Optional cross-region cache for expensive computations |
| Control | Pub/Sub | Broadcast invalidation events to all regions |
| Control | Change Data Capture | Trigger invalidations from database changes |
Cache tiers
L1: In-process cache
Each router instance maintains an in-process cache for:
Query plans: Avoid re-planning identical queries
APQ (Automatic Persisted Queries): Map query hashes to full query text
This cache is local to each router instance and provides microsecond-level latency. Configure query plan caching:
```yaml
supergraph:
  query_planning:
    cache:
      in_memory:
        limit: 512 # Number of query plans to cache
```

For high-traffic deployments, you can also back the query plan cache with Redis for sharing across instances. See Query Plan Caching.
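If you do share query plans through Redis, the configuration might look like the following sketch. The Redis URL is an assumption (reusing the regional instance referenced elsewhere in this guide); check Query Plan Caching for the authoritative schema.

```yaml
supergraph:
  query_planning:
    cache:
      in_memory:
        limit: 512
      redis:
        # Assumed to point at the regional Redis instance used elsewhere in this guide
        urls: ["${env.REDIS_URL}"]
```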
L2: Regional Redis
Regional Redis provides a shared cache for all router instances in a region. This is where response caching stores cached subgraph responses.
```yaml
response_cache:
  enabled: true
  subgraph:
    all:
      enabled: true
      ttl: 3600s # 1 hour default TTL
      redis:
        urls: ["redis://redis.us-east1.internal:6379"]
        pool_size: 10
        namespace: "router:response_cache"
```

For multi-region deployments, configure each region's routers to use the regional Redis instance:
Routers in us-east1:

```yaml
response_cache:
  subgraph:
    all:
      redis:
        urls: ["redis://redis.us-east1.internal:6379"]
```

Routers in europe-west1:

```yaml
response_cache:
  subgraph:
    all:
      redis:
        urls: ["redis://redis.europe-west1.internal:6379"]
```

Or use an environment variable so the same configuration file works in every region:

```yaml
response_cache:
  subgraph:
    all:
      redis:
        urls: ["${env.REDIS_URL}"]
```

Redis high availability
For production deployments, use Redis with high availability:
Redis Cluster: Horizontal scaling with automatic sharding
Redis Sentinel: Automatic failover for single-primary setups
Managed Redis: Cloud provider managed services (AWS ElastiCache, GCP Memorystore, Azure Cache for Redis)
```yaml
response_cache:
  subgraph:
    all:
      redis:
        urls: ["redis-cluster://node1:6379?node=node2:6379&node=node3:6379"]
```

See Redis URL Configuration for connection string formats.
L3: Global distributed store (optional)
For data that's expensive to compute and rarely changes, you can add a global L3 cache tier using a distributed database like Cloud Bigtable, DynamoDB Global Tables, or CockroachDB.
The L3 tier is not a built-in router feature. You would implement it as:
A coprocessor that checks L3 before forwarding to subgraphs
Custom logic in your subgraphs that checks L3 before querying origin databases
This tier is only necessary for specific use cases where cross-region cache sharing provides significant cost savings.
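As a sketch of the first option, the router's coprocessor hook can intercept subgraph requests and route them through a service that consults the L3 store before the request reaches the subgraph. The URL and payload flags below are assumptions for illustration; check the coprocessor documentation for the exact schema.

```yaml
coprocessor:
  # Hypothetical service that checks the global L3 store before the subgraph is called
  url: "http://l3-cache-coprocessor.internal:8081"
  subgraph:
    all:
      request:
        service_name: true # include the target subgraph name in the coprocessor payload
        body: true         # include the subgraph request body so the coprocessor can build a cache key
```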
Subgraph caching
Subgraphs can maintain their own Redis cache, independent of the router's response cache. This is useful when:
Subgraphs have expensive data fetching operations
Multiple fields share the same underlying data
You want caching at the resolver level
The router's response cache and subgraph caches serve different purposes:
| Cache | What it caches | Invalidation |
|---|---|---|
| Router response cache | Subgraph HTTP responses (entity representations) | Via router invalidation API |
| Subgraph cache | Resolver-level data, database query results | Subgraph-specific logic |
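As an illustration of subgraph-side caching, a resolver can check its own Redis instance before hitting the origin database. This sketch uses TypeScript with node-redis; the key scheme, TTL, and data-access helper are assumptions, not router features.

```typescript
import { createClient } from 'redis';

const redis = createClient({ url: process.env.SUBGRAPH_REDIS_URL });
await redis.connect();

// Placeholder for the subgraph's real data access layer.
async function loadProductFromDb(id: string): Promise<{ id: string; name: string }> {
  return { id, name: 'example' };
}

// Hypothetical resolver-level cache for a products subgraph.
async function resolveProduct(id: string) {
  const key = `product:${id}`;

  // Serve from the subgraph's own cache when possible.
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  // Fall back to the origin database on a miss.
  const product = await loadProductFromDb(id);

  // Cache for 5 minutes; invalidating this entry is subgraph-specific logic.
  await redis.set(key, JSON.stringify(product), { EX: 300 });
  return product;
}
```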
Event-driven invalidation
In a multi-region deployment, cache invalidation must propagate to all regions. Use a pub/sub system to broadcast invalidation events.
Invalidation flow
Router invalidation endpoint
Configure each router to expose an invalidation endpoint:
```yaml
response_cache:
  enabled: true
  invalidation:
    listen: "0.0.0.0:4000"
    path: "/invalidation"
  subgraph:
    all:
      enabled: true
      redis:
        urls: ["${env.REDIS_URL}"]
      invalidation:
        enabled: true
        shared_key: "${env.INVALIDATION_SHARED_KEY}"
```

Invalidation service
Create an invalidation service in each region that:
Subscribes to the pub/sub topic
Transforms events into router invalidation requests
Calls the router's invalidation endpoint
Example invalidation request:
```bash
curl --request POST \
  --header "Authorization: ${INVALIDATION_SHARED_KEY}" \
  --header "Content-Type: application/json" \
  --url http://router:4000/invalidation \
  --data '[{
    "kind": "cache_tag",
    "subgraphs": ["products"],
    "cache_tag": "product-42"
  }]'
```

See Cache Invalidation for all invalidation methods.
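The invalidation service itself can be a small subscriber process. The sketch below assumes Redis pub/sub as the transport and a JSON event shape of `{ subgraph, cacheTag }`; if you use Kafka, Google Pub/Sub, or SNS/SQS instead, only the subscription code changes. The channel name, event shape, and URLs are assumptions for illustration.

```typescript
import { createClient } from 'redis';

// Subscribe to the regional copy of the invalidation topic (channel name is an assumption).
const subscriber = createClient({ url: process.env.PUBSUB_REDIS_URL });
await subscriber.connect();

await subscriber.subscribe('cache-invalidation', async (message) => {
  // Assumed event shape: { "subgraph": "products", "cacheTag": "product-42" }
  const event = JSON.parse(message) as { subgraph: string; cacheTag: string };

  // Forward the event to the router's invalidation endpoint,
  // using the same request format as the curl example above.
  await fetch('http://router:4000/invalidation', {
    method: 'POST',
    headers: {
      Authorization: process.env.INVALIDATION_SHARED_KEY ?? '',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify([
      {
        kind: 'cache_tag',
        subgraphs: [event.subgraph],
        cache_tag: event.cacheTag,
      },
    ]),
  });
});
```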
Change data capture
Use change data capture (CDC) to automatically trigger invalidations when database records change:
Debezium: Open source CDC for various databases
Cloud-native CDC: AWS DMS, GCP Datastream, Azure Data Factory
CDC captures database changes and publishes them to your pub/sub system, which then triggers cache invalidation across all regions.
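To connect CDC to this flow, a small consumer can translate change events into the invalidation events that the regional subscribers expect. The sketch below assumes a simplified Debezium-style envelope and a products table keyed by id; real envelopes vary by connector and version.

```typescript
import { createClient } from 'redis';

// Simplified Debezium-style change event; real payloads vary by connector and version.
interface ChangeEvent {
  payload: {
    op: 'c' | 'u' | 'd';              // create, update, delete
    before: { id: number } | null;
    after: { id: number } | null;
    source: { table: string };
  };
}

const publisher = createClient({ url: process.env.PUBSUB_REDIS_URL });
await publisher.connect();

// Map a change on the (hypothetical) products table to an invalidation event
// and broadcast it on the channel the regional subscribers listen to.
export async function handleChange(event: ChangeEvent) {
  if (event.payload.source.table !== 'products') return;
  const row = event.payload.after ?? event.payload.before;
  if (!row) return;

  await publisher.publish(
    'cache-invalidation',
    JSON.stringify({ subgraph: 'products', cacheTag: `product-${row.id}` }),
  );
}
```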
Edge layer integration
CDN caching
A CDN can cache GraphQL responses at the edge for read-heavy workloads. This works best for:
GET requests: Queries sent as GET requests with query parameters
Public data: Responses without user-specific content
High cache hit rates: Popular queries requested by many users
Configure your CDN to:
Cache responses based on the full URL (including query parameters)
Respect Cache-Control headers from the router
Forward cache misses to the nearest router region
The router includes Cache-Control headers in responses based on the minimum TTL of cached entities.
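For example, if the shortest remaining TTL among the cached entities in a response is one hour, the response header might look like `cache-control: max-age=3600`; the exact directives depend on your configuration and the subgraph responses.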
APQ with GET requests
Automatic Persisted Queries (APQ) enable sending queries as GET requests, making them cacheable by CDNs:
```yaml
apq:
  enabled: true
  router:
    cache:
      redis:
        urls: ["${env.REDIS_URL}"]
```
6 urls: ["${env.REDIS_URL}"]With APQ, clients send a query hash instead of the full query text. The CDN can cache responses by hash, and the router resolves hashes to full queries from Redis.
Multi-region deployment
Regional router configuration
Each region needs routers configured with:
Regional Redis URL
Regional subgraph endpoints (or cross-region if subgraphs aren't deployed locally)
Use environment variables or a configuration management system to manage region-specific settings.
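For example, with Kubernetes you might inject the region-specific values referenced by `${env.REDIS_URL}` and `${env.INVALIDATION_SHARED_KEY}` through the router Deployment. The names below are illustrative.

```yaml
# Excerpt from a hypothetical router Deployment for us-east1; the same template
# is stamped out per region with different values.
env:
  - name: REDIS_URL
    value: "redis://redis.us-east1.internal:6379"
  - name: INVALIDATION_SHARED_KEY
    valueFrom:
      secretKeyRef:
        name: router-secrets
        key: invalidation-shared-key
```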
Cross-region subgraph routing
In the architecture diagram, europe-west1 doesn't have local subgraphs—it routes to us-east1 subgraphs cross-region. This is a valid pattern when:
Some regions only need router + cache (read-heavy, latency-tolerant)
Subgraph deployment is expensive or complex
Data residency requirements allow it
Configure cross-region routing with longer timeouts to account for network latency:
```yaml
traffic_shaping:
  all:
    timeout: 30s # Longer timeout for cross-region calls
  subgraphs:
    products:
      timeout: 45s # Even longer for slow subgraphs
```

Monitoring
Monitor cache effectiveness across regions:
```yaml
telemetry:
  instrumentation:
    instruments:
      cache:
        apollo.router.operations.response_cache:
          attributes:
            subgraph.name:
              subgraph_name: true
```

Key metrics to track:
| Metric | What it tells you |
|---|---|
| apollo.router.operations.response_cache.hit | Cache hit rate by subgraph |
| apollo.router.operations.response_cache.miss | Requests hitting origin |
| apollo.router.cache.storage.estimated_size | Cache memory usage |
| Redis latency | Network overhead for cache operations |
See Response Cache Observability for detailed monitoring guidance.
Implementation checklist
Use this checklist when implementing the architecture:
Redis per region: Deploy Redis with high availability in each region
Router fleet: Deploy multiple router replicas per region behind a load balancer
Invalidation endpoint: Configure and secure the invalidation endpoint
Pub/Sub: Set up pub/sub topics for invalidation events
Invalidation services: Deploy subscribers in each region
CDC (optional): Configure change data capture for automatic invalidation
CDN: Configure CDN caching rules for GET requests
Monitoring: Set up dashboards for cache metrics across regions
Alerting: Alert on cache hit rate drops, Redis connectivity issues