
Overload Protection

Implement overload protection for high traffic scenarios


A GraphQL server can implement overload protection to help it remain available under high load. With overload protection, a server monitors its resource usage and begins shedding incoming traffic whenever that usage approaches a performance-degrading limit (such as running out of memory).
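
The monitor-and-shed behavior described above can be sketched as a small decision function. This is a minimal illustration rather than the implementation of any particular package: the names `limits`, `isOverloaded`, and `admit`, and the threshold values, are assumptions chosen for the example.

```javascript
// Hypothetical thresholds; tune them for your own service.
const limits = {
  maxEventLoopDelayMs: 42,
  maxHeapUsedBytes: 512 * 1024 * 1024,
};

// Returns true when any sampled metric exceeds its limit.
// A limit of 0 disables that check.
function isOverloaded(sample, limits) {
  const overDelay =
    limits.maxEventLoopDelayMs > 0 &&
    sample.eventLoopDelayMs > limits.maxEventLoopDelayMs;
  const overHeap =
    limits.maxHeapUsedBytes > 0 &&
    sample.heapUsedBytes > limits.maxHeapUsedBytes;
  return overDelay || overHeap;
}

// Decides how to answer a request given the latest resource sample:
// shed the request with a 503 when overloaded, otherwise let it through.
function admit(sample, limits) {
  return isOverloaded(sample, limits)
    ? {status: 503, headers: {'Retry-After': '1'}}
    : {status: 200, headers: {}};
}
```

In a real Node.js server, the heap figure in the sample could come from `process.memoryUsage().heapUsed`, and the check would run in middleware before any request handling begins.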

As you add capabilities and users to your supergraph, you might introduce new usage patterns that add unexpectedly high load. Overload protection helps reduce the impact of these spikes while you optimize your supergraph to eliminate them entirely.

Example scenarios

A common source of overload in a system is the thundering herd problem, where a large number of processes or clients attempt to access limited computing resources at the same time. Many scenarios can cause this, for example:

  • Pod failures in Kubernetes cause a smaller number of pods to handle the same amount of traffic.
  • A marketing campaign or viral social media post drives high traffic to an application in a short period.
  • A newly deployed feature introduces more load on the supergraph than expected.

A more graph-specific problem is adding an entity relationship to the schema that causes a significant increase in traffic. For example, imagine that the Star Wars schema had no link from Person to Film (through PersonFilmsConnection) and that the link was added today. Until usage of the new connection levels off, every deployment or event that drives traffic could put a large amount of new load, directly attributable to the change, on the owner of the Film entity.

Implementing in Express

Overload protection packages are available for most popular languages and server frameworks. As an example, we'll look at using the overload-protection package with the @apollo/server package. This drop-in package enables your server to return a 503 based on any of the following:

  • The current event loop delay
  • The number of bytes used by the heap
  • The number of bytes used by the Resident Set Size (RSS)

To use overload-protection, you include it in your Express startup like so:

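import express from 'express';
import protect from 'overload-protection';

const app = express();

// Register the middleware before your routes so overloaded
// requests are rejected with a 503 as early as possible.
app.use(protect('express'));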

If you're currently using the startStandaloneServer function, you'll need to switch to expressMiddleware before adding overload protection.

If you're using @apollo/server's Express integration (that is, expressMiddleware), you can add overload-protection as Express middleware when you create your server:

import {ApolloServer} from '@apollo/server';
import {expressMiddleware} from '@apollo/server/express4';
import {ApolloServerPluginDrainHttpServer} from '@apollo/server/plugin/drainHttpServer';
import express from 'express';
import http from 'http';
import cors from 'cors';
import {typeDefs, resolvers} from './schema';
import protect from 'overload-protection';

const app = express();
const httpServer = http.createServer(app);
const server = new ApolloServer({
  typeDefs,
  resolvers,
  plugins: [ApolloServerPluginDrainHttpServer({httpServer})],
});

// Note the top-level `await` calls below!
await server.start();

// Register overload protection before the GraphQL route so overloaded
// requests are rejected with a 503 before they reach Apollo Server.
app.use(protect('express'));
app.use('/graphql', cors(), express.json(), expressMiddleware(server));

await new Promise(resolve => httpServer.listen({port: 4000}, resolve));
console.log(`🚀 Server ready at http://localhost:4000/graphql`);

This approach also works if you're using the @apollo/subgraph library to create your subgraph server in a similar way.

Overload protection is not specific to GraphQL, so it's best to handle it outside of Apollo software.

Protecting a supergraph

When adding overload protection to a supergraph, a reasonable question is, "Do I add protection to my gateway/router or to my individual subgraphs?" The short answer is "both":

  • Protecting the router protects the availability of the supergraph as a whole.
  • Protecting a subgraph reduces the error rate for queries that request data from that subgraph.


The main concern with the gateway is a buildup of requests that causes partial or cascading failures. If the gateway can't shed excessive load, its performance starts to degrade.

A single request to the gateway usually fans out into multiple requests to subgraphs, so complex queries can increase load more than expected. In such cases, overload protection in the gateway can keep it from falling over entirely. The result is a temporary dip in availability instead of a total outage.


A failure in a subgraph can cause requests to back up in the gateway. If the backup is due to load, overload protection in the subgraph short-circuits the request by returning an error immediately. Because the gateway receives errors faster, pressure is relieved on both the gateway and the subgraph.

Additional protections

To further protect your graph and system against overload, consider limiting query depth and enabling rate limiting.
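
As one illustration of rate limiting, the sketch below implements a token bucket, a common technique for capping request rates. It is not tied to any Apollo API: the function name `createTokenBucket`, its parameters, and the injectable `now` clock are all assumptions made for the example.

```javascript
// A minimal token bucket holding up to `capacity` tokens,
// refilled continuously at `refillPerSec` tokens per second.
// `now` is injectable so the behavior can be tested without real timers.
function createTokenBucket(capacity, refillPerSec, now = () => Date.now()) {
  let tokens = capacity;
  let last = now();

  // Returns true if the request may proceed, false if it should be rejected.
  return function tryRemoveToken() {
    const current = now();
    const elapsedSec = (current - last) / 1000;
    last = current;
    // Add the tokens earned since the last call, capped at capacity.
    tokens = Math.min(capacity, tokens + elapsedSec * refillPerSec);
    if (tokens < 1) return false;
    tokens -= 1;
    return true;
  };
}
```

A server would typically keep one bucket per client (for example, per API key) and respond with a 429 when `tryRemoveToken()` returns false.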

© 2024 Apollo Graph Inc.