Fetching from REST

Using RESTDataSource to fetch data from REST APIs


💡 tip
Learn how Apollo Connectors simplify incorporating REST APIs into your graph.

See the @apollo/datasource-rest README for the full details of the RESTDataSource API.

The RESTDataSource class simplifies fetching data from REST APIs and helps handle caching, request deduplication, and errors while resolving operations.

For more information about fetching from data sources other than a REST API, see Fetching Data.

Creating subclasses

To get started, install the @apollo/datasource-rest package:

Bash
npm install @apollo/datasource-rest

Your server should define a separate subclass of RESTDataSource for each REST API it communicates with. Here's an example of a RESTDataSource subclass that defines two data-fetching methods, getMovie and getMostViewedMovies:

TypeScript
movies-api.ts
import { RESTDataSource } from '@apollo/datasource-rest';

// Movie is assumed to be a type defined elsewhere in your project.
class MoviesAPI extends RESTDataSource {
  override baseURL = 'https://movies-api.example.com/';

  async getMovie(id: string): Promise<Movie> {
    return this.get<Movie>(`movies/${encodeURIComponent(id)}`);
  }

  async getMostViewedMovies(limit = 10): Promise<Movie[]> {
    const data = await this.get('movies', {
      params: {
        per_page: limit.toString(), // all params entries should be strings
        order_by: 'most_viewed',
      },
    });
    return data.results;
  }
}

You can extend the RESTDataSource class to implement whatever data-fetching methods your resolvers need. These methods should use the built-in convenience methods (e.g., get and post) to perform HTTP requests, helping you add query parameters, parse and cache JSON results, dedupe requests, and handle errors. More complex use cases can use the fetch method directly. The fetch method returns both the parsed body and the response object, which provides more flexibility for use cases like reading response headers.
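To make the path and query-parameter handling more concrete, here is a minimal standalone sketch of how a relative path and a params object combine with a base URL. It uses only the standard URL class. The helper name buildRequestUrl is hypothetical and is not part of RESTDataSource; it only approximates the resolution that the convenience methods perform for you.

```typescript
// Hypothetical helper for illustration; not part of @apollo/datasource-rest.
function buildRequestUrl(
  baseURL: string,
  path: string,
  params: Record<string, string> = {},
): string {
  // Resolve the path against the base URL, the way a browser resolves a relative link.
  const url = new URL(path, baseURL);
  // Append each query parameter; values must already be strings.
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.append(name, value);
  }
  return url.toString();
}

const mostViewedUrl = buildRequestUrl('https://movies-api.example.com/', 'movies', {
  per_page: String(10),
  order_by: 'most_viewed',
});
console.log(mostViewedUrl);
// https://movies-api.example.com/movies?per_page=10&order_by=most_viewed
```

Because the path is resolved like a relative link, a leading slash in the path replaces the whole path of the base URL. For example, resolving /movies against https://api.example.com/v1/ yields https://api.example.com/movies, which is why the example above uses a trailing slash on baseURL and no leading slash on the path.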

Adding data sources to your server's context function

In the examples below, we use top-level await calls to start our server asynchronously. Check out our Getting Started guide to see how we configured our project to support this.

You can add data sources to the context initialization function, like so:

TypeScript
index.ts
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';
// typeDefs, resolvers, MoviesAPI, and PersonalizationAPI are assumed to be
// defined or imported elsewhere in your project.

interface ContextValue {
  dataSources: {
    moviesAPI: MoviesAPI;
    personalizationAPI: PersonalizationAPI;
  };
}

const server = new ApolloServer<ContextValue>({
  typeDefs,
  resolvers,
});

const { url } = await startStandaloneServer(server, {
  context: async () => {
    const { cache } = server;
    return {
      // We create new instances of our data sources with each request,
      // passing in our server's cache.
      dataSources: {
        moviesAPI: new MoviesAPI({ cache }),
        personalizationAPI: new PersonalizationAPI({ cache }),
      },
    };
  },
});

console.log(`🚀  Server ready at ${url}`);

Apollo Server calls the context initialization function for every incoming operation. This means:

  • For every operation, context returns an object containing new instances of your RESTDataSource subclasses (in this case, MoviesAPI and PersonalizationAPI).

  • The context function should always create these instances itself rather than reuse instances shared across operations. The Caching section below explains why.

Your resolvers can then access your data sources from the shared contextValue object and use them to fetch data:

TypeScript
resolvers.ts
const resolvers = {
  Query: {
    movie: async (_, { id }, { dataSources }) => {
      return dataSources.moviesAPI.getMovie(id);
    },
    mostViewedMovies: async (_, __, { dataSources }) => {
      return dataSources.moviesAPI.getMostViewedMovies();
    },
    favorites: async (_, __, { dataSources }) => {
      return dataSources.personalizationAPI.getFavorites();
    },
  },
};

Caching

The RESTDataSource class provides its subclasses with two layers of caching:

  • The first layer deduplicates concurrent outgoing GET (and HEAD) requests by default. Deduplication is keyed on the request's method and URL. You can configure this behavior by overriding the requestDeduplicationPolicyFor method. For more details, see the README.

Note: In versions of RESTDataSource prior to v5, all outgoing GET requests are deduplicated. You can achieve this same behavior with the deduplicate-until-invalidated policy (explained further in the README).

  • The second layer caches the results from HTTP responses that specify HTTP caching headers.

These caching layers effectively make the RESTDataSource class a Node HTTP client that offers browser-style caching. Below, we'll dive into each layer of caching and the advantage that layer provides.

GET (and HEAD) requests and responses

Every time you instantiate a RESTDataSource subclass, under the hood that instance creates an internal cache. By default, RESTDataSource stores concurrent GET (and HEAD) requests, keyed by their method and URL, in this internal cache so that identical requests made in parallel share a single response. This behavior is called request deduplication. You can configure this default behavior by overriding the requestDeduplicationPolicyFor method on the class.

The RESTDataSource class deduplicates GET (and HEAD) requests regardless of HTTP caching headers.

The request deduplication cache enables RESTDataSource to optimize the current operation by eliminating redundant GET (and HEAD) requests from different resolvers trying to get the same information. This works much like DataLoader's caching functionality.

As an example, let's say we have two RESTDataSource subclasses for fetching data from a Posts API and an Authors API. We can write a query fetching a post's content and that post's author's name:

GraphQL
query GetPosts {
  posts {
    body
    author {
      name
    }
  }
}

The above query provides an example of the classic N+1 problem. After one request to fetch N posts, we would naively make N additional requests, one per post, to find each post's author's name (from an endpoint such as /authors/id_1).

This is a situation where RESTDataSource can optimize an operation using its cache of memoized GET requests and their responses.

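The first time RESTDataSource makes a GET request (e.g., to /authors/id_1), it stores the request's URL before making that request. RESTDataSource then performs the request and stores the result alongside the request's URL in its memoized cache. Under the default policy, the entry is removed once the request completes. Under the deduplicate-until-invalidated policy, the entry stays in the cache until it is invalidated or the instance is discarded.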

If any resolver in the current operation attempts a parallel GET request to the same URL, RESTDataSource checks its memoized cache before performing that request. If a request or a result exists in the cache, RESTDataSource returns (or waits to return) that stored result without making another request.
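The lookup described above can be sketched as a small standalone model. This is a simplified illustration of deduplicating concurrent requests, not the library's implementation: DedupingClient, Fetcher, and the fake upstream are hypothetical names, and the sketch assumes the entry is dropped as soon as the request settles.

```typescript
type Fetcher = (url: string) => Promise<unknown>;

// Hypothetical model of request deduplication; not the library's implementation.
class DedupingClient {
  private inFlight = new Map<string, Promise<unknown>>();
  private fetcher: Fetcher;

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  get(url: string): Promise<unknown> {
    const key = `GET ${url}`; // key on HTTP method and URL
    const existing = this.inFlight.get(key);
    if (existing) return existing; // share the pending request

    const promise = this.fetcher(url).finally(() => {
      // Drop the entry once the request settles, so later calls refetch.
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, promise);
    return promise;
  }
}

async function demo(): Promise<number> {
  // Fake upstream API that counts how often it is called.
  let upstreamCalls = 0;
  const fakeFetcher: Fetcher = async (url) => {
    upstreamCalls += 1;
    return { url, name: 'Ada' };
  };

  const client = new DedupingClient(fakeFetcher);
  // Three resolvers asking for the same author at the same time.
  await Promise.all([
    client.get('/authors/id_1'),
    client.get('/authors/id_1'),
    client.get('/authors/id_1'),
  ]);
  return upstreamCalls;
}

demo().then((calls) => console.log(calls)); // logs 1
```

Storing the promise rather than the resolved value is what lets parallel callers wait on a request that is still in flight.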

This internal caching mechanism is why we create a new RESTDataSource instance for every request. Otherwise, responses would be cached across requests even if they specify they shouldn't be!
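To illustrate the risk, the following sketch compares a shared instance with per-operation instances. MemoizingClient is a hypothetical stand-in that keeps every GET result for the lifetime of the instance, and the upstream function is a fake whose answer changes between operations; neither is part of RESTDataSource.

```typescript
// Hypothetical client that memoizes every GET result for the life of the instance.
class MemoizingClient {
  private memo = new Map<string, Promise<string>>();
  private fetcher: (url: string) => Promise<string>;

  constructor(fetcher: (url: string) => Promise<string>) {
    this.fetcher = fetcher;
  }

  get(url: string): Promise<string> {
    let cached = this.memo.get(url);
    if (!cached) {
      cached = this.fetcher(url);
      this.memo.set(url, cached);
    }
    return cached;
  }
}

async function compare(): Promise<{ shared: string[]; perOperation: string[] }> {
  // Fake upstream whose answer changes between operations.
  let version = 0;
  const upstream = async (_url: string) => `profile-v${version}`;

  const sharedClient = new MemoizingClient(upstream);
  const shared: string[] = [];
  const perOperation: string[] = [];

  for (let operation = 1; operation <= 2; operation++) {
    version = operation;
    // One instance reused across operations.
    shared.push(await sharedClient.get('/me'));
    // A fresh instance for each operation.
    perOperation.push(await new MemoizingClient(upstream).get('/me'));
  }
  return { shared, perOperation };
}

compare().then((result) => console.log(result));
// shared:       [ 'profile-v1', 'profile-v1' ]  (second operation sees stale data)
// perOperation: [ 'profile-v1', 'profile-v2' ]
```

In this model, the shared instance answers the second operation from its memo, while the per-operation instances always start with an empty cache.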

You can change how GET (and HEAD) requests are stored in RESTDataSource's deduplication cache by overriding the cacheKeyFor method. By default, a request's cache key is the combination of its HTTP method and URL.
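As a rough illustration of why a custom key can be useful, the standalone sketch below compares a method-plus-URL key with one that ignores a query parameter. The function names, the trace_id parameter, and the authors-api.example.com host are all assumptions made for this example. In a real subclass this logic would live in your cacheKeyFor override; check the README for that method's exact signature.

```typescript
// Key in the default style described above: HTTP method plus URL.
function defaultCacheKey(method: string, url: URL): string {
  return `${method} ${url.href}`;
}

// Hypothetical custom key that ignores an assumed `trace_id` query parameter,
// so requests that differ only by that parameter share one cache entry.
function cacheKeyIgnoringTraceId(method: string, url: URL): string {
  const normalized = new URL(url.href); // copy so the caller's URL is untouched
  normalized.searchParams.delete('trace_id');
  return `${method} ${normalized.href}`;
}

const first = new URL('https://authors-api.example.com/authors/id_1?trace_id=abc');
const second = new URL('https://authors-api.example.com/authors/id_1?trace_id=xyz');

console.log(defaultCacheKey('GET', first) === defaultCacheKey('GET', second)); // false
console.log(
  cacheKeyIgnoringTraceId('GET', first) === cacheKeyIgnoringTraceId('GET', second),
); // true
```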

To restore the deduplication policy from before RESTDataSource v5, you can configure requestDeduplicationPolicyFor like so: