3. Data loaders under the hood

Overview

In this lesson, we will:

  • Discuss what a data loader is
  • Introduce the dataloader package
  • Review the requirements of a data loader
  • Walk through how a data loader works under the hood

Data loaders

To solve the n+1 problem in our application, we'll use data loaders.

A data loader's primary job is to replace multiple similar requests with a single batched request. In our example, we saw three near-identical requests that used a particular listing ID to return amenity data. With a data loader, this becomes a single request that fetches data for all three listings at once.

A diagram showing listing IDs being batched by a data loader

We'll bring the power of data loaders into our class using the GraphQL dataloader package.

The dataloader package

The dataloader package is a utility that provides batching and caching capabilities. This lets us batch together various keys in a single request—serving up all the data, all at once!

Let's illustrate this with the following query.

A query for featured listings and their amenities
query GetFeaturedListingsAmenities {
  featuredListings {
    id
    title
    amenities {
      id
      name
      category
    }
  }
}

For each featured listing object this query resolves, the Listing.amenities resolver will be invoked. Currently, this resolver uses a listing's ID to resolve each request independently. (That's what results in three separate network requests to the same endpoint!)

The behavior we want is quite different: we want the data loader to collect all the listing IDs involved in the query (also called keys), and execute a single request.

In our example, this means that when the Listing.amenities resolver is called using each listing's ID, it won't call the ListingAPI's method that hits our REST API directly; instead, it will pass each key to a data loader method, which can batch together all the keys needed for the entire query.

A diagram showing listing IDs batched by a data loader

Once the individual listing IDs are gathered in one list, the data loader can assume the responsibility of actually requesting data from the data source. It's able to dispatch a single request to the REST API endpoint for all of the IDs at once, which is a huge performance boost over letting the resolver initiate a network request for each!
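
To make the collect-then-dispatch idea concrete, here's a minimal hand-rolled sketch. It is not the dataloader package's API: the KeyCollector class, its load and dispatch methods, and the fakeFetchAmenities stand-in are all hypothetical names invented for illustration. The real package schedules the dispatch for us; here we trigger it by hand so we can see that only one batched call goes out.

```typescript
// Hypothetical sketch of batching; NOT the dataloader package's API.
type BatchFn<K, V> = (keys: readonly K[]) => V[];

class KeyCollector<K, V> {
  private queue: K[] = [];
  private batchFn: BatchFn<K, V>;

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  // Called once per resolver invocation; nothing is fetched yet.
  load(key: K): void {
    this.queue.push(key);
  }

  // Sends every queued key to the batch function in ONE call,
  // then pairs each key with the value at the same position.
  dispatch(): Map<K, V> {
    const keys = this.queue;
    this.queue = [];
    const values = this.batchFn(keys);
    return new Map(keys.map((key, i): [K, V] => [key, values[i]]));
  }
}

// Stand-in for the REST call (an assumption); it counts its invocations.
let requestCount = 0;
const fakeFetchAmenities = (ids: readonly string[]): string[][] => {
  requestCount += 1;
  return ids.map((id) => [`wifi for ${id}`]);
};

const collector = new KeyCollector(fakeFetchAmenities);
collector.load("listing-1");
collector.load("listing-2");
collector.load("listing-3");
const amenitiesByListing = collector.dispatch();
```

Three load calls lead to a single call to the batch function, so requestCount ends at 1 while amenitiesByListing holds an entry for each of the three listings.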

Best of all, our data loader method will automatically deduplicate the identifiers we pass it. This means if our query included multiple listings with the same ID, we'll only request the listing's amenities once.

A diagram showing the data loader making a single REST request with all the collected IDs
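
The effect of deduplication can be sketched in a few lines. The dataloader package takes care of this internally, so the dedupeKeys helper below is purely hypothetical and only shows what the outgoing list of keys ends up looking like.

```typescript
// Hypothetical helper: keeps the first occurrence of each key, preserving order.
function dedupeKeys(keys: readonly string[]): string[] {
  // A Set remembers insertion order and silently ignores repeats.
  return Array.from(new Set(keys));
}

// "listing-1" shows up twice in the query results...
const requestedIds = ["listing-1", "listing-2", "listing-1", "listing-3"];

// ...but appears only once in the list we would send to the REST API.
const uniqueIds = dedupeKeys(requestedIds);
// uniqueIds is ["listing-1", "listing-2", "listing-3"]
```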

What a data loader needs

Data loaders are exactly what we need to solve the performance issues in our app—but they come with a few requirements for us to consider. Let's walk through each of these one by one.

Data for multiple objects at once

Let's imagine our data loader has collected all of the different keys involved in our query, and it's ready to fire off a request for data. What does it need next?

Well, if we think about the REST API endpoint we've used previously to return amenity data, we'll quickly see the problem: right now, the Listing.amenities resolver relies on the ListingAPI method getAmenities to send each listing ID individually to the GET /listings/{listing_id}/amenities endpoint, which only returns data for a single listing.

That brings us to the first big requirement for a data loader to work as expected: we need a data source that can resolve a request for multiple objects simultaneously. In practice, this means that our data loader should be able to send off a list of keys (such as ["listing-1", "listing-2", "listing-3"]) and get back data for all of them.

The good news is that we do have a different endpoint in our REST API that we can use to request amenity data for multiple listings: GET /amenities/listings. It accepts multiple listing IDs joined as a single parameter string called ids, and returns data for them all at once.

A REST endpoint that receives multiple keys, and returns multiple values
GET /amenities/listings?ids=listing-1,listing-2,listing-3
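
As a rough sketch, building that request path from a list of keys could look like the following. The path and the ids parameter come from the endpoint above; the buildAmenitiesPath function name is hypothetical, and encoding each ID individually is a defensive assumption rather than something the endpoint is known to require.

```typescript
// Path taken from the lesson; the function and constant names are hypothetical.
const AMENITIES_BATCH_PATH = "/amenities/listings";

function buildAmenitiesPath(listingIds: readonly string[]): string {
  // Encode each ID on its own, then join with literal commas
  // so the result matches the ids=a,b,c format shown above.
  const ids = listingIds.map((id) => encodeURIComponent(id)).join(",");
  return `${AMENITIES_BATCH_PATH}?ids=${ids}`;
}

const batchedPath = buildAmenitiesPath(["listing-1", "listing-2", "listing-3"]);
// batchedPath is "/amenities/listings?ids=listing-1,listing-2,listing-3"
```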

For every key, a value

When a data loader sends off a list of keys in a request, it has a very clear expectation from the data source providing the data: the number of objects returned should be equal to the number of keys that were sent in the request. The values returned should also be in the same order as the corresponding keys that were requested.

Let's break down this expectation and how the data source satisfies it.

In the process of resolving a query, our resolver might call the data loader three times, passing it three keys. The data loader groups them together into one list (["listing-1", "listing-2", "listing-3"]), then calls the GET /amenities/listings endpoint with them. What does it expect back? Well, it put in a list of three keys; it expects a list of exactly three objects back!

A diagram showing a request with three listing IDs; three lists of amenities are returned

For each key requested, we expect our data source to return a list of amenities. This is because each listing can have more than one amenity associated with it. If we request amenity data for three listing IDs, therefore, our response should consist of three lists of amenities.

A diagram showing three listing IDs; and three lists of amenities that map to them

Note: The data loader also expects each object returned to align with the position of its corresponding key in the original request. For instance, if "listing-1" was sent as the first key, its list of amenities should be the first object in the response!

From there, the data loader handles the logic of mapping each list of amenities back to the key that requested it.
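
If a data source can't guarantee that ordering on its own, one common approach is to realign the response before handing it back to the data loader. Here's a minimal sketch of that idea. The ListingAmenities record shape and the alignToKeys helper are assumptions made for illustration; the real endpoint's response may look different.

```typescript
type Amenity = { id: string; name: string };

// Hypothetical response record; the real endpoint's shape may differ.
type ListingAmenities = { listingId: string; amenities: Amenity[] };

function alignToKeys(
  keys: readonly string[],
  records: readonly ListingAmenities[]
): Amenity[][] {
  // Index the response by listing ID for quick lookup.
  const byListing = new Map<string, Amenity[]>();
  for (const record of records) {
    byListing.set(record.listingId, record.amenities);
  }
  // One entry per key, in key order; a listing with no record gets an empty list.
  return keys.map((key) => byListing.get(key) ?? []);
}

const keys = ["listing-1", "listing-2", "listing-3"];

// Out of order, and missing "listing-2" entirely.
const response: ListingAmenities[] = [
  { listingId: "listing-3", amenities: [{ id: "am-2", name: "Pool" }] },
  { listingId: "listing-1", amenities: [{ id: "am-1", name: "Wifi" }] },
];

const aligned = alignToKeys(keys, response);
```

The result always has the same length as keys: here, aligned[0] holds the Wifi amenity for "listing-1", aligned[1] is an empty list, and aligned[2] holds the Pool amenity for "listing-3".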

Data loader scope

There's one last important point to keep in mind. Data loaders and the set of keys they process at any one time should be limited to a single request. Accordingly, our implementation in the next lesson will ensure that a new DataLoader instance is created per request.

This means that if we run one query for listing data, then a second query, the keys from both queries will NOT be batched together. Instead, each query will be resolved separately.

A diagram demonstrating how a data loader handles two queries separately
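
Per-request scope can be sketched like this. The KeyBatch class is a hypothetical stand-in that only records the keys it receives (it is not the package's DataLoader class), and createRequestContext is an assumed factory that would run once for each incoming request.

```typescript
// Hypothetical stand-in for a loader: it only records the keys it was given.
class KeyBatch {
  private keys: string[] = [];

  load(key: string): void {
    this.keys.push(key);
  }

  // Returns a copy of the keys collected so far.
  pending(): string[] {
    return this.keys.slice();
  }
}

// Hypothetical context factory, called once per incoming request,
// so every request starts with a fresh, empty loader.
function createRequestContext(): { amenitiesLoader: KeyBatch } {
  return { amenitiesLoader: new KeyBatch() };
}

const firstRequest = createRequestContext();
const secondRequest = createRequestContext();

firstRequest.amenitiesLoader.load("listing-1");
firstRequest.amenitiesLoader.load("listing-2");
secondRequest.amenitiesLoader.load("listing-3");
```

Because each request gets its own instance, the first loader only ever sees "listing-1" and "listing-2", and the second only sees "listing-3". Keys never leak from one request into another.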

With these conceptual points cleared up, let's turn our attention back to the code. We'll update our Listing.amenities resolver to call a new method in our ListingAPI class, and benefit from the power of a data loader!

Key takeaways

  • Data loaders let us batch a list of identifiers (such as IDs) in a single request rather than sending an individual request for each.
  • Before data loaders can work properly, our data source (whether another API, or a database) needs to implement a method that accepts multiple keys (such as IDs), and returns multiple objects.
  • The number of objects a data loader receives from a data source should be equal to the number of keys the data loader collected, and they should arrive in the same order as those keys. (For instance, if a data loader requests data for three listings, it should receive exactly three lists of amenities back!)

Up next

We've learned about data loaders and the problem that they solve in our application. We also have a REST endpoint that accepts multiple listing IDs and returns amenity data for all of them. Next up, we'll implement the data loader logic that gathers up multiple listing IDs in a single request.
