The n+1 problem

Overview

Our GraphQL API is already equipped to serve up some basic listing data. We can run a query for featured listings, or ask for one listing in particular.

Furthermore, for each listing, we can also query for data about each Amenity it has to offer. But right now, we're facing a big performance issue with how this is implemented.

In this lesson, we will:

Learn about the n+1 problem
Discuss how to resolve it

Listings & amenities

To see our performance bottleneck in action, let's run a test query against our GraphQL API. It will query for a list of featured listings, along with some basic details about each listing's amenities.

A mock-up for Airlock, showing a row of featured listings and their amenities

Make sure the app is running by executing the following command in the root of the project.

./gradlew bootRun

Now, let's navigate to Apollo Sandbox Explorer at https://studio.apollographql.com/sandbox/explorer and paste the address of our locally running server into the input at the top of the screen. By default, our server should be running on the following address.

http://localhost:8080/graphql

A screenshot of the Apollo Sandbox Explorer, highlighting the connection input with the locally running server's address

Let's begin our query by selecting the featuredListings field from our Query type in the Documentation panel. For each featured listing we query, we'll request the basics: just an id and title, along with a list of its amenities.

For each Amenity the listing has, we'll return id, name, and category.

Here's what our query should look like.

A query for featured listings and their amenities

query GetFeaturedListingsAmenities {
  featuredListings {
    id
    title
    amenities {
      id
      name
      category
    }
  }
}
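
If you'd rather send this query from code than from Sandbox, here's a minimal sketch using the JDK's built-in java.net.http client (Java 11 or later). The class name SendFeaturedListingsQuery and the buildRequest helper are made up for illustration and are not part of the course project. We're assuming the server accepts a standard GraphQL-over-HTTP POST with a JSON body at the address shown earlier.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class, not part of the course project
class SendFeaturedListingsQuery {
    // The same operation as above, collapsed onto one line so it can be
    // embedded in a JSON string without escaping newlines or quotes
    static final String QUERY =
        "query GetFeaturedListingsAmenities { featuredListings { id title amenities { id name category } } }";

    // Builds a POST request carrying the query in a JSON body
    static HttpRequest buildRequest(String endpoint) {
        String body = "{\"query\": \"" + QUERY + "\"}";
        return HttpRequest.newBuilder(URI.create(endpoint))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumes the server from ./gradlew bootRun is listening locally
        HttpRequest request = buildRequest("http://localhost:8080/graphql");
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

Running main only works while the server is up. Either way, the query is resolved the same on the server, so the terminal output we look at next doesn't depend on which client sent it.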

Let's take this query for a spin and... we get data back! Great. So what's the problem, exactly?

To find out, we'll take a closer look at our terminal where our server is running. Run the query again, and... did you catch that? The terminal filled up with statements logging out:

The output every time we call the REST API

Calling for featured listings
Calling for amenities for listing listing-1
Calling for amenities for listing listing-2
Calling for amenities for listing listing-3

We see one line for the featured listings request, plus one line for each listing ID. Each of these lines represents a single request across the network to our data source. That's more requests than we probably expected from our lean and precise GraphQL query! Let's dive into what's happening here.

For every listing, a new request

The problem here is that we're making one request for the list of featured listings, and an additional request for each listing's list of amenities.

Here's a breakdown of how our query for featured listings and their amenities is resolved.

To get that list of featured listings, our datafetcher first calls the ListingService method that makes a request to the GET /featured-listings endpoint. This returns a JSON object containing our basic listing details.

A diagram showing the data that is returned when we query for featured listings

But this response doesn't actually contain any information about a listing's amenities. This means we make another request to GET /listings/{listing_id}/amenities for each listing, passing in its ID as the {listing_id} parameter.

A diagram showing the followup request needed for each listing's amenities

This extra request gets us the amenity data we need, but it has a hidden cost: every time the Listing.amenities datafetcher is executed, we make a new request to the REST API for amenities data.
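
To make that request pattern concrete, here's a small simulation in plain Java. It's only a sketch: FakeListingService is a made-up stand-in that records log lines instead of calling the REST API, and it is not the course's actual ListingService or datafetcher code. The amenity values are placeholders.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NPlusOneDemo {
    // Mimics how the query is resolved: one call for the list of listings,
    // then one call per listing for its amenities
    static Map<String, List<String>> resolve(FakeListingService service) {
        Map<String, List<String>> amenitiesByListing = new LinkedHashMap<>();
        for (String listingId : service.featuredListingIds()) {
            amenitiesByListing.put(listingId, service.amenitiesFor(listingId));
        }
        return amenitiesByListing;
    }

    public static void main(String[] args) {
        FakeListingService service = new FakeListingService();
        resolve(service);
        service.log.forEach(System.out::println);
        System.out.println("Total requests: " + service.log.size());
    }
}

// Hypothetical stand-in for a REST-backed service (illustration only)
class FakeListingService {
    final List<String> log = new ArrayList<>();

    // Stands in for GET /featured-listings
    List<String> featuredListingIds() {
        log.add("Calling for featured listings");
        return List.of("listing-1", "listing-2", "listing-3");
    }

    // Stands in for GET /listings/{listing_id}/amenities
    List<String> amenitiesFor(String listingId) {
        log.add("Calling for amenities for listing " + listingId);
        return List.of("placeholder-amenity-for-" + listingId);
    }
}
```

Running this prints the same four "Calling for..." lines we saw in the terminal, followed by "Total requests: 4". That's one request for the list and one for each of the three listings.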

The n+1 problem

This is the n+1 problem in action. We start with an initial request (the 1 in the n+1 equation), and this first request determines how many follow-up requests will be necessary (the n in the n+1 equation). The number of required follow-up requests, n, is not known until our first request is executed.

We saw this in action: our first request gave us our featured listings (there were three), but we then needed a follow-up request per listing to get the listing's amenities data.

This doesn't look too bad with just a few additional requests, but it leads to some troubling situations as our queries scale. Imagine a list contains twenty-five listings ("Top 25 Sub-zero Summer Destinations!"); populating the data for a list like this means we'll send a total of 26 requests! One request to fetch listing data, and 25 additional requests to get the amenity information for each listing!
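
The arithmetic behind those numbers is simple enough to sketch in a few lines of Java. The RequestCount class and the 100-listing case are our own illustration, not something from the course project.

```java
class RequestCount {
    // 1 initial request for the list, plus 1 follow-up request per listing in it
    static int totalRequests(int listingCount) {
        return 1 + listingCount;
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 25, 100}) {
            System.out.println(n + " listings -> " + totalRequests(n) + " requests");
        }
    }
}
```

This prints 4 requests for our three featured listings, 26 for the top-25 list, and 101 for a list of one hundred: the request count grows in step with the size of the list.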

Practice

Which of the following situations illustrate the n+1 problem?

A query for the top ten best-selling books makes one request for the list of books, and a follow-up request for each book's author information.
A query for a single listing on a vacation rental site requests its name, description, and all of its reviews.
A query for product details makes one request to the database for some unknown number of products.

Key takeaways

The n+1 problem occurs when we make an initial request, followed by n follow-up requests, where n is only known once the initial request returns.

Up next

Let's dive into data loaders and how they help us solve this pesky problem.
