February 16, 2017

The next step for realtime data in GraphQL

Sashko Stubailo

If you’ve been following along with the GraphQL community, there’s a lot of buzz about the new RFC process for making improvements to the specification. The first feature being introduced through this process is GraphQL subscriptions, a way to add realtime data streaming to your GraphQL API. In this article, we’ll go over the backstory of how subscriptions came to be, what the proposal for subscriptions looks like, and how the RFC process is expected to work on its way to an exciting new part of the specification!

Existing GraphQL operation types

GraphQL is a great API query language for data fetching. If you haven’t heard about it or the benefits you can get, learn about it at graphql.org.

In the current version of the specification, GraphQL supports two types of operations:

  1. Queries: used for fetching data.
  2. Mutations: used for writing data.

You could see these as loosely corresponding to GET and POST requests in a REST API. In the spec, these operations are currently defined as the client sending one request, and getting one response, although people have already experimented with concepts like receiving multiple responses for one query.

Enter GraphQL subscriptions

Almost since GraphQL was released as an open source technology by Facebook in 2015, people have been having vibrant discussions about a new type of operation called GraphQL subscriptions, which adds the concept of the server pushing data to the client.

This idea was initially spurred by a talk and blog post from Laney Kuenzel and Dan Schafer from Facebook about their internal implementation of the new technology. What followed was a year of production testing at Facebook alongside experimentation in the community, largely around a community-oriented proposal and set of implementations put forth by the Apollo team.

Yesterday marked the first concrete step towards adding subscriptions to the specification: this pull request on the GraphQL spec repository from Facebook engineer Robert Zhu.

What’s a GraphQL subscription?

Before we get into the details of the proposal, what are we even talking about here? Well, in all of the discussions of subscriptions from Facebook’s conference talks and experiments in the community, here’s the high level consensus: A subscription is a GraphQL request that asks the server to push multiple results to the client in response to some server-side trigger. Just as queries and mutations start from fields on the Query and Mutation types, a subscription is represented by a field on the Subscription type in your GraphQL schema, and you can select the data you need from there. The main difference is that, unlike before, you can now get more than one result.

Example: Comment added

For example, you might have a subscription field called commentAdded(postId: ID!) which represents a stream of new comments added to a particular post. Then you could request it with the following GraphQL subscription operation:

subscription {
  commentAdded(postId: "ac55aa55") {
    comment {
      id
      content
      author { username }
    }
  } 
}

This might return several results over the lifetime of the subscription:

// Result 1
{
  "data": {
    "commentAdded": {
      "comment": {
        "id": "abc123",
        "content": "GraphQL subscriptions will be awesome!",
        "author": { "username": "sashko" }        
      }
    }
  }
}

// Result 2
{
  "data": {
    "commentAdded": {
      "comment": {
        "id": "def456",
        "content": "We should have a good RFC discussion!",
        "author": { "username": "robzhu" }        
      }
    }
  }
}

The most important thing here is that you don’t just get a small notification that the event you were interested in happened, but you can actually query for all of the data you need to know with the full power of GraphQL. Specifically, you don’t just get the ID of the new comment, but also information about the comment’s author if you need it, which might save you from making multiple roundtrips. You can start to see how this fits nicely with the core benefits of GraphQL.
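To make the model above concrete, here is a minimal in-memory sketch in JavaScript. It is only an illustration under stated assumptions: `CommentFeed`, `project`, and the object-shaped selection are hypothetical names invented for this example, not part of the RFC or of any library, and a real server would run the GraphQL execution engine instead of the simple field projection used here.

```javascript
// Hypothetical sketch: none of these names come from the RFC or a library.

// Keep only the fields named in `selection`; a nested object in the
// selection mirrors a nested selection set in a GraphQL document.
function project(value, selection) {
  const out = {};
  for (const [key, sub] of Object.entries(selection)) {
    out[key] = sub === true ? value[key] : project(value[key], sub);
  }
  return out;
}

class CommentFeed {
  constructor() {
    this.subscribers = new Map(); // id -> { postId, selection, onResult }
    this.nextId = 1;
  }

  // Register interest in new comments on one post; returns an id for removal.
  subscribe(postId, selection, onResult) {
    const id = this.nextId++;
    this.subscribers.set(id, { postId, selection, onResult });
    return id;
  }

  unsubscribe(id) {
    return this.subscribers.delete(id);
  }

  // Server-side trigger: a comment was added to a post. Every matching
  // subscriber gets one result shaped by its own selection.
  commentAdded(postId, comment) {
    for (const sub of this.subscribers.values()) {
      if (sub.postId !== postId) continue;
      const selected = project(comment, sub.selection);
      sub.onResult({ data: { commentAdded: { comment: selected } } });
    }
  }
}

const feed = new CommentFeed();
const results = [];
const subId = feed.subscribe(
  "ac55aa55",
  { id: true, content: true, author: { username: true } },
  (result) => results.push(result)
);

feed.commentAdded("ac55aa55", {
  id: "abc123",
  content: "GraphQL subscriptions will be awesome!",
  internalNote: "not selected, so never sent", // made-up extra field
  author: { username: "sashko", email: "hidden@example.com" },
});
// A comment on a different post does not trigger this subscription.
feed.commentAdded("other-post", { id: "zzz", content: "ignored", author: { username: "x" } });
feed.commentAdded("ac55aa55", {
  id: "def456",
  content: "We should have a good RFC discussion!",
  author: { username: "robzhu" },
});
```

After these calls, `results` holds two entries shaped like the two results shown above. Fields that were not selected, such as the made-up `internalNote` and `email`, never reach the subscriber.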

Another example: Webhook

Subscriptions are not limited to the exact scenario above, however. The language in the RFC, and in the spec, is intentionally not opinionated about the transport mechanisms involved to make sure that subscriptions as a concept can be used in a variety of ways.

[Image: just a small part of the data you get in a GitHub webhook.]

My favorite example is webhooks, with GitHub as a case in point. When you ask GitHub to send you a webhook whenever some event occurs, you get a huge payload with a lot of fields, as the image shows.

Not only is this a lot of data for GitHub to put together every time they send you an event, but it might not even include everything you need, causing you to have to make additional roundtrips to get more data. Also, having a schema that describes the fields available on this event in a machine readable form will save you from having to read a lot of documentation, and make it much easier to test any code that relies on webhooks, which can be notoriously difficult today. I like this example because it breaks out of what people would normally think of as a subscription, but still fits into the proposal and demonstrates a lot of the possible benefits of GraphQL for this use case.

What is a subscription *not*?

As mentioned in the RFC document, a subscription is not a live query. What does that mean? The biggest difference is the contract between the client and the server. If GraphQL had an official feature called a “live query”, you would expect it to always provide you with updates about the queried data. For example, you could run a live query on a particular post and its list of comments, and expect to get efficient updates when anything changes: the post content, comment content, new comments, deleted comments, and more. A subscription is notably different: it only reruns in response to very specific events, and is not guaranteed to notify you about all kinds of changes to data.

This makes subscriptions a much more targeted tool, and much simpler for people to implement on top of today’s backends and messaging systems than live queries, which might require a fully reactive backend.
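The difference in contract can be sketched in a few lines of JavaScript. This is an illustration only: `createSubscription` and the event names `commentAdded`, `commentEdited`, and `postEdited` are hypothetical and are not defined by the RFC.

```javascript
// Hypothetical helper and event names, invented for this sketch.
function createSubscription(triggers, onResult) {
  return {
    // The backend reports every change here; only events the client
    // subscribed to cause a push. Returns whether a result was pushed.
    handleEvent(eventName, payload) {
      if (!triggers.includes(eventName)) return false;
      onResult({ event: eventName, payload });
      return true;
    },
  };
}

const pushed = [];
const sub = createSubscription(["commentAdded"], (result) => pushed.push(result));

sub.handleEvent("commentAdded", { id: "abc123" });  // pushed to the client
sub.handleEvent("commentEdited", { id: "abc123" }); // ignored by this subscription
sub.handleEvent("postEdited", { id: "ac55aa55" });  // ignored by this subscription
```

Only the first event reaches the client. A live query on the same post would be expected to reflect all three changes, which is why it demands much more from the backend.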

What’s in a specification?

The purpose of the GraphQL specification is to be as clear as possible, but also minimal — it doesn’t include anything except the essentials. For example, the specification about queries and mutations refers to the concept of a request and a response, but doesn’t talk specifically about HTTP, proxies, CDNs, load balancing, or any other details you would actually need to think about to run a GraphQL server in production.

I expect the eventual subscriptions specification to be the same. It will specify the syntax involved, what kinds of results the GraphQL execution engine should return in different situations, and the high-level algorithms involved, but it won’t address operational concerns like picking between websockets and HTTP server push, selecting a server-side messaging system, or how to run a stateful server in a scalable way. These questions will be up to the developers implementing the system.

The specification should contain enough information to enable people to have a shared understanding of the subscriptions concept and be able to implement compatible tools, like a version of GraphiQL that enables you to inspect subscriptions. But it shouldn’t limit the diverse set of ways in which that concept can be implemented, since everyone has a different set of needs and technologies.

Next steps for the RFC

The process is off to a great start: the engineers at Facebook working on this RFC have been open to feedback and discussions with maintainers of popular GraphQL libraries. Further, it’s reassuring that the current proposal takes into account both internal needs at Facebook and the existing work from the community. Where might the RFC go from here?

Discussion and diverse use cases

The most important thing to happen next is a wider discussion considering a diverse range of viewpoints from GraphQL users and library authors. There are a lot of organizations that could benefit from a feature like subscriptions in GraphQL, and some of them might have specific needs or use cases that are not yet addressed by the current proposal. That’s why the initial design is a high-level sketch rather than a detailed, line-by-line patch to the specification: if the initial document were too detailed, it would be hard to bring up new ideas. Here are some of the types of people that might especially want to participate in this discussion:

  1. Organizations using GraphQL in production
  2. Open source server and client authors
  3. Organizations that might adopt GraphQL if it gains a subscription feature

Now is the time to chime in and have your voice heard by commenting on the pull request.

Illumination of open questions

As I mentioned, the current proposal doesn’t address every detail. One of the meta questions to answer is what should and should not be specified in this feature, and what specifics are left to be agreed upon. Here are some topics that could benefit from further discussion:

  1. Unsubscribing. Does unsubscribing happen through a GraphQL-specific feature, such as syntax in the language, or is it transport-specific and therefore outside of GraphQL?
  2. Multiple subscriptions in one request. What happens when you have multiple root fields or multiple subscription operations in one request? That would seem to imply subscribing to multiple kinds of events, so it’s not yet clear what should happen. If that is allowed, do they rerun at the same time, or separately? Alternatively, perhaps only one root field or subscription operation should be allowed on a subscription request.
  3. Errors. How are errors sent? The set of possible errors in the system increases when a stateful mechanism is introduced. What happens when the server decides to stop sending new subscription results, or the current user logs out, or one particular subscription execution has a runtime error? I suspect this won’t be very controversial, but it is one of the things that should be nailed down for the spec to be useful.
  4. Transport requirements. What transport features are required to implement this? Can you have a subscription that is one request and one response, or is it critical to have the ability to send multiple responses? Does the transport need an explicit concept of being “connected” or does the supported set of transports include something like a push notification, where you can’t tell if the receiver is currently online?
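To make the first and third questions more tangible, here is one possible answer sketched in JavaScript, where unsubscribing and errors are handled by transport-level messages rather than by GraphQL syntax. Everything in it is an assumption made for illustration: the message types `start`, `stop`, `data`, `error`, and `complete`, and the `createConnection` helper, are invented here and are not what the RFC specifies.

```javascript
// Hypothetical message shapes, invented for this sketch only.
function createConnection(send) {
  const active = new Set(); // ids of subscriptions running on this connection

  return {
    // Handle one message coming from the client.
    receive(message) {
      if (message.type === "start") {
        if (active.has(message.id)) {
          send({ type: "error", id: message.id, message: "duplicate subscription id" });
          return;
        }
        active.add(message.id);
      } else if (message.type === "stop") {
        // Unsubscribing is a transport message here, not GraphQL syntax.
        if (active.delete(message.id)) send({ type: "complete", id: message.id });
      } else {
        send({ type: "error", id: message.id, message: "unknown message type" });
      }
    },

    // Push one result; results for stopped subscriptions are dropped.
    publish(id, data) {
      if (active.has(id)) send({ type: "data", id, payload: { data } });
    },

    // Server-initiated shutdown, for example when the user logs out.
    closeAll() {
      for (const id of active) send({ type: "complete", id });
      active.clear();
    },
  };
}

const sent = [];
const conn = createConnection((message) => sent.push(message));

conn.receive({ type: "start", id: "1" });
conn.publish("1", { commentAdded: { comment: { id: "abc123" } } });
conn.receive({ type: "stop", id: "1" });
conn.publish("1", { commentAdded: { comment: { id: "def456" } } }); // dropped
```

In this sketch the client only ever sees a `data` message followed by a `complete` message for subscription `1`. Whether such messages belong in the specification or are left to each transport is exactly the kind of question the RFC discussion should settle.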

The answers to these questions don’t change the concept of GraphQL subscriptions fundamentally, but discussing these and more like them will get the most value out of the RFC process. For people like myself who work on open source tools for the community, having as much specificity over the details as possible makes it easier to create tools that will be compatible across all implementations.

Experimentation and implementation

Some of the above questions can be answered, and useful feedback gathered, through hands-on experimentation. It’s important to realize that this is a stage in the process where things can still change dramatically, but if you are in a setting where you can iterate quickly as the specification matures, it’s worth setting up an experiment or two to see what this new feature could do for you. That can be a good way to get extremely specific and practical feedback on the proposed design.

Arriving at the final spec update

It’s great that the current proposal includes a lot of information about the concepts involved and isn’t something like a patch on the actual specification text. That makes it much easier and lower-cost for as many people as possible to get involved in the discussion without having to debate specification language.

We hope that what follows is iteration on the proposal document with new questions and answers as the community arrives at a consensus. This should ultimately make it simple to have someone write up a detailed specification and reference implementation from that set of conclusions, with no particular surprises.

We can’t wait to see where this proposal goes from here. This is the first major RFC to the GraphQL specification, and it also introduces the new process for feature additions that the GraphQL core team has been working on. Let’s all put our heads together and make it as successful as possible!

Trying subscriptions in your GraphQL.js app

Here on the Apollo team, we’ve been passionate about realtime data in GraphQL, and subscriptions in particular, for a long time. Consequently, we’ve been working on some experimental implementations for the proposed design, and even testing them out in our production applications. That’s been going really well, and I encourage you to try subscriptions in your own app.

It’s not very difficult to implement your own compliant subscriptions system from scratch, but if you want to get started quickly, we’ve built some packages you can easily drop in if you are running a JavaScript server and client. Please try them and contribute on GitHub!

graphql-subscriptions

The core is this package, which implements all of the necessary state and lifecycle to manage a GraphQL subscription inside your Node.js GraphQL server. It doesn’t implement anything beyond what a future specification would cover, so it’s completely independent of transport, server libraries, client design, etc. It also includes adapters to some popular backend messaging systems, including Redis and MQTT.

subscriptions-transport-ws

This is a websocket transport library that implements a simple client and server, which together manage the lifecycle of subscribing, unsubscribing, authentication, and error handling over the network. You don’t need any additional libraries like Socket.io or similar, since this one package handles everything you need for GraphQL subscriptions. It assumes that you will be sending queries and mutations over the “standard” HTTP transport, and only subscriptions over a websocket.

To see both of these packages in action together, check out this simple file in our GitHunt example app. We also have a live example up! Open this comment feed in multiple tabs and submit a comment, then see it show up on the other screen.

Also check out our docs about subscriptions:

  1. Subscriptions React docs: How to handle subscriptions in your React app
  2. Subscriptions GraphQL Server docs: How to add GraphQL Subscriptions to your Node server

As the RFC for GraphQL subscriptions matures, we’ll keep these packages up to date so they can serve as reference implementations for anyone who wants to try out the newest version of the specification. But be warned — this means changes can happen if the consensus shifts. So if you are using this in production, make sure you are equipped to migrate your application as needed to the newest versions.


For more background about GraphQL subscriptions from the Apollo team, read our previous blog posts.

And of course, if you want to work on GraphQL technology full time, please consider applying for a job at our company.
