September 29, 2016

GraphQL subscriptions with Redis Pub Sub

David Yahalomi

An overall system diagram of GraphQL subscriptions with Redis.

A couple of weeks ago I contacted Jonas Helfer to contribute to the current work being done over on the Apollo project. After a quick talk, we realized their current GraphQL subscriptions implementation could really benefit from making it easy to plug in external Pub Sub systems. For the initial release of the feature, the packages were decoupled enough to allow me to implement just that: Redis as a Pub Sub Engine for GraphQL subscriptions.

Required Reading

To learn how to use the core subscriptions feature, take a look at Amanda Liu’s subscriptions post, “GraphQL Subscriptions in Apollo Client: Experimental web socket system for near-realtime updates” (medium.com).

The package introduced there uses an in-memory event system to re-run subscriptions, so there is no way to share subscriptions and publishes across many running servers. That’s one reason why we needed to add Redis or another external pub/sub system to the mix.

TL;DR

If you already know how to use GraphQL subscriptions and configure a Redis server, and just want to connect those together, all you need to do is install my package:

npm i -S graphql-redis-subscriptions

After that, replace the pubsub field on the SubscriptionManager with the following:

import { SubscriptionManager } from 'graphql-subscriptions';
import { RedisPubSub } from 'graphql-redis-subscriptions';

const pubsub = new RedisPubSub();
const subscriptionManager = new SubscriptionManager({
  schema,
  pubsub,
  setupFunctions: {},
});

To make changes to the default Redis connection options, you can pass in a Redis options object like so:

const pubsub = new RedisPubSub({
  connection: {
    host: REDIS_DOMAIN_NAME,
    port: PORT_NUMBER,
    retry_strategy: options => {
      // reconnect after up to 3000 milliseconds
      return Math.min(options.attempt * 100, 3000);
    }
  }
});

Docs on the different Redis options can be found here.
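To get a feel for the delays such a retry strategy produces, the delay calculation can be pulled out and called on its own. This is a minimal sketch of a capped linear backoff (100 ms per attempt, never more than 3000 ms); `retryDelay` is a name introduced here for illustration, and it assumes the options object carries an `attempt` counter as in the snippet above.

```javascript
// Hypothetical helper: capped linear backoff, 100 ms per attempt, at most 3000 ms.
const retryDelay = (options) => Math.min(options.attempt * 100, 3000);

// Delays for a few reconnection attempts.
const delays = [1, 5, 30, 100].map((attempt) => retryDelay({ attempt }));
console.log(delays); // [ 100, 500, 3000, 3000 ]
```

Using `Math.min` is what keeps the delay capped: early attempts retry quickly, and later ones settle at the 3000 ms ceiling.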

If you’d like more detailed instructions on the process, read on!

Setting up a Redis server for development

There are a couple of ways to set up a Redis server:

After installing and running the server, make sure to run the following command:

redis-cli monitor

This will monitor your server so you can see the subscriptions and publishes arrive on the server.

What makes GraphQL subscriptions special?

Let’s look at a GraphQL subscription request:

subscription monkeysLocation($limit: Int!) {
  flyingMonkeysMoved(limit: $limit) {
    location {
      lat
      lon 
      height 
    }
    movementVector
  }
}

This request contains three parts:

1. Subscription operation name. This can be anything you like, and is mostly helpful for debugging:

subscription monkeysLocation($limit: Int!) {

2. The root subscription field and parameters, often used for filtering publications:

flyingMonkeysMoved(limit: $limit) {

3. The GraphQL selection set, which specifies which fields we want to get in the result:

location {
  lat
  lon 
  height 
}
movementVector

All of this allows the client to get exactly the data that it needs.

That last part of the query is the actual difference between GraphQL subscriptions and the “regular” Pub Sub paradigm. This is also the part of GraphQL subscriptions that can easily become a bottleneck if you don’t pay attention, because GraphQL resolvers can run any async or sync job that you would like them to.

Keep the above point in mind when you write your own subscription query, because running a GraphQL query for each subscriber and each event could put some significant load on the server.
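A rough way to picture that load: the number of query executions is the number of events multiplied by the number of subscribers. The sketch below only counts executions; `runQuery` is a stand-in for real query execution, and the subscriber and event counts are made-up numbers.

```javascript
// Stand-in for executing a subscription query; a real resolver may also do async work.
let executions = 0;
const runQuery = (subscriber, event) => {
  executions += 1;
  return { subscriber, location: event.location };
};

// Hypothetical numbers: 400 subscribers, 10 published events.
const subscribers = Array.from({ length: 400 }, (_, i) => `client-${i}`);
const events = Array.from({ length: 10 }, (_, i) => ({
  location: { lat: i, lon: i, height: 100 },
}));

// Every event triggers one execution per subscriber.
for (const event of events) {
  for (const subscriber of subscribers) {
    runQuery(subscriber, event);
  }
}

console.log(executions); // 4000
```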

Does it scale?

To figure out how to scale subscriptions, I ran some benchmark test cases on SubscriptionManager directly with the goal of moving as much data as I could under a second. The actual numbers are quite impressive, but keep in mind that I ran it on my personal machine and there is no network latency involved.

Disclaimers:

  • All of these tests were run without any tuning of the Redis configuration, so the results should not be taken as proof of the rates Redis can reach. Benchmarks of that kind can be found here.
  • Redis event rates are limited by the amount of data Redis can receive from a single client in a given time. This limitation could possibly be lifted with a different configuration.
  • Benchmark tests are a very basic approximation and should be taken with a grain of salt. Because every system is different, they can’t be used to reliably predict the load a production system could handle.

The Tests

I wanted to check two main factors in my tests, one being the impact of the event size on the PubSub engine, and the second the impact of the query size. The measurement I used is events per second.
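For readers who want to try something similar, here is a small sketch of how an events-per-second figure could be measured for the in-memory case. It is not the benchmark code used for this post: `measureEventsPerSecond` and its parameters are hypothetical, and it only times a plain Node EventEmitter, not Redis or the SubscriptionManager.

```javascript
const { EventEmitter } = require('events');

// Hypothetical harness: emit `count` events of `payloadBytes` each
// and report how many were delivered per second.
function measureEventsPerSecond(count, payloadBytes) {
  const emitter = new EventEmitter();
  let received = 0;
  emitter.on('event', () => {
    received += 1;
  });

  const payload = 'x'.repeat(payloadBytes);
  const start = process.hrtime.bigint();
  for (let i = 0; i < count; i += 1) {
    emitter.emit('event', payload);
  }
  const elapsedNs = Number(process.hrtime.bigint() - start);

  // Guard against a zero reading on very fast runs.
  const seconds = Math.max(elapsedNs, 1) / 1e9;
  return { received, eventsPerSecond: received / seconds };
}

const result = measureEventsPerSecond(10000, 1024);
console.log(result.received, Math.round(result.eventsPerSecond));
```

The absolute number will vary from machine to machine, which is the reason for the disclaimers above.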

Impact of payload size (No query)

You can see in the graph above that the payload size of the event doesn’t affect the throughput of the in-memory implementation with EventEmitters. On the other hand, the Redis tests show a decrease due to the limit of bytes that can be sent in a given time by a single Redis client. The same will apply to any implementations that send around big event payloads.

Impact of query size

On the second graph you see the impact of query size on performance. Apart from the two rather close cases of no query and a one-field query, both engines show a slow decline in the number of events per second they can handle as the query grows. Note that these queries read only from the given event data, and the subscription resolver did not make any async calls.

Bottom Line

We validated that our single-server subscription manager would not scale very well. While we could just add another subscriptions server to our setup without an external PubSub, events would then stay private to the server they were published to. By adding in Redis or any other external PubSub mechanism, you can allow for consistent propagation of published events to their subscribers.
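The difference can be sketched with plain Node EventEmitters. In this illustration two emitters play the role of two servers with private in-memory pub sub, and a third, shared emitter stands in for an external broker such as Redis. The names are made up and no real Redis connection is involved.

```javascript
const { EventEmitter } = require('events');

// Each "server" owns a private in-memory emitter.
const serverA = new EventEmitter();
const serverB = new EventEmitter();

const receivedInMemory = [];
serverB.on('flyingMonkeysMoved', (event) => receivedInMemory.push(event));
// Published on server A, but the subscriber lives on server B.
serverA.emit('flyingMonkeysMoved', { lat: 1, lon: 2 });

// A single shared emitter stands in for an external broker.
const broker = new EventEmitter();
const receivedViaBroker = [];
// Subscribed by server B, published by server A, both through the broker.
broker.on('flyingMonkeysMoved', (event) => receivedViaBroker.push(event));
broker.emit('flyingMonkeysMoved', { lat: 1, lon: 2 });

console.log(receivedInMemory.length, receivedViaBroker.length); // 0 1
```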

Future Optimizations

Right now the Subscriptions Manager runs the GraphQL query for each event and each subscriber. A pretty simple PR could implement shared execution for subscribers with the same query — as long as the response is not user-specific.
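One way such shared execution might look is to group subscribers by query and variables, execute once per group, and fan the result out. This is only a sketch of the idea, not code from graphql-subscriptions; `publishShared`, `execute`, and the subscriber shape are all hypothetical.

```javascript
// Hypothetical sketch: run each distinct (query, variables) pair once per event
// and deliver the result to every subscriber that asked for it.
function publishShared(subscribers, event, execute) {
  const groups = new Map();
  for (const sub of subscribers) {
    const key = JSON.stringify([sub.query, sub.variables]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(sub);
  }
  for (const group of groups.values()) {
    const { query, variables } = group[0];
    const result = execute(query, variables, event); // one execution per group
    group.forEach((sub) => sub.callback(result));
  }
  return groups.size;
}

// Stand-in executor that counts how often it is called.
let executionCount = 0;
const execute = (query, variables, event) => {
  executionCount += 1;
  return { query, limit: variables.limit, event };
};

const delivered = [];
const makeSub = (query, limit) => ({
  query,
  variables: { limit },
  callback: (result) => delivered.push(result),
});
const subs = [makeSub('q1', 5), makeSub('q1', 5), makeSub('q1', 10), makeSub('q2', 5)];

publishShared(subs, { id: 1 }, execute);
console.log(executionCount, delivered.length); // 3 4
```

The grouping key here is a plain JSON string, so variables with the same values in a different key order would land in different groups. As noted above, sharing is only safe when the response does not depend on the individual user.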

The Apollo team is also getting ready to implement GraphQL subscriptions in some of their own production apps, which will provide some more on-the-ground information about the performance of the system.

Show me code!

Check out the full GitHunt example with Redis subscriptions here. This of course has been battle tested using 400 flying monkeys!

“Because nothing proves performance like 400 flying monkeys.”

Updated Package! — Oct 8th, 2016

Recently, the graphql-subscriptions package added a way to pass options to each call of subscribe. Those options are constructed in the setupFunctions object you provide to the SubscriptionManager constructor.

The feature was added to give pub sub engines a way to narrow their subscription set using whatever method suits the engine best.

For example, Meteor’s live query could use a Mongo selector with arguments passed from the subscription, like the subscribed entity id.

For Redis, the approach is simpler but more generic. The convention for Redis subscriptions is to use dot notation to make the channel name more specific. It is only a convention, but here is an example of creating a specific subscription using the channel options feature.

First, create a simple and generic trigger transform:

const triggerTransform = (trigger, channelOptions) =>
  [trigger, ...channelOptions.path].join('.');

Then, pass it to the RedisPubSub constructor.

const pubsub = new RedisPubSub({
  triggerTransform,
});

Lastly, provide a setupFunction for the commentsAdded subscription field. It specifies one trigger called comments.added and passes a channelOptions object that holds the repoName path fragment.

const subscriptionManager = new SubscriptionManager({
  schema,
  setupFunctions: {
    commentsAdded: (options, {repoName}) => ({
      'comments.added': {
        channelOptions: {path: [repoName]},
      },
    }),
  },
  pubsub,
});

When I call subscribe like so:

const query = `
  subscription X($repoName: String!) {
    commentsAdded(repoName: $repoName)
  }
`;
const variables = {repoName: 'graphql-redis-subscriptions'};

subscriptionManager.subscribe({
  query, 
  operationName: 'X', 
  variables, 
  callback
});

The subscription string that Redis will receive will be comments.added.graphql-redis-subscriptions.
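The channel name can be checked by calling the transform directly, outside of any SubscriptionManager. The snippet repeats the trigger transform defined earlier and feeds it the same trigger and path fragment; `channel` is just a local name for the result.

```javascript
// Same transform as above: join the trigger name and the path fragments with dots.
const triggerTransform = (trigger, channelOptions) =>
  [trigger, ...channelOptions.path].join('.');

const channel = triggerTransform('comments.added', {
  path: ['graphql-redis-subscriptions'],
});
console.log(channel); // comments.added.graphql-redis-subscriptions
```

Presumably, whoever publishes the event has to publish to that same full channel name for the subscriber to receive it.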

This subscription string is much more specific, which means filtering is no longer needed in this case. This is one more step towards lifting the load off of the GraphQL API server.

Further Reading

  • You could think about how to redirect a subscription from one GraphQL server to another using the same pattern Omri Klinger suggests for Meteor in his talk.
  • Skim through my package and see how easy it is to implement the PubSub Engine interface. Meteor’s LiveQuery or ZMS Pub Sub engines are around the corner.
  • “@live” annotation is another suggestion for reactivity in GraphQL. Learn more on Sashko Stubailo’s post or in the talk by Lee Byron and Laney Kuenzel.

  • Special thanks to Jonas Helfer for the help on my thinking process for this package.
  • If this package is something you’d like to use in production or make changes to, be sure to star, add an issue, or submit a PR!
