/
Launch Apollo Studio

Deploying with managed federation

Best practices


When rolling out changes to an implementing service, we recommend the following workflow:

  1. Confirm the backward compatibility of each change by running apollo service:check in your CI pipeline.

  2. Merge backward compatible changes that successfully pass schema checks.

  3. Deploy changes to the implementing service in your infrastructure.

  4. Wait until all replicas finish deploying.

  5. Call apollo service:push to update your managed federation configuration:

    ~$ apollo service:push --serviceName="launches" \
                          --serviceURL="http://launches-graphql.svc.cluster.local:4001/" \
                          --endpoint="http://localhost:4001/"
    
    ✔ Loading Apollo Project
    ✔ Uploading service to Apollo
    
    The 'registry' service for the 'space-explorer@current' graph was updated
    
    The gateway for the 'space-explorer@current' graph was updated with a new schema, composed from the updated 'launches' service
    
    id      graph             variant
    ──────  ────────────────  ───────
    az329e  space-explorer    current

What's the difference between the endpoint and serviceURL options?

The endpoint option sets the endpoint where your service's schema is fetched from during composition, whereas serviceURL sets the URL that the gateway communicates with at runtime. This separation is useful because implementing services should not be publicly accessible, so the endpoint might point to a locally running server or a file, whereas the serviceURL should be a URL accessible to the gateway.

Pushing configuration updates safely

Whenever possible, you should update your service configuration in a way that is backward compatible to avoid downtime. The best way to do this is to minimize the number of breaking changes you make to your schemas.

Additionally, call apollo service:push for an implementing service only after all replicas of that service are deployed. This ensures that resolvers are in place for all operations that are executable against your graph, and operations can't attempt to access fields that do not yet exist.

In the rare case where a configuration change is not backward compatible with your gateway's query planner, you should update your service registry before you deploy your implementing services.

Additionally, you should perform configuration updates that affect query planning separately from (and prior to) other changes. This helps avoid a scenario where the query planner generates queries that fail validation in downstream services or violate your resolvers.

Examples of this include:

  • Modifying @key, @requires, or @provides directives
  • Removing a type implementation from an interface

In general, always exercise caution when pushing configuration changes that affect your gateway's query planner, and consider how those changes will affect your other implementing services.

Example scenario

Let's say we define a Channel interface in one service, and we define types that implement Channel in two other services:

# Channel service
interface Channel @key(fields: "id") {
  id: ID!
}

# Web service
type WebChannel implements Channel @key(fields: "id") {
  id: ID!
  webHook: String!
}

# Email service
type EmailChannel implements Channel @key(fields: "id") {
  id: ID!
  emailAddress: String!
}

To safely remove the EmailChannel type from your schema:

  1. Perform a service:push of the email service that removes the EmailChannel type from its schema.
  2. Deploy a new version of the service that removes the EmailChannel type.

The first step causes the query planner to stop sending fragments ...on EmailChannel, which would fail validation if sent to a service that isn't aware of the type.

If the goal is not to remove the type altogether, but to only remove it from the interface, the process is similar. Instead of removing the EmailChannel type altogether, only remove the implements Channel addendum to the type definition. This is because the query planner expands queries to interfaces or unions into fragments on their implementing types.

For example, a query such as...

query FindChannel($id: ID!) {
  channel(id: $id) {
    id
  }
}

...generates two queries, one to each service, like so:

# Generated by the query planner
# To Email service

query {
  _entities(...) {
    ...on EmailChannel {
      id
    }
  }
}

# To Web Service
query {
  _entities(...) {
    ...on WebChannel {
      id
    }
  }
}

Currently, the gateway expands all interfaces into implementing types.

Removing an implementing service

To "de-register" an implementing service with Apollo Studio, call apollo service:delete:

This action cannot be reversed!

apollo service:delete --serviceName=products

The next time it polls, your gateway obtains an updated configuration that reflects the removed service.

Include the --variant option to remove an implementing service from a variant other than the default variant (current):

apollo service:delete --serviceName=products --variant=staging

Using variants to control rollout

With managed federation, you can control which version of your graph a fleet of gateways are using. In the majority of cases, rolling over all of your gateway instances to a new schema version is safe, assuming you've used schema checks to confirm that your changes are backward compatible.

However, changes at the gateway level might involve a variety of different updates, such as migrating entity ownership from one service to another. If your infrastructure requires a more advanced deployment process, you can use graph variants to manage different fleets of gateways running with different configurations.

Example

To configure a canary deployment, you might maintain two production graph variants in Apollo Studio, one named prod and the other named prod-canary. To deploy a change to an implementing service named launches, you might perform the following steps:

  1. Check the changes in launches against both prod and prod-canary:
    apollo service:check --variant=prod --serviceName=launches
    apollo service:check --variant=prod-canary --serviceName=launches
  2. Deploy your changes to the launches service in your production environment, without running apollo service:push.
    • This ensures that your production gateway's configuration is not updated yet.
  3. Update your prod-canary variant's registered schema, by running:
    apollo service:push --variant=prod-canary --serviceName=launches
    • If composition fails due to intermediate changes to the canary graph, the canary gateway's configuration will not be updated.
  4. Wait for health checks to pass against the canary and confirm that operation metrics appear as expected.
  5. After the canary is stable, roll out the changes to your production gateways:
    apollo service:push --variant=prod --serviceName=launches

If you associate metrics with variants as well, you can use Apollo Studio to verify a canary's performance before rolling out changes to the rest of the graph. You can also use variants to support a variety of other advanced deployment workflows, such as blue/green deployments.

Modifying query-planning logic

Treat migrations of your query-planning logic similarly to how you treat database migrations. Carefully consider the effects on downstream services as the query planner changes, and plan for "double reading" as appropriate.

Consider the following example of a Products service and a Reviews service:

# Products Service

type Product @key(fields: "upc") {
  upc: ID!
  nameLowerCase: String!
}

# Review Service

extend type Product @key(fields: "upc") {
  upc: ID! @external
  reviews: [Review]! @requires(fields: "nameLowercase")
  nameLowercase: String! @external
}

Let's say we want to deprecate the nameLowercase field and replace it with the name field, like so:

# Products Service

type Product @key(fields: "upc") {
  upc: ID!
  nameLowerCase: String! @deprecated
  name: String!
}

# Reviews Service

extend type Product @key(fields: "upc") {
  upc: ID! @external
  nameLowercase: String! @external
  name: String! @external
  reviews: [Review]! @requires(fields: "name")
}

To perform this migration in-place:

  1. Modify the Products service to add the new field. (As usual, first deploy all replicas, then use apollo service:push to push the new partial schema.)
  2. Deploy a new version of the Reviews service with a resolver that accepts either nameLowercase or name in the source object.
  3. Modify the Reviews service's schema in the registry so that it @requires(fields: "name").
  4. Deploy a new version of the Reviews service with a resolver that only accepts the name in its source object.

Alternatively, you can perform this operation with an atomic migration at the service level, by modifying the service's URL:

  1. Modify the Products service to add the name field (as usual, first deploy all replicas, then use apollo service:push to push the new partial schema).
  2. Deploy a new set of Reviews replicas to a new URL that reads from name.
  3. Register the Reviews service with the new URL and the partial schema changes above.

With this atomic strategy, the query planner resolves all outstanding requests to the old service that relied on nameLowercase with the old query-planning configuration, which @requires the nameLowercase field. All new requests are made to the new service (due to the change of URL) using the new query-planning configuration, which @requires the name field.

Reliability and security

The gateway's configuration is exposed through Google Cloud Storage (GCS). For each Studio graph API key with a permission level greater than Consumer, Apollo Studio provisions a public file that's accessible via the hash of that key. In the event that your configuration is inaccessible due to an outage in GCS, the gateway continues to serve the last fetched configuration.

If the Apollo Studio API experiences an outage, changes to your configuration will halt until the outage resolves, but your most recently published configuration for each variant is still accessible via GCS.

The service:push lifecycle

Whenever you call apollo service:push for an implementing service, it both updates that service's registered schema and updates the gateway's managed configuration. Because multiple services might be updated simultaneously, it's possible for changes to cause composition errors even if

Because the graph is dynamically changing, it's possible for composition errors to occur for a service:push even after a service:check has succeeded if other federated services changed in the interim. For this reason, updating a federated service will re-trigger composition in the cloud, ensuring that the federated services still compose to form a complete graph before provisioning the configuration. The workflow behind the scenes can be summed up as follows:

  1. The partial schema is uploaded to a secure location and indexed.
  2. The federated service is updated in the registry to point to the partial schema.
  3. All federated services are composed in the cloud to produce a new complete schema.
  4. If composition fails, the command exits and emits errors.
  5. If composition succeeds, the top-level configuration file is updated in-place to point to the updated set of federated services.

On the other side of the equation sits the gateway. The gateway is constantly listening for changes to the top-level configuration file. The location of the configuration file is guarded by using a hash of the API key, provisioned ahead of time so as not to affect reliability. The lifecycle of dynamic configuration updates is as follows:

  1. The gateway listens for updates to its configuration.
  2. On update, the gateway downloads the configuration for each implementing service in parallel.
  3. The gateway performs composition over the configuration to update query planning.
  4. The gateway continues to resolve in-flight requests with the previous configuration, while using the updated configuration for all new requests.
Edit on GitHub