Demand Control
Protect your graph from high-cost GraphQL operations
What is demand control?
Demand control provides a way to secure your supergraph from overly complex operations, based on the IBM GraphQL Cost Directive specification.
Application clients can send overly costly operations that overload your supergraph infrastructure. These operations may be costly due to their complexity and/or their need for expensive resolvers. In either case, demand control can help you protect your infrastructure from these expensive operations. When your router receives a request, it calculates a cost for that operation. If the cost is greater than your configured maximum, the operation is rejected.
Calculating cost
When calculating the cost of an operation, the router sums the costs of the sub-requests that it plans to send to your subgraphs.
For each operation, the cost is the sum of its base cost plus the costs of its fields.
For each field, the cost is defined recursively as its own base cost plus the cost of its selections. In the IBM specification, this is called field cost.
The cost of each operation type:
| | Mutation | Query | Subscription |
|---|---|---|---|
| type | 10 | 0 | 0 |
The cost of each GraphQL element type, per operation type:
| | Mutation | Query | Subscription |
|---|---|---|---|
| Object | 1 | 1 | 1 |
| Interface | 1 | 1 | 1 |
| Union | 1 | 1 | 1 |
| Scalar | 0 | 0 | 0 |
| Enum | 0 | 0 | 0 |
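The recursive rule and the default weights above can be sketched in a few lines of Python. This is only an illustrative model, not the router's implementation: the `Field` class and the function names are hypothetical, and the example tree stands for a query that selects a book object with nested author, publisher, and address objects.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Default base cost per operation type (first table above).
OPERATION_COST = {"query": 0, "mutation": 10, "subscription": 0}

# Default cost per GraphQL element kind (second table above).
KIND_COST = {"object": 1, "interface": 1, "union": 1, "scalar": 0, "enum": 0}


@dataclass
class Field:
    """Hypothetical stand-in for a selected field in an operation."""
    name: str
    kind: str  # one of the keys of KIND_COST
    selections: list[Field] = field(default_factory=list)


def field_cost(f: Field) -> int:
    # A field's cost is its own base cost plus the cost of its selections.
    return KIND_COST[f.kind] + sum(field_cost(s) for s in f.selections)


def operation_cost(op_type: str, selections: list[Field]) -> int:
    # An operation's cost is its base cost plus the costs of its fields.
    return OPERATION_COST[op_type] + sum(field_cost(s) for s in selections)


book = Field("book", "object", [
    Field("title", "scalar"),
    Field("author", "object", [Field("name", "scalar")]),
    Field("publisher", "object", [
        Field("name", "scalar"),
        Field("address", "object", [Field("zipCode", "scalar")]),
    ]),
])

# Query: 0 + book (1) + author (1) + publisher (1) + address (1)
query_cost = operation_cost("query", [book])
# Hypothetical mutation with the same selections: 10 + 4
mutation_cost = operation_cost("mutation", [book])
```

Scalars contribute nothing under the defaults, so only the four object fields add to the total.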
Using these defaults, the following operation would have a cost of 4.
```graphql
query BookQuery {
  book(id: 1) {
    title
    author {
      name
    }
    publisher {
      name
      address {
        zipCode
      }
    }
  }
}
```

Example query's cost calculation

```text
1 Query (0) + 1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (1) = 4 total cost
```

Customizing cost
Since version 1.53.0, the router supports customizing the cost calculation with the @cost directive. The @cost directive has a single argument, weight, which overrides the default weights from the table above.
The @cost directive differs from the IBM specification in that the weight argument is of type Int! instead of String!.

Annotating your schema with the @cost directive customizes how the router scores operations. For example, imagine that the Address resolver for an example query is particularly expensive. We can annotate the schema with the @cost directive with a larger weight:

```graphql
type Query {
  book(id: ID): Book
}

type Book {
  title: String
  author: Author
  publisher: Publisher
}

type Author {
  name: String
}

type Publisher {
  name: String
  address: Address
}

type Address
  @cost(weight: 5) {
  zipCode: Int!
}
```

This increases the cost of BookQuery from 4 to 8.
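As a rough illustration of how a weight override changes the total, the arithmetic can be written out in Python. The names below are hypothetical and the model is deliberately simplified: it only sums the object types that BookQuery selects, since the query itself and all scalar fields cost 0 under the defaults.

```python
# Default weight for object types (from the defaults table).
DEFAULT_OBJECT_WEIGHT = 1

# Mirrors @cost(weight: 5) on the Address type in the schema above.
cost_overrides = {"Address": 5}

# One entry per object field selected by BookQuery.
selected_object_types = ["Book", "Author", "Publisher", "Address"]


def weight_of(type_name: str, overrides: dict) -> int:
    # An override, when present, replaces the default weight.
    return overrides.get(type_name, DEFAULT_OBJECT_WEIGHT)


cost_without_directive = sum(weight_of(t, {}) for t in selected_object_types)
cost_with_directive = sum(weight_of(t, cost_overrides) for t in selected_object_types)
```

The only term that changes is the Address object, which goes from 1 to 5.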
Example query's updated cost calculation

```text
1 Query (0) + 1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (5) = 8 total cost
```

Handling list fields
During the static analysis phase of demand control, the router doesn't know the size of the list fields in a given query. It must use estimates for list sizes. The closer the estimated list size is to the actual list size for a field, the closer the estimated cost will be to the actual cost.
There are two ways to indicate the expected list sizes to the router:
- Set the global maximum in your router configuration file (see Configuring demand control).
- Use the Apollo Federation @listSize directive.
The @listSize directive supports field-level granularity in setting list size. By using its assumedSize argument, you can set a statically defined list size for a field. If you are using paging parameters which control the size of the list, use the slicingArguments argument.
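The following Python sketch models how a list-size estimate might feed into the cost of a list field. It is an assumption-laden illustration rather than the router's algorithm: the function names are hypothetical, and the precedence used here (a supplied slicing argument first, then assumedSize, then the global list_size) is an assumption. The per-item cost of 8 corresponds to a Book selection with the weighted Address type from the earlier example.

```python
from typing import Optional


def estimated_list_size(
    field_args: dict,
    global_list_size: int,
    assumed_size: Optional[int] = None,
    slicing_arguments: Optional[list] = None,
) -> int:
    """Pick the list size used for estimation (precedence is an assumption)."""
    if slicing_arguments:
        # Use the paging arguments actually supplied in the operation.
        sizes = [field_args[name] for name in slicing_arguments if name in field_args]
        if sizes:
            return max(sizes)
    if assumed_size is not None:
        return assumed_size
    # Fall back to the global list_size from the router configuration.
    return global_list_size


def list_field_cost(list_size: int, item_cost: int) -> int:
    # Each list item is charged the full cost of its selections.
    return list_size * item_cost


ITEM_COST = 8          # book (1) + author (1) + publisher (1) + address (5)
GLOBAL_LIST_SIZE = 10  # hypothetical global setting

fixed_size_cost = list_field_cost(
    estimated_list_size({}, GLOBAL_LIST_SIZE, assumed_size=5), ITEM_COST)
paged_cost_3 = list_field_cost(
    estimated_list_size({"limit": 3}, GLOBAL_LIST_SIZE, slicing_arguments=["limit"]), ITEM_COST)
paged_cost_7 = list_field_cost(
    estimated_list_size({"limit": 7}, GLOBAL_LIST_SIZE, slicing_arguments=["limit"]), ITEM_COST)
fallback_cost = list_field_cost(estimated_list_size({}, GLOBAL_LIST_SIZE), ITEM_COST)
```

Under these assumptions, a list field with no @listSize annotation is charged for the global list size, which is why an accurate global value matters.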
Continuing with our example above, let's add two queryable fields. First, we will add a field which returns the top five best selling books:
```graphql
type Query {
  book(id: ID): Book
  bestsellers: [Book] @listSize(assumedSize: 5)
}
```

With this schema, the following query has a cost of 40:
```graphql
query BestsellersQuery {
  bestsellers {
    title
    author {
      name
    }
    publisher {
      name
      address {
        zipCode
      }
    }
  }
}
```

Cost of bestsellers query

```text
1 Query (0) + 5 book objects (5 * (1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (5))) = 40 total cost
```

The second field we will add is a paginated resolver. It returns the latest additions to the inventory:
```graphql
type Query {
  book(id: ID): Book
  bestsellers: [Book] @listSize(assumedSize: 5)
  newestAdditions(after: ID, limit: Int!): [Book]
    @listSize(slicingArguments: ["limit"])
}
```

The number of books returned by this resolver is determined by the limit argument.
```graphql
query NewestAdditions {
  newestAdditions(limit: 3) {
    title
    author {
      name
    }
    publisher {
      name
      address {
        zipCode
      }
    }
  }
}
```

The router estimates the cost of this query as 24. If the limit were increased to 7, the cost would increase to 56.
When requesting 3 books:

```text
1 Query (0) + 3 book objects (3 * (1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (5))) = 24 total cost
```

When requesting 7 books:

```text
1 Query (0) + 7 book objects (7 * (1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (5))) = 56 total cost
```

Configuring demand control
To enable demand control in the router, configure the demand_control option in router.yaml:
```yaml
demand_control:
  enabled: true
  mode: measure
  strategy:
    static_estimated:
      list_size: 10
      max: 1000
```

When demand_control is enabled, the router measures the cost of each operation and can enforce operation cost limits, based on additional configuration.
Customize demand_control with the following settings:
| Option | Valid values | Default value | Description |
|---|---|---|---|
| enabled | boolean | false | Set to true to measure operation costs or enforce operation cost limits. |
| mode | measure, enforce | -- | measure collects information about the cost of operations; enforce rejects operations exceeding configured cost limits. |
| strategy | static_estimated | -- | static_estimated estimates the cost of an operation before it is sent to a subgraph. |
| static_estimated.list_size | integer | -- | The assumed maximum size of a list for fields that return lists. |
| static_estimated.max | integer | -- | The maximum cost of an accepted operation. An operation with a higher cost than this is rejected. |
When enabling demand_control for the first time, set it to measure mode. This will allow you to observe the cost of your operations before setting your maximum cost.
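The difference between the two modes can be modeled with a short Python sketch. This is a simplified illustration under stated assumptions, not the router's code: the `evaluate` function is hypothetical, and it assumes that measure mode still records a COST_ESTIMATED_TOO_EXPENSIVE result without rejecting the operation.

```python
def evaluate(estimated_cost: float, max_cost: float, mode: str) -> tuple:
    """Return (cost result, whether the operation is rejected)."""
    # Only a cost strictly higher than the maximum counts as too expensive.
    if estimated_cost > max_cost:
        result = "COST_ESTIMATED_TOO_EXPENSIVE"
    else:
        result = "COST_OK"
    # Assumption: only enforce mode turns a non-OK result into a rejection.
    rejected = mode == "enforce" and result != "COST_OK"
    return result, rejected


within_limit = evaluate(24, 1000, "enforce")
over_limit_measured = evaluate(1200, 1000, "measure")
over_limit_enforced = evaluate(1200, 1000, "enforce")
```

Running in measure mode first lets you see how many operations would fall into the rejected group before any client is actually affected.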
Telemetry for demand control
You can define router telemetry to gather cost information and gain insights into the cost of operations sent to your router:
- Generate histograms of operation costs by operation name, where the estimated cost is greater than an arbitrary value.
- Attach cost information to spans.
- Generate log messages whenever the cost delta between estimated and actual is greater than an arbitrary value.
Instruments
| Instrument | Description |
|---|---|
| cost.actual | The actual cost of an operation, measured after execution. |
| cost.estimated | The estimated cost of an operation before execution. |
| cost.delta | The difference between the actual and estimated cost. |
Attributes
Attributes for cost can be applied to instruments, spans, and events—anywhere supergraph attributes are used.
| Attribute | Value | Description |
|---|---|---|
| cost.actual | boolean | The actual cost of an operation, measured after execution. |
| cost.estimated | boolean | The estimated cost of an operation before execution. |
| cost.delta | boolean | The difference between the actual and estimated cost. |
| cost.result | boolean | The return code of the cost calculation: COST_OK or an error code. |
Selectors
Selectors for cost can be applied to instruments, spans, and events—anywhere supergraph attributes are used.
| Key | Value | Default | Description |
|---|---|---|---|
| cost | estimated, actual, delta, result | | The estimated, actual, or delta cost values, or the result string. |
Examples
Example instrument
Enable a cost.estimated instrument with the cost.result attribute:
```yaml
telemetry:
  instrumentation:
    instruments:
      supergraph:
        cost.estimated:
          attributes:
            cost.result: true
            graphql.operation.name: true
```

Example span
Enable the cost.estimated attribute on supergraph spans:
```yaml
telemetry:
  instrumentation:
    spans:
      supergraph:
        attributes:
          cost.estimated: true
```

Example event
Log an error when cost.delta is greater than 1000:
```yaml
telemetry:
  instrumentation:
    events:
      supergraph:
        COST_DELTA_TOO_HIGH:
          message: "cost delta high"
          on: event_response
          level: error
          condition:
            gt:
              - cost: delta
              - 1000
          attributes:
            graphql.operation.name: true
            cost.delta: true
```

Filtering by cost result
In router telemetry, you can customize instruments that filter their output based on cost results.
For example, you can record the estimated cost when cost.result is COST_ESTIMATED_TOO_EXPENSIVE:
```yaml
telemetry:
  instrumentation:
    instruments:
      supergraph:
        # custom instrument
        cost.rejected.operations:
          type: histogram
          value:
            # Estimated cost is used to populate the histogram
            cost: estimated
          description: "Estimated cost per rejected operation."
          unit: delta
          condition:
            eq:
              # Only show rejected operations.
              - cost: result
              - "COST_ESTIMATED_TOO_EXPENSIVE"
          attributes:
            graphql.operation.name: true # GraphQL operation name is added as an attribute
```

Configuring instrument output
When analyzing the costs of operations, if your histograms are not granular enough or don't cover a sufficient range, you can modify the views in your telemetry configuration:
```yaml
telemetry:
  exporters:
    metrics:
      common:
        views:
          # Define a custom view because cost is different than the default latency-oriented view of OpenTelemetry
          - name: cost.*
            aggregation:
              histogram:
                buckets:
                  - 0
                  - 10
                  - 100
                  - 1000
                  - 10000
                  - 100000
                  - 1000000
```

Example histogram of operation costs from a Prometheus endpoint
```text
# TYPE cost_actual histogram
cost_actual_bucket{otel_scope_name="apollo/router",le="0"} 0
cost_actual_bucket{otel_scope_name="apollo/router",le="10"} 3
cost_actual_bucket{otel_scope_name="apollo/router",le="100"} 5
cost_actual_bucket{otel_scope_name="apollo/router",le="1000"} 11
cost_actual_bucket{otel_scope_name="apollo/router",le="10000"} 19
cost_actual_bucket{otel_scope_name="apollo/router",le="100000"} 20
cost_actual_bucket{otel_scope_name="apollo/router",le="1000000"} 20
cost_actual_bucket{otel_scope_name="apollo/router",le="+Inf"} 20
cost_actual_sum{otel_scope_name="apollo/router"} 1097
cost_actual_count{otel_scope_name="apollo/router"} 20
# TYPE cost_delta histogram
cost_delta_bucket{otel_scope_name="apollo/router",le="0"} 0
cost_delta_bucket{otel_scope_name="apollo/router",le="10"} 2
cost_delta_bucket{otel_scope_name="apollo/router",le="100"} 9
cost_delta_bucket{otel_scope_name="apollo/router",le="1000"} 7
cost_delta_bucket{otel_scope_name="apollo/router",le="10000"} 19
cost_delta_bucket{otel_scope_name="apollo/router",le="100000"} 20
cost_delta_bucket{otel_scope_name="apollo/router",le="1000000"} 20
cost_delta_bucket{otel_scope_name="apollo/router",le="+Inf"} 20
cost_delta_sum{otel_scope_name="apollo/router"} 21934
cost_delta_count{otel_scope_name="apollo/router"} 1
# TYPE cost_estimated histogram
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="0"} 0
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="10"} 5
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="100"} 5
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="1000"} 9
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="10000"} 11
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="100000"} 20
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="1000000"} 20
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="+Inf"} 20
cost_estimated_sum{cost_result="COST_OK",otel_scope_name="apollo/router"}
cost_estimated_count{cost_result="COST_OK",otel_scope_name="apollo/router"} 20
```

You can chart these histograms in your metrics tooling. You can also chart the percentage of operations that would be allowed or rejected with the current configuration.
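When reading bucket values from a Prometheus endpoint, it can help to remember that Prometheus histogram buckets are cumulative: each `le` bucket counts every observation less than or equal to its boundary. The Python sketch below illustrates that counting rule with the bucket boundaries from the view configuration; the function name and the sample costs are hypothetical.

```python
from bisect import bisect_left

# Bucket boundaries from the custom view configuration.
BUCKETS = [0, 10, 100, 1000, 10000, 100000, 1000000]


def cumulative_bucket_counts(costs, boundaries):
    """Count observations per Prometheus-style cumulative `le` bucket."""
    # per_bucket[i] holds values whose smallest fitting boundary is
    # boundaries[i]; the extra last slot holds values above every boundary.
    per_bucket = [0] * (len(boundaries) + 1)
    for cost in costs:
        per_bucket[bisect_left(boundaries, cost)] += 1

    counts = {}
    running = 0
    for boundary, n in zip(boundaries, per_bucket):
        running += n
        counts[str(boundary)] = running
    # The +Inf bucket always equals the total number of observations.
    counts["+Inf"] = running + per_bucket[-1]
    return counts


# Hypothetical operation costs, for illustration only.
sample_costs = [4, 8, 24, 40, 56, 1200]
counts = cumulative_bucket_counts(sample_costs, BUCKETS)
```

With these sample values, the two costs at or below 10 are also included in every larger bucket, so choosing boundaries close to your expected cost range gives the most useful resolution.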
Accessing programmatically
You can programmatically access demand control cost data using Rhai scripts or Coprocessors. This can be useful for custom logging, decision making, or exposing cost data to clients.
Exposing cost in response headers
You can expose cost information in the HTTP response headers returned to clients, which can be useful for debugging. To do so, use a Rhai script on the supergraph_service hook:
```rhai
fn supergraph_service(service) {
  service.map_response(|response| {
    if response.is_primary() {
      try {
        // Get cost estimation values from context
        let estimated_cost = response.context[Router.APOLLO_COST_ESTIMATED_KEY];
        let actual_cost = response.context[Router.APOLLO_COST_ACTUAL_KEY];
        let strategy = response.context[Router.APOLLO_COST_STRATEGY_KEY];
        let result = response.context[Router.APOLLO_COST_RESULT_KEY];

        // Add them as response headers
        if estimated_cost != () {
          response.headers["apollo-cost-estimate"] = estimated_cost.to_string();
        }

        if actual_cost != () {
          response.headers["apollo-cost-actual"] = actual_cost.to_string();
        }

        if strategy != () {
          response.headers["apollo-cost-strategy"] = strategy.to_string();
        }

        if result != () {
          response.headers["apollo-cost-result"] = result.to_string();
        }
      } catch(err) {
        log_debug(`Could not add cost headers: ${err}`);
      }
    }
  });
}
```
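On the client side, the headers set by this script could be read back with something like the following Python sketch. It works on a plain dictionary of headers and makes no network calls. The `parse_cost_headers` function is hypothetical, and treating the two cost values as numbers that parse with `float` is an assumption about how the router serializes them.

```python
# Header names set by the Rhai script above, mapped to friendlier keys.
COST_HEADERS = {
    "apollo-cost-estimate": "estimated",
    "apollo-cost-actual": "actual",
    "apollo-cost-strategy": "strategy",
    "apollo-cost-result": "result",
}

NUMERIC_KEYS = {"estimated", "actual"}


def parse_cost_headers(headers: dict) -> dict:
    """Extract whichever cost headers are present from a response."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    info = {}
    for header, key in COST_HEADERS.items():
        if header in lowered:
            raw = lowered[header]
            # Assumption: cost values are serialized as plain numbers.
            info[key] = float(raw) if key in NUMERIC_KEYS else raw
    return info


# Hypothetical response headers, for illustration only.
example = parse_cost_headers({
    "Content-Type": "application/json",
    "Apollo-Cost-Estimate": "24",
    "apollo-cost-result": "COST_OK",
})
```

Headers that the script did not set are simply absent from the result, which matches the script's behavior of skipping values missing from the context.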