Why and How to secure your GraphQL server

GraphQL is a powerful tool which can improve the development experience and performance of your frontends. GraphQL is a query language that allows clients to request exactly the data they need, in the shape they need it. Additionally, GraphQL has a typed schema which can be used with different tools (such as graphql-codegen) to improve code quality and catch type errors at compile time.

However, the ability to query data in any shape you need also opens many doors for attackers to abuse it. In this post, we will focus on some ways to abuse a public-facing GraphQL endpoint and on possible ways to solve these security issues.

GraphQL Denial of Service Attack (DoS)

With traditional REST endpoints, DoS attacks are mitigated by tracking the number of requests made from an IP address and rate-limiting the incoming requests for that IP. However, since GraphQL lets clients construct any query they want, rate limiting alone is no longer sufficient. An attacker can construct a single heavy GraphQL query and send it as one request to the server, bypassing the rate limiter. A potential DoS query could look something like this:

query Dos {
    user {
        friends {
            user {
                friends {
                    user {
                        friends {
                            user {
                                friends {
                                    # continue nesting...
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

Another option is to use aliasing:

{
    a: user
    b: user
    c: user
    # ...
    aa: user
    # ...
    zzzzzzzzzzzz: user
}

This problem becomes even more serious if the N + 1 problem is not handled correctly on the server, because every nested level or alias can then trigger its own round of database queries.

Solution

There are many solutions to this kind of problem, mostly focused on analyzing the incoming operations:

Depth limiting: how deep can the query become? With this approach, the incoming operation is parsed and analyzed for depth, and blocked if it exceeds a certain threshold. However, parsing a query can already be a resource-heavy operation, especially when the query itself is really large. Additionally, some use cases simply require a deeply nested query, which makes a global depth limit impractical.
Max tokens: how many tokens (such as { or user) can a query consist of? This is a better approach than the above since it does not require parsing the query, only counting the tokens in the raw query string. However, as with depth limiting, what is a good global token limit?
Operation complexity: this requires parsing the query and estimating the potential cost of executing it. However, this is often hard to do right and the estimate can be far off. Additionally, you need to parse the query and set some global limit, which is not always practical or performant.
Some additional solutions can be found here: https://escape.tech/graphql-armor/docs/category/plugins
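
As a rough illustration of the first two checks, the sketch below measures the token count and the brace depth of a raw query string in a single pass, without building an AST. The names measureQuery, QueryStats and exceedsLimits, as well as the default limits, are assumptions made for this example. The regular expression is a simplified lexer, not a spec-compliant one, and braces of input objects in arguments are counted as depth too, so the depth is an upper bound.

```typescript
// Hypothetical helper: rough token count and brace depth of a raw GraphQL query.
interface QueryStats {
  tokens: number;
  maxDepth: number;
}

function measureQuery(source: string): QueryStats {
  // Simplified lexer: comments, block strings, strings, spread, names, numbers, punctuators.
  const tokenPattern =
    /#[^\n\r]*|"""[\s\S]*?"""|"(?:\\.|[^"\\\n])*"|\.\.\.|[_A-Za-z][_0-9A-Za-z]*|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?|[!$&()\[\]{}:=@|]/g;
  let tokens = 0;
  let depth = 0;
  let maxDepth = 0;
  let match: RegExpExecArray | null;
  while ((match = tokenPattern.exec(source)) !== null) {
    const text = match[0];
    if (text.charAt(0) === '#') continue; // comments are not counted as tokens
    tokens += 1;
    if (text === '{') {
      depth += 1;
      maxDepth = Math.max(maxDepth, depth);
    } else if (text === '}') {
      depth -= 1;
    }
  }
  return { tokens, maxDepth };
}

// Assumed limits for illustration; real values depend on your schema and clients.
function exceedsLimits(source: string, maxTokens = 1000, maxAllowedDepth = 8): boolean {
  const stats = measureQuery(source);
  return stats.tokens > maxTokens || stats.maxDepth > maxAllowedDepth;
}
```

A request handler could call exceedsLimits on the raw query string before handing it to the GraphQL parser and reject the request early when it returns true.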

Most of these solutions focus on restricting how someone can traverse the GraphQL schema. But in my opinion, in most cases, you should not let strangers traverse the graph at all. You should define your queries beforehand and allow only those to be executed. This approach is called whitelisting. However, whitelisting has its drawbacks, especially for the development experience. Later in the article, we will discuss how to create a DX-friendly whitelist solution using cryptography.
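
In its simplest form, a whitelist is just a lookup table on the server. The sketch below is a minimal illustration under that assumption; the allowedOperations table, its two entries and the resolveAllowedOperation function are hypothetical names, not part of any library.

```typescript
// Hypothetical whitelist: operation name -> the only query text the server will run for it.
const allowedOperations: Record<string, string> = {
  GetMe: 'query GetMe { me { id name } }',
  GetMyFriends: 'query GetMyFriends { me { friends { id name } } }',
};

// Returns the stored query for a known operation, or undefined for anything else.
function resolveAllowedOperation(operationName: string): string | undefined {
  // hasOwnProperty guards against inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(allowedOperations, operationName)) {
    return undefined;
  }
  return allowedOperations[operationName];
}
```

With this setup the client only sends an operation name (plus variables), and the server executes the stored query text instead of anything supplied by the client.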

Authentication & Authorization Vulnerabilities

Another security concern with GraphQL is authentication and authorization. Since we allow clients to traverse the graph in any way they like, can we be sure that every path is correctly covered by the authentication and authorization logic?

An example of a well-known attack is a traversal attack:

query GetMe {
    me {
        id
        name
        friends {
            id
            name
            friends {
                id
                name
            }
        }
    }
}

Assuming that the business rule states that I can only see my own friends and not the friends of my friends, this query can and possibly will leak information. We can even go n levels deep to try to fetch all friends of friends of friends of...

Solution

Using operation field permissions: a way to specify a required permission per type (in our example, me and friends). However, this still leaves a backdoor open for traversal attacks.
Per-type authentication and authorization: define, for every type and every field, who can see what in a given context. This feature is baked into Hasura's permission system, which is great, but it can become hard to implement when you create your own GraphQL server or use other solutions.
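
To make the per-type check concrete, here is a minimal sketch of a resolver for the friends field that enforces the business rule from the example above. The User and RequestContext types, the in-memory data and the resolveFriends function are assumptions made for illustration; a real server would read the viewer from its own authentication layer.

```typescript
interface User {
  id: string;
  name: string;
  friendIds: string[];
}

interface RequestContext {
  viewerId: string | null; // null when the request is not authenticated
}

// Hypothetical in-memory data standing in for a database.
const alice: User = { id: 'alice', name: 'Alice', friendIds: ['bob'] };
const bob: User = { id: 'bob', name: 'Bob', friendIds: ['alice', 'carol'] };
const carol: User = { id: 'carol', name: 'Carol', friendIds: ['bob'] };
const usersById: Record<string, User> = { alice, bob, carol };

// Resolver for User.friends: only the viewer's own friends list is readable.
function resolveFriends(parent: User, context: RequestContext): User[] {
  if (context.viewerId === null) {
    throw new Error('Not authenticated');
  }
  if (parent.id !== context.viewerId) {
    throw new Error('Not authorized to read the friends of another user');
  }
  const friends: User[] = [];
  for (const id of parent.friendIds) {
    const friend = usersById[id];
    if (friend !== undefined) friends.push(friend);
  }
  return friends;
}
```

Because the check lives on the field itself, it holds no matter which path a query takes to reach a User, which is what closes the traversal backdoor.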

My question is, again: should we allow users to traverse the graph any way they want? In most cases the answer is no, so using a whitelist of all possible queries removes much of the attack surface for this kind of problem.

Exposing your GraphQL schema

GraphQL allows a client to query information about a schema. This is done by using an introspection query. Having the ability to send an introspection query allows many tools to generate type-safe clients and do some type-checking. However, allowing anyone to execute an introspection query can be a problem: it lets anyone see your whole schema and search it for weaknesses.

Solution

Disabling the introspection query. However, this is not always sufficient. By default, GraphQL server implementations such as graphql-js suggest fields and operations: if you query a field but make a typo, the server will suggest fields that are similar to your attempt, which leaks parts of the schema. It is important to disable this feature as well.
Not allowing users to write their own queries, in other words, using whitelists.
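
One way to hide suggestions is to strip them from error messages before they leave the server. The sketch below assumes that the hints follow the "Did you mean ...?" wording used by graphql-js; maskSuggestions is a hypothetical helper, not a library function.

```typescript
// Hypothetical helper: removes "Did you mean ...?" hints from an error message.
function maskSuggestions(message: string): string {
  return message.replace(/\s*Did you mean [^?]*\?/g, '');
}

// Applies the mask to every error in a GraphQL response payload.
function maskErrors(errors: { message: string }[]): { message: string }[] {
  return errors.map((error) => ({ message: maskSuggestions(error.message) }));
}
```

This only hides the hint; the rest of the error message is still returned, so it complements disabling introspection rather than replacing it.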

Why is whitelisting great but hard

As discussed above, whitelisting is a good way to improve the security of a GraphQL server. However, it comes at a cost, especially during development. Many current solutions keep one or more whitelists on a central server, which means that the server's maintainers and developers control and manage those lists. A premise of GraphQL, however, is that anyone, especially a front-end developer, can traverse the graph however they want. With a central whitelist, they first need approval in the form of a new or updated entry in the list, and only then can they run the query. This can be sufficient for small teams (1-3 developers) but becomes unfriendly in a large team. Another approach is to use CI/CD to gather all the queries from the client and update the central whitelist with them. However, this can become a DevOps problem that requires additional support and infrastructure.

In my opinion, there is a better approach: cryptography. Assuming that the consumers of the GraphQL endpoint are almost always front-end clients with a build step, such as Next.js, React Native, Flutter, or native iOS, we can gather all the operations during the build and generate a stable signature for each one using a secret. At runtime, the client sends the signature along with the query (for example, in a header). On the server, we recompute the signature of the incoming query with the same secret, compare it with the signature that was sent, and allow or reject the request. This is similar to how Stripe webhooks are validated: the incoming request is checked against the signature provided by Stripe to ensure that it originates from Stripe (https://stripe.com/docs/webhooks/signatures).
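
The build-time half of this idea could be sketched as follows. This is an illustration under stated assumptions, not the plugin's actual implementation: normalizeQuery is a naive stand-in for a real stable printer, and the function names and the name-to-signature output shape (chosen to match how the client example indexes operations.json) are hypothetical.

```typescript
import { createHmac } from 'node:crypto';

// Naive stand-in for a stable printer: collapses runs of whitespace.
// Assumption for this sketch only; it would also alter whitespace inside string literals.
function normalizeQuery(query: string): string {
  return query.replace(/\s+/g, ' ').trim();
}

// HMAC-SHA256 over the normalized query text, hex encoded.
function signOperation(query: string, secret: string): string {
  return createHmac('sha256', secret).update(normalizeQuery(query)).digest('hex');
}

// Build step: map every operation name to its signature.
function buildOperationHashes(
  operations: Record<string, string>,
  secret: string,
): Record<string, string> {
  const hashes: Record<string, string> = {};
  for (const name of Object.keys(operations)) {
    const query = operations[name];
    if (query !== undefined) hashes[name] = signOperation(query, secret);
  }
  return hashes;
}
```

The resulting object can be written to a JSON file during the build. Only the signatures end up in the client bundle; the secret stays in the build environment and on the server.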

An example using Apollo Client on the frontend and a Node.js server on the backend:

Client:

import { ApolloClient, createHttpLink, InMemoryCache } from '@apollo/client';
import { setContext } from '@apollo/client/link/context';
// This is generated
import OperationHashes from '@/__generated__/operations.json';

const httpLink = createHttpLink({
  uri: '/graphql',
});

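const link = setContext(({ operationName }, { headers }) => {
  // look up the signature generated at build time for this operation
  const hash = OperationHashes[operationName];
  if (!hash) {
    throw new Error(`No signature found for operation "${operationName}"`);
  }
  // return the headers to the context so httpLink can read them
  return {
    headers: {
      ...headers,
      'x-operation-hash': hash,
    },
  };
});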

const client = new ApolloClient({
  link: link.concat(httpLink),
  cache: new InMemoryCache(),
});

And on the server:

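import { createHmac, timingSafeEqual } from 'node:crypto';
import { printExecutableGraphQLDocument } from '@graphql-tools/documents';
import { parse } from 'graphql';

// inside your HTTP request handler, where req is the incoming request
const hashHeader = req.headers['x-operation-hash'];
const query = req.body.query;
// using printExecutableGraphQLDocument from @graphql-tools/documents ensures we have a stable query string
const stableQuery = printExecutableGraphQLDocument(parse(query));
const expectedHash = createHmac('sha256', process.env.SIGNING_SECRET).update(stableQuery).digest('hex');

// compare in constant time so the check does not leak how much of the signature matched
const expected = Buffer.from(expectedHash);
const received = Buffer.from(typeof hashHeader === 'string' ? hashHeader : '');
if (expected.length !== received.length || !timingSafeEqual(expected, received)) {
  // reject the request
}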

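It is important that stringifying and signing the query is stable, so that the build and the server always produce the same signature for the same operation. It is equally important that the secret is never exposed to the public (for example, by shipping it in the client bundle); otherwise, anyone could sign their own queries.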

Another solution is to use a BFF pattern, as described in my previous blog post: https://ilijanl.hashnode.dev/how-to-secure-and-make-graphql-blazingly-fast. However, this requires additional infrastructure.

I wrote a simple plugin for graphql-codegen which generates signatures of all the operations in a stable way: https://github.com/ilijaNL/graphql-codegen-signed-operation

Check out the example integration: https://github.com/ilijaNL/graphql-codegen-signed-operation/tree/main/example

Conclusion

In this article, we discussed several GraphQL security issues and possible solutions. Furthermore, we discussed why whitelisting is a good solution to improve security and how to make it more developer-friendly using cryptography.

Additional resources to read about GraphQL security and solutions:
https://wundergraph.com/blog/the_complete_graphql_security_guide_fixing_the_13_most_common_graphql_vulnerabilities_to_make_your_api_production_ready

https://blog.yeswehack.com/yeswerhackers/how-exploit-graphql-endpoint-bug-bounty/

https://www.apollographql.com/blog/graphql/security/9-ways-to-secure-your-graphql-api-security-checklist/