What is GraphQL?
GraphQL is a query language and specification for APIs, created by Facebook, that serves as an efficient, declarative, and flexible alternative to REST. Let’s take a look at a few pros and cons.
GraphQL vs. REST: What’s the difference?
In comparison to GraphQL, a REST API has a set of endpoints. Each endpoint for the given resource corresponds to a CRUD action (Create, Read, Update, Delete). For example, given a resource of books, our resourceful routing would include the following endpoints:
- GET /books – list all books (Read)
- GET /books/:id – show a specific book by id (Read)
- POST /books – create a new book (Create)
- PUT/PATCH /books/:id – update an existing book by id (Update)
- DELETE /books/:id – delete an existing book by id (Delete)
What are the Shortcomings of REST?
Read endpoints in a REST API are structured to return a specific resource. For example, let’s take a page that displays details for a given book. Let’s say all we want to display on the page is the book’s title and author. The first step is to GET a book from /books/:id. Because we are strictly following the REST API spec, this will return the entire book resource. This leads to several issues.
Over fetching
Our request returned significantly more data than just the book title. It also returned a summary, genre, tags, and a bunch of other data. We’ve fetched more data than we need, which means unnecessary data was transferred over the wire. This is inefficient and less than ideal.
Under fetching
Because authors are a separate resource from books, the book request only returns an author_id. But we want the author’s name. One way to accomplish this would be to make a subsequent request to /authors/:id for the author data. Now we’ve made 2 requests for the one page. On top of that, the subsequent author request will be over fetching, since it returns the whole author when all we want is the author’s name.
We could get around this by making the book details endpoint populate the author, but then we would be over fetching for cases where we don’t need the author data. We would also no longer be strictly following the REST spec because we are returning more than the specified resource.
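To make the two problems concrete, here is a minimal sketch that simulates the two REST calls in memory. The data, the field names beyond author_id, and the get helper are all made up for illustration; a real client would make HTTP requests instead.

```javascript
// In-memory stand-ins for two REST resources (all data is hypothetical).
const books = {
  1: { id: 1, title: 'Example Book', summary: 'A long summary...', genre: 'Fantasy', tags: ['a', 'b'], author_id: 7 },
};
const authors = {
  7: { id: 7, name: 'Example Author', bio: 'A long bio...', born: 1970 },
};

let requestCount = 0;

// Simulates GET /books/:id and GET /authors/:id: each call returns the whole resource.
function get(path) {
  requestCount += 1;
  const [, resource, id] = path.split('/');
  const table = resource === 'books' ? books : authors;
  return table[id];
}

// The page only needs a title and an author name.
function loadBookPage(bookId) {
  const book = get(`/books/${bookId}`);             // over fetches summary, genre, tags
  const author = get(`/authors/${book.author_id}`); // second round trip, over fetches bio, born
  return { title: book.title, author: author.name };
}

const page = loadBookPage(1);
console.log(page, `requests: ${requestCount}`);
```

Two requests were made and most of the returned fields were thrown away, all to render two strings.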
Is there a solution?
There is no great solution for the over/under fetching problem in a REST architecture. Generally, applications end up breaking away from the strict REST spec in order to get around these shortcomings. For example, they create custom endpoints for specific use cases, or make resourceful endpoints populate certain data dynamically based on custom parameters. In both of these cases, we are no longer truly following the REST spec.
Once you start customizing endpoints to accommodate specific client needs, those endpoints become less flexible. Now every time the UI’s data requirements change, you will most likely need to update the endpoint response. This means that UI changes often require backend changes as well. It’s also common for these customizations to lack documentation, which makes maintenance more difficult. Additionally, the API boundary gets more complicated as you keep creating “new” endpoints to fulfill a specific need.
What are the advantages of GraphQL?
Flexibility
Unlike REST, where we have a set of endpoints for each resource, a GraphQL API exposes only one endpoint. This single endpoint is capable of returning any combination of resources, with only the data points requested. The issue of over/under fetching is resolved because you never get more or less than what you ask for. That means fewer requests for the necessary data, and no more unnecessary data over the wire. Additionally, if the UI data requirements change, you most likely won’t need to make backend API changes because the endpoint allows for fetching any shape of data based on the defined schema. Instead of changing the endpoint’s response, all that’s necessary is to tweak the request to ask for different data.
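The idea of "ask for a shape, get back that shape" can be sketched in a few lines of plain JavaScript. This is only an illustration of the concept, not how a GraphQL server is implemented; the pick helper, the selection format, and the data are assumptions made for this example.

```javascript
// Hypothetical data: one book with an embedded author.
const book = {
  id: '1',
  title: 'Example Book',
  summary: 'A long summary...',
  genre: 'Fantasy',
  author: { id: '7', name: 'Example Author', bio: 'A long bio...' },
};

// `selection` mirrors the shape of a query: `true` picks a scalar field,
// a nested object picks fields of a nested resource.
function pick(source, selection) {
  const result = {};
  for (const [field, sub] of Object.entries(selection)) {
    result[field] = sub === true ? source[field] : pick(source[field], sub);
  }
  return result;
}

// Roughly equivalent to the query: { book { title author { name } } }
const view = pick(book, { title: true, author: { name: true } });
console.log(JSON.stringify(view));
// {"title":"Example Book","author":{"name":"Example Author"}}
```

If the page later needs the genre as well, only the selection changes; the function serving the data stays the same.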
Declarative
In a GraphQL API, you explicitly request the specific data you need, and you get back exactly what you requested. In contrast, with a request to a REST API endpoint you know you will get back the resource requested, but you don’t know exactly what that resource will contain. Are associations populated? Were fields added or removed from what we previously expected? Aside from looking at the API documentation, this is largely unknown. To make things worse, REST API documentation is prone to becoming outdated because maintaining documentation is hard. Maintaining documentation is far easier with GraphQL: it’s self-documenting! This is made possible by its use of a strongly typed schema, which enables introspection.
You will not be able to request data that isn’t in the schema definition, and you will know by looking at the magically generated documentation what queries and mutations are allowed, how they can be used, what the data types are, and more.
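As a rough illustration of why a typed schema makes invalid requests impossible, the sketch below checks a requested shape against a hand-written schema description. The schema object format and the validate helper are hypothetical simplifications; real GraphQL servers perform this validation for you.

```javascript
// A hand-written, simplified schema description (not graphql-js types).
const schema = {
  Book: { id: 'String', name: 'String', genre: 'String', author: 'Author' },
  Author: { id: 'String', name: 'String', books: 'Book' },
};

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Returns a list of error messages for requested fields the schema does not define.
function validate(typeName, selection, path = typeName) {
  const errors = [];
  const fields = schema[typeName];
  for (const [field, sub] of Object.entries(selection)) {
    if (!fields || !has(fields, field)) {
      errors.push(`Unknown field "${field}" on ${path}`);
    } else if (sub !== true) {
      // Nested selection: validate against the type of the field.
      errors.push(...validate(fields[field], sub, `${path}.${field}`));
    }
  }
  return errors;
}

const ok = validate('Book', { name: true, author: { name: true } });
const bad = validate('Book', { name: true, price: true, author: { email: true } });
console.log(ok.length, bad);
```

The first request passes with no errors; the second is rejected with one message for price and one for author.email before any data is fetched.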
Unifying
Oftentimes, complex systems require aggregating data across multiple services. It is not ideal for a client to have to request data from multiple services to get the data it needs. GraphQL can be used to unify these systems under one client-friendly, well-documented API. The GraphQL resolvers can do the hard work of fetching data from various external services, while all the client has to do is send the standard queries it always would. If one of those backend services changes or is no longer necessary, the client doesn’t need to know. The GraphQL schema and queries won’t change; only the way the data is resolved changes.
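Here is a minimal sketch of that idea, assuming two hypothetical backend services (catalogService and peopleService) stubbed with in-memory data. Real services would be called over the network and the resolvers would be asynchronous; they are synchronous here to keep the example short.

```javascript
// Two hypothetical backend services, stubbed with in-memory data.
const catalogService = {
  getBook: (id) => ({ id, name: 'Example Book', authorId: '7' }),
};
const peopleService = {
  getPerson: (id) => ({ id, fullName: 'Example Author' }),
};

// Resolver map: each field knows which service to call and how to
// translate that service's response into the shape the schema promises.
const resolvers = {
  Query: {
    book: (_parent, args) => catalogService.getBook(args.id),
  },
  Book: {
    author: (book) => {
      const person = peopleService.getPerson(book.authorId);
      return { id: person.id, name: person.fullName };
    },
  },
};

const book = resolvers.Query.book(null, { id: '1' });
const author = resolvers.Book.author(book);
console.log(`${book.name} by ${author.name}`);
```

If peopleService were replaced by another system tomorrow, only the Book.author resolver would change; clients would keep asking for author { name }.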
Federation
As described above, a GraphQL API exposes all backend data sources in a single, easy-to-query endpoint. However, as you start adding more data sources and adding complexity, you might decide to break the single GraphQL service into several composable services. This is possible with Schema Federation, which is a way to consolidate multiple GraphQL services into one. This allows you to have a much larger, more complex data graph that is made up of smaller, individually maintained GraphQL services.
If you’re looking for an introduction to Schema Federation, there are some great resources out there. Apollo Federation is one example of how it can be done with Apollo.
What are the disadvantages?
While there are a lot of reasons to use GraphQL over REST, it is important to keep in mind that it is not a silver bullet. Using GraphQL will not simply solve all your problems around inefficient data fetching. GraphQL won’t, for example, make your application fast if you are making expensive database queries that take forever. Let’s discuss a few disadvantages to consider before you decide GraphQL is the solution for you.
Hidden Complexity
GraphQL is a great way to reduce complexity on the client, but it doesn’t really remove the complexity as much as it just moves it. While the end result of a GraphQL API is very easy for the client to consume, you need to consider the computation required to generate the data on the server. Even though the client doesn’t need to know about it, the work to fetch the data still needs to happen somewhere (which would be in the resolvers). It is important to understand that you still need to find efficient ways of handling complex data fetching, particularly nested associations. Check out this article by @andrewingram that goes in depth into GraphQL optimizations.
Expensive Queries
Due to the highly flexible nature of GraphQL, consumers have the ability to ask for data in a fashion that could easily overload your server/database/network. If you don’t have a way of identifying and preventing these types of queries intelligently, you will have a bad time. Some methods of dealing with these potentially dangerous queries are:
- Size Limiting – Prevent queries that are larger than a given size from being run. This is the least useful approach in the real world. For one thing, dangerously expensive queries could still get through if they had short field names. Additionally, valid and inexpensive queries with extra long field names might be blocked unnecessarily.
- Query Whitelisting – Only allow whitelisted queries to run. This can work, but is not easy to maintain. You could opt to use a tool like persistgraphql to extract all queries from your client-side code for the whitelist, but this defeats the purpose of using GraphQL for its flexibility. Additionally, it doesn’t account for external API consumers who might want to use slightly different queries than what is whitelisted.
- Depth Limiting – Prevent queries that have a depth beyond a defined limit. Depth in this case refers to the level of nesting. So, you might say that queries can only be nested up to 5 levels deep. This can prevent some of the worst queries from going through. Check out graphql-depth-limit on GitHub; it is a great module to accomplish this in Node.
- Amount Limiting – Prevent queries from requesting more than a defined amount of a given object. This will prevent clients from asking for a million of a given object, which would be problematic for obvious reasons. One module you can use for this in Node is graphql-input-number.
- Query Cost Analysis – Some queries are problematically expensive even if they don’t exceed the defined depth and amount limits. To address this, you can analyze queries before they are run to calculate their complexity. If they are too expensive, they don’t get run. This is significantly harder to implement than the above solutions, and should only be done if truly necessary.
Check out this article by Max Stoiber (@mxstbr) on securing your GraphQL API from malicious queries for a more in-depth look at these methods.
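To give a feel for depth limiting, here is a minimal sketch that measures the nesting of a query represented as a plain object. The selection format, the depthOf and assertDepth helpers, and the counting convention (the outermost field counts as depth 1) are assumptions for this example; a library such as graphql-depth-limit works on the parsed query instead and may count differently.

```javascript
// Depth of a selection: a scalar field counts as 1, each level of nesting adds 1.
function depthOf(selection) {
  let deepest = 0;
  for (const sub of Object.values(selection)) {
    const depth = sub === true ? 1 : 1 + depthOf(sub);
    deepest = Math.max(deepest, depth);
  }
  return deepest;
}

// Rejects a query before it is executed if it nests too deeply.
function assertDepth(selection, maxDepth) {
  const depth = depthOf(selection);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds limit of ${maxDepth}`);
  }
  return depth;
}

// Stands in for: { books { name author { name books { name } } } }
const query = { books: { name: true, author: { name: true, books: { name: true } } } };
console.log(depthOf(query)); // 4
```

With a limit of 5 this query runs; with a limit of 3, assertDepth throws before any resolver is called.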
API Versioning
Unlike REST, it is recommended to avoid versioning GraphQL APIs, and instead favor continuous evolution of the API using provided tools such as field deprecation. This is great in theory, but can result in breaking changes if you are not careful. For example, if certain fields are removed when some clients still expect them to exist, the client would have some broken functionality. It’s important to be aware of what changes are breaking changes, and be diligent about deprecating fields without removing them completely.
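Field deprecation is done with the built-in @deprecated directive in the schema. The sketch below holds a schema definition in a JavaScript string; the title field and the reason text are hypothetical, chosen only to show a field being phased out in favor of another.

```javascript
// Schema definition language (SDL) held in a string.
const typeDefs = `
  type Book {
    id: String!
    name: String
    # Hypothetical legacy field: kept so existing clients keep working,
    # but flagged so tooling can warn anyone still requesting it.
    title: String @deprecated(reason: "Use name instead.")
  }
`;

console.log(typeDefs.includes('@deprecated')); // true
```

The deprecated field still resolves, so nothing breaks; it can be removed once no client is requesting it anymore.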
HTTP Caching
Unlike GraphQL, a REST API makes use of resource-specific endpoints with HTTP semantics. This allows for leveraging native HTTP caching to avoid unnecessary refetching of data that is still fresh. GraphQL always uses the same endpoint regardless of the query, and queries are typically sent as POST requests, so native HTTP caching has no way to differentiate between requests. It should be mentioned that, despite what you might have heard, you can use GET requests in GraphQL, and those can be cached, unlike POST requests. That said, you will run into issues with larger queries, because URLs have length limits that differ from browser to browser. Persisted queries can be a good way to get around this. It’s important to note that this limitation is specific to HTTP caching, not server-side or application caching. Here’s a great article that goes more in depth on GraphQL caching.
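As a rough sketch of both ideas, the code below builds a cacheable GET URL that carries the query in its query string, and derives a short, stable identifier from the query text in the spirit of persisted queries. The endpoint URL and the queryId helper are assumptions for this example; how a server looks up a persisted query by id depends on the implementation you use.

```javascript
const crypto = require('crypto');

// Hypothetical endpoint; substitute your own.
const ENDPOINT = 'https://example.com/graphql';

// Puts the query and its variables in the URL so the request can be a GET.
function buildGetUrl(query, variables = {}) {
  const params = new URLSearchParams({
    query,
    variables: JSON.stringify(variables),
  });
  return `${ENDPOINT}?${params}`;
}

// A fixed-length identifier for a query: the SHA-256 hash of its text.
function queryId(query) {
  return crypto.createHash('sha256').update(query).digest('hex');
}

const query = '{ books { name } }';
const url = buildGetUrl(query);
console.log(url);
console.log(queryId(query).length); // 64, no matter how long the query is
```

The URL grows with the query, which is where length limits bite; the identifier always stays the same size.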
Ready to get started?
If you’re convinced GraphQL is the solution for you, there are a few key components you should understand.
Schema
- Defines the structure and capabilities of the API.
- A collection of GraphQL Types.
- Serves as a contract between client and server.
Check out this GraphQL Schema Language cheat sheet.
Example Schema Types
type Author {
  id: String!
  name: String
  books: [Book]
}

type Book {
  id: String!
  name: String
  genre: String
  author: Author
}
Queries
- How the client requests information from a GraphQL server.
- Can request as much or as little information as necessary.
- Allows for naturally querying nested information.
- All possible queries are defined by the Schema.
Example Query
{
  books {
    name
    author {
      name
      books {
        name
      }
    }
  }
}
Example Query Response
{
  "data": {
    "books": [
      {
        "name": "Harry Potter and the Sorcerer's Stone",
        "author": {
          "name": "J.K. Rowling",
          "books": [
            {
              "name": "Harry Potter and the Sorcerer's Stone"
            },
            {
              "name": "Harry Potter and the Chamber of Secrets"
            },
            {
              "name": "Harry Potter and the Prisoner of Azkaban"
            },
            ...
          ]
        }
      }
      ...
    ]
  }
}
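From the client’s side, a query is usually sent as JSON in the body of an HTTP POST. The sketch below only builds the request; the buildRequest helper and the endpoint in the comment are hypothetical, and the actual call is left commented out so the example runs without a server.

```javascript
// Wraps a query (and optional variables) in the JSON body a GraphQL server expects.
function buildRequest(query, variables = {}) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables }),
  };
}

const query = `
  {
    books {
      name
      author { name }
    }
  }
`;

const request = buildRequest(query);
console.log(request.body);

// With a real server (hypothetical URL):
// fetch('https://example.com/graphql', request)
//   .then((res) => res.json())
//   .then(({ data, errors }) => console.log(data, errors));
```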
Mutations
- How to make changes to your data (create/update/delete).
- Similar syntax to queries.
- All possible mutations are defined by the Schema.
Example Mutation
mutation {
  addBook(
    name: "Harry Potter and the Half Blood Prince",
    authorId: "1",
    genre: "Fantasy"
  ) {
    name
  }
}
Example Mutation Response
{
  "data": {
    "addBook": {
      "name": "Harry Potter and the Half Blood Prince"
    }
  }
}
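On the server, a mutation field is backed by a function that performs the change and returns the affected object, from which the client’s requested fields are then read. A minimal sketch, assuming an in-memory array in place of a database; the resolver-map shape and the id scheme are illustrative only.

```javascript
// In-memory stand-in for a database table.
const books = [];

const resolvers = {
  Mutation: {
    // Receives the mutation's arguments, stores the book, and returns it.
    addBook: (_parent, args) => {
      const book = { id: String(books.length + 1), ...args };
      books.push(book);
      return book;
    },
  },
};

const created = resolvers.Mutation.addBook(null, {
  name: 'Harry Potter and the Half Blood Prince',
  authorId: '1',
  genre: 'Fantasy',
});
console.log(created.name, books.length);
```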
Resolvers
- Query/Mutation payloads consist of a set of fields.
- Each field corresponds to a resolver function in the server.
- The resolver will fetch data for its corresponding field.
Example Resolver
NOTE: this example is in Node, using graphql-js. The specific syntax and implementation will depend on your tech stack and chosen GraphQL implementation.
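// Assumes AuthorType and an Author data model with a findById method
// (for example, a Mongoose model) are defined elsewhere.
const { GraphQLObjectType, GraphQLID, GraphQLString } = require('graphql');

const BookType = new GraphQLObjectType({
  name: 'Book',
  // fields is a function so that types referencing each other can be declared in any order.
  fields: () => ({
    id: { type: GraphQLID },
    name: { type: GraphQLString },
    genre: { type: GraphQLString },
    author: {
      type: AuthorType,
      // parent is the Book being resolved; look up its author by the stored id.
      resolve(parent, args) {
        return Author.findById(parent.authorId);
      }
    }
  })
});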
Aliases
Aliases are a good way to query the same resource multiple times with different subsets of data. The following example lets us query for both read and unread books without having to do any extra logic on the client. This enables us to hydrate an entire view by having GraphQL send us different bits and pieces in specific data sets. You shouldn’t do this all the time; use with caution.
Example Query using an alias
query {
  # alias books to read
  read: books(read: true) {
    name
  }
  # alias books to unread
  unread: books(read: false) {
    name
  }
}
Looking for more information?
Apollo
If you are working in JavaScript land, I highly recommend looking into Apollo, one of the most popular GraphQL implementations in JavaScript. Below are some resources for Apollo.
- Apollo Client (React) – a community-driven effort to build an easy-to-understand, flexible and powerful GraphQL client.
- Apollo Server – the best way to quickly build a production-ready, self-documenting API for GraphQL clients, using data from any source.