GraphQL — What, Why and How

How to set up a basic GraphQL project using express-graphql, what root types are, and how best to structure the code in GraphQL projects.

Karthik Kalyanaraman
Bits and Pieces

In this post, I explain what problems GraphQL solves, how to set up a basic GraphQL project using express-graphql, what root types are, and how best to structure the code in GraphQL projects.

What and Why?

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. — graphql.org

The key point here is that GraphQL is a query language for “APIs”. Note the word “APIs”. It’s important to know that GraphQL is NOT a query language for databases. This is a common misconception, and there have been numerous debates about it on the internet and social media.

Ok, so what do you mean by a query language for APIs?

To understand this, let’s take a step back and look at how a client (a web or mobile app) calls APIs to get information from a server. For the purpose of this explanation, let’s assume that these APIs are RESTful (the most common API pattern).

For instance, let’s say we are building a medium.com for remote workers. Our backend is a NodeJS server with a Postgres database deployed on some cloud. We define APIs so that our ReactJS client application can consume them. Nothing fancy so far. Let’s think about what these APIs might look like:

POST /api/post -> Creates a post
PATCH /api/post -> Updates a post (edit, add claps etc.)
POST /api/user -> Creates a user
GET /api/users/:id -> Gets the user's metadata by id
GET /api/posts/:id -> Gets the post by id

Again, pretty basic right? Now, let’s think a little bit about front end development for this project and what kinds of use cases we might encounter.

Use case #1: As a medium.com user, I need a page where I can see all my stories.

In order to achieve this, the front end needs to do the following:

  • GET /api/users/:id -> This returns the user’s metadata, which probably looks something like this:
{
id: "someid",
fname: "karthik",
lname: "kalyan",
posts: ["postid1", "postid2", "postid3"],
}
  • For each post id in the posts list, call GET /api/posts/:id and post-process the returned data to build a list that looks like:
[
  {
    id: "postid1",
    summary: "intro to graphql",
    url: "mymedium.com/intro-to-graphql",
  },
  {
    id: "postid2",
    summary: "intro to rest",
    url: "mymedium.com/intro-to-rest",
  },
  {
    id: "postid3",
    summary: "intro to sql",
    url: "mymedium.com/intro-to-sql",
  },
]

This approach works, and we are able to gather the information needed to build this page. But to do so, we made 4 API calls in total (1 to get the user metadata and 3 to get the post metadata for 3 posts). We also threw away most of what the GET /api/posts/:id endpoint returned, which probably looked like:

{
  id: "postid1",
  content: "encoded/compressed content",
  claps: 50,
  summary: "intro to graphql",
  title: "GraphQL",
  comments: ["commentid1", "commentid2"]
}

We needed only the summary from each of these post objects. Also, for a user with 100 posts, the client needs to make 100 post calls on top of the user call, which is expensive, and the amount of information transferred between the database, server and client is far more than the client needs.
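The cost described above can be made concrete with a small simulation. This is only a sketch: the two functions stand in for the REST endpoints, the data lives in memory rather than behind real HTTP calls, and the url field on each post is an assumption added so the summary list can be built.

```javascript
// In-memory stand-ins for the server's data (hypothetical, no real HTTP involved).
const users = {
  someid: { id: "someid", fname: "karthik", lname: "kalyan", posts: ["postid1", "postid2", "postid3"] },
};
const posts = {
  postid1: { id: "postid1", content: "encoded/compressed content", claps: 50, summary: "intro to graphql", url: "mymedium.com/intro-to-graphql" },
  postid2: { id: "postid2", content: "encoded/compressed content", claps: 12, summary: "intro to rest", url: "mymedium.com/intro-to-rest" },
  postid3: { id: "postid3", content: "encoded/compressed content", claps: 7, summary: "intro to sql", url: "mymedium.com/intro-to-sql" },
};

let apiCalls = 0;

// Simulates GET /api/users/:id
function getUser(id) {
  apiCalls += 1;
  return users[id];
}

// Simulates GET /api/posts/:id
function getPost(id) {
  apiCalls += 1;
  return posts[id];
}

// What the client has to do to build the "my stories" page.
function buildStoriesPage(userId) {
  const user = getUser(userId);
  return user.posts.map((postId) => {
    const post = getPost(postId);
    // content and claps were transferred too, but are dropped here.
    return { id: post.id, summary: post.summary, url: post.url };
  });
}

const stories = buildStoriesPage("someid");
console.log(apiCalls); // 4: one user call plus one call per post
```

The call count grows linearly with the number of posts, which is the under-fetching half of the problem; the discarded fields are the over-fetching half.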

One way to optimize this is to introduce a new endpoint that looks like,

GET /api/posts/summary/:user_id

As you have probably guessed already, this endpoint fetches on the backend (with properly constructed SQL queries) only the information needed to display the user’s summary page, and returns only what the client needs.
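A minimal sketch of what the handler behind such an endpoint might do is shown below. The table contents and the function name getPostSummaries are hypothetical, and an array filter stands in for the SQL query a real backend would run.

```javascript
// Hypothetical in-memory "posts" table; a real backend would query Postgres instead.
const postsTable = [
  { id: "postid1", user_id: "someid", summary: "intro to graphql", url: "mymedium.com/intro-to-graphql", content: "encoded/compressed content", claps: 50 },
  { id: "postid2", user_id: "someid", summary: "intro to rest", url: "mymedium.com/intro-to-rest", content: "encoded/compressed content", claps: 12 },
  { id: "postid9", user_id: "otheruser", summary: "intro to css", url: "mymedium.com/intro-to-css", content: "encoded/compressed content", claps: 3 },
];

// Simulates the handler for GET /api/posts/summary/:user_id.
// It selects only the columns the summary page needs, for one user.
function getPostSummaries(userId) {
  return postsTable
    .filter((post) => post.user_id === userId)
    .map(({ id, summary, url }) => ({ id, summary, url }));
}

const summaries = getPostSummaries("someid");
console.log(summaries.length); // 2
```

One call now replaces the 1 + N calls, at the price of an endpoint that exists for exactly one page.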

With cleverly constructed APIs that fit the needs of the client and hyper-optimized DB queries, we can build fast APIs for this project. But software is hard, and things don’t always go according to plan. We are all prone to blind spots, and API design can often go wrong with ambitious projects like these. No matter how well we think about API design, there is a good chance we will run into under-fetching and over-fetching (of data) problems. The solution with REST is mostly one of:

  • Adding new endpoints
  • Overloading existing endpoints

GraphQL exists to solve this very problem with REST APIs.

How?

Now, let’s turn to the question of how to implement a GraphQL solution and what options are available. Before we look at the list of available options/projects, let’s build a simple GraphQL solution ourselves.

For the purpose of this explanation, we are going to build an ExpressJS server that uses GraphQL as the API layer. Let’s get started,

So, what is going on here? I am deliberately starting off in unknown territory and working backwards so that you get the full picture.

On lines 17–22, we instantiate the familiar Express app. On top of that, we also add a middleware called “graphqlHTTP”, which takes a bunch of properties. Our endpoint in this case is http://localhost:4000/graphql. Let’s see what is going on in the browser:

GraphiQL

We get a pretty interface, GraphiQL, which is nothing but a web-based client for GraphQL.

Before we go further, let’s look a little more closely at the graphqlHTTP middleware and lines 6–17.

  • The graphqlHTTP middleware lets us create a GraphQL HTTP server with any HTTP web framework that supports connect styled middleware, including Connect itself, Express and Restify.

Now, let’s look at the properties passed to the middleware:

  • As you can see, when we write a simple query { hello }, we get back { "data": { "hello": "Hello !" } }. The query and the data match exactly, one to one. In other words, we got exactly what we asked for. If you think about it, this is exactly the problem GraphQL solves.
  • Now, in order for GraphQL to know what kind of data types are possible in a service, GraphQL has the concept of schemas. Every GraphQL service defines a set of types which completely describe the set of possible data you can query on that service. Then, when queries come in, they are validated and executed against that schema. This is what we have done in line 5.
  1. We created a schema using GraphQLSchema
  2. Inside the schema, we have a type called “query” which is created using GraphQLObjectType
  3. The “query” type’s name is “Query” and has a field called “hello”
  4. The “hello” field is a GraphQLString and has a resolver function which returns “Hello !”

hello is a field on the Query type, which is a special (root) type in GraphQL. Each field on the Query type should have a resolver function that tells GraphQL how to resolve it.
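To see the role a resolver plays, here is a toy sketch in plain JavaScript. It is deliberately not graphql-js: runQuery is a hypothetical stand-in for the GraphQL executor, and the version field is an invented extra that the query never asks for.

```javascript
// One resolver function per field on the (toy) Query type.
const queryResolvers = {
  hello: () => "Hello !",
  version: () => "1.0", // hypothetical extra field, never requested below
};

// Toy executor: resolves only the fields the client asked for.
function runQuery(requestedFields) {
  const data = {};
  for (const field of requestedFields) {
    const resolve = queryResolvers[field];
    if (!resolve) {
      // A real GraphQL server rejects unknown fields during validation.
      throw new Error(`Cannot query field "${field}"`);
    }
    data[field] = resolve();
  }
  return { data };
}

const result = runQuery(["hello"]);
console.log(JSON.stringify(result)); // {"data":{"hello":"Hello !"}}
```

The response has the same shape as the request: version has a resolver, but it is never called because the query did not ask for it.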

A query against this type can be written like this,

{
hello
}

or

query {
hello
}

Now, let’s think about our medium.com example. Let’s go ahead and define our GraphQL types.

For the sake of this example, I have defined two variables, users and posts, for storing the data instead of using a real database. On lines 26 and 35, I have defined the corresponding schema types for users and posts. These are basically wrappers around the actual structure using the GraphQL APIs.

Notice carefully how I have explicitly declared the type of each field as GraphQLString or GraphQLList (in the case of posts). Also, the posts field in userType is a GraphQLList of postType.

As we know, users to posts is a one-to-many relationship. And, ideally we want our API to return the user object based on the id we pass. We would also like to populate our user data with the actual posts associated with the user and not just the post id. So, we need data that looks like,

{
  "data": {
    "user": {
      "name": "karthik",
      "posts": [
        {
          "id": "post1",
          "summary": "Intro to graphql"
        }
      ]
    }
  }
}

Let’s go ahead and write a queryType that resolves to the data that looks like the above code block.

OK, what’s going on here?

  • I have defined a queryType called UserQuery.
  • It has two fields, post and user
  • post takes an id as argument to its resolver and returns the corresponding post from db
  • user takes an id as argument to its resolver and resolves not only the user, but also the data for all the posts associated with that user.
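The two resolvers described above could look roughly like the following. This is a sketch under assumptions: it uses plain functions with the (root, args) signature that graphql-js resolvers receive, but it does not build a real schema, and the in-memory data shape (posts carrying a user_id) is assumed rather than taken from the original code.

```javascript
// In-memory data standing in for the database (assumed shape).
const users = [
  { id: "user1", name: "karthik" },
  { id: "user2", name: "john" },
];
const posts = [
  { id: "post1", summary: "Intro to graphql", user_id: "user1" },
  { id: "post2", summary: "Intro to rest", user_id: "user2" },
];

const userQueryResolvers = {
  // post(id): returns the matching post, or undefined if there is none.
  post: (root, { id }) => posts.find((post) => post.id === id),

  // user(id): returns the user with the actual posts filled in,
  // not just a list of post ids.
  user: (root, { id }) => {
    const user = users.find((candidate) => candidate.id === id);
    if (!user) return null;
    const userPosts = posts
      .filter((post) => post.user_id === id)
      .map((post) => ({ id: post.id, summary: post.summary }));
    return { ...user, posts: userPosts };
  },
};

const user = userQueryResolvers.user(null, { id: "user1" });
console.log(JSON.stringify(user));
```

In a real server the GraphQL executor calls these functions and then trims the returned object down to the fields named in the query.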

Let’s go ahead and run the query from the browser.

graphiql — Example

As you can see, we asked for the data of the user with id “user1”, and we asked only for the name and the posts. We got exactly what we asked for. Let’s go ahead and add the id of the user too,

graphiql — Example

Again, we got what we asked for. Let’s pause for a moment and think about how we could have achieved the same without GraphQL. In this example, the PostgreSQL database has two tables that look like:

users
| id    | name    |
| user1 | karthik |
| user2 | john    |

posts
| id    | summary | user_id |
| post1 | ...     | user1   |
| post2 | ...     | user2   |

Notice how we map the one-to-many relationship between users and posts by having a user_id column in the posts table. Each user can have more than one post, but each post is created by exactly one user.

Typically, in order to fetch a user object by its id along with all the related posts, we can write a SQL query that looks like,

SELECT users.id, users.name, posts.id AS post_id, posts.summary
FROM users
INNER JOIN posts
ON users.id = posts.user_id
WHERE users.id = 'user1';

Or, we can access the same data by using an ORM such as Sequelize, which is nothing but a wrapper on top of low-level SQL queries.

This query is wrapped in a function, which we may call getUserInformation(id). In a REST architecture, we will have an endpoint that looks like,

GET /api/users/:id

This endpoint will internally call the getUserInformation(id) function on the server to resolve the data and provide it back to the client.

Notice the list of things we do to achieve the same with REST:

  • Write a SQL query or use an ORM to make API calls to the DB
  • Wrap it up using a function or a class on the server
  • Create and expose an endpoint that calls into this function

In REST, we follow the above 3 steps repeatedly for almost every endpoint.
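The three steps can be sketched end to end as follows. Everything here is illustrative: arrays stand in for the Postgres tables, handleGetUser is a hypothetical framework-agnostic handler rather than a real Express route, and only getUserInformation is a name taken from the text.

```javascript
// Step 1: data access. A real server would run the SQL query or call an ORM here.
const db = {
  users: [
    { id: "user1", name: "karthik" },
    { id: "user2", name: "john" },
  ],
  posts: [
    { id: "post1", summary: "Intro to graphql", user_id: "user1" },
    { id: "post2", summary: "Intro to rest", user_id: "user2" },
  ],
};

// Step 2: wrap the query in a function on the server.
function getUserInformation(id) {
  const user = db.users.find((candidate) => candidate.id === id);
  if (!user) return null;
  // Emulates joining posts on posts.user_id = users.id for this one user.
  const posts = db.posts
    .filter((post) => post.user_id === id)
    .map((post) => ({ id: post.id, summary: post.summary }));
  return { ...user, posts };
}

// Step 3: expose it through an endpoint, here GET /api/users/:id.
function handleGetUser(params) {
  const user = getUserInformation(params.id);
  if (!user) return { status: 404, body: { error: "User not found" } };
  return { status: 200, body: user };
}

const response = handleGetUser({ id: "user1" });
console.log(response.status); // 200
```

Each new client need tends to repeat all three layers, which is where the redundancy discussed next comes from.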

This approach is great and works really well. But some of the downsides to this approach include,

  • There is a good chance the DB Wrapper APIs will blow up at some point with a lot of redundancy when the project grows and the team size grows
  • The DB Wrapper APIs are tied to the DB schema which means, we cannot trim down or alter DB tables with confidence
  • The number of API endpoints will also blow up if we try to limit the scope of operation for each API. On the other hand, if we overload existing API endpoints, we run into the problem of over-fetching, thereby wasting bandwidth and increasing latency.

Although it’s nice to have the freedom to write specific SQL queries for specific needs,

With great freedom comes great responsibility.

“Ok! So, now you may be tempted to ask, how does GraphQL talk to the DB? Does it have a mechanism to translate GraphQL queries to SQL queries (for MySQL or PostgreSQL)? If so, am I transferring too much responsibility to a new technology without knowing exactly how it is going to impact my performance?”

All of these questions are very valid. The answer is yes and no. Let me elaborate.

Does GraphQL translate GraphQL queries to SQL queries?

The short answer is, it depends! It entirely depends on the GraphQL solution you adopt. For instance, in the case of medium.com for remote workers, we have written resolvers for user and post. If you notice carefully, resolvers do the heavy lifting of reading information from the database, slicing and dicing it and providing it back to the GraphQL server. So, in this case you have two options:

  • Write SQL queries, wrap them in a function and call that function from the resolver.
  • Use an ORM like Sequelize and call the Sequelize APIs inside the resolver.

As you can see, this is no different than what we do with REST. This gives you the control and freedom to talk to the DB and also layer it with a GraphQL stack on top to reap its benefits.

But, if you consider other options like,

These solutions are GraphQL engines that have a compiler that automatically learns your database schema and generates efficient queries to fetch data.

Again, it depends on the needs and requirements of the project and you alone can determine what works the best for you.

So far, we have seen how to read using GraphQL. But, there are 4 operations associated with any database.

C R U D -> Create, Read, Update and Delete

Let’s move on to Create, Update and Delete. Enter Mutations.

Let me remind you again that Query is a special GraphQL type. Similarly, Mutation is another special GraphQL type.

Let’s go ahead and create a mutation that lets us create a user and create a post for the user.

As you can see, it follows a pattern very similar to queryType. We have defined a new variable, mutationType, that has two fields, createPost and createUser.

Again, the resolver is hand-crafted with business logic that involves:

  • Generating a new unique id for the post
  • Creating a new post object and adding it to posts
  • Associating the id of the newly created post with the user using the user_id argument.
mutation — Example

As you can see, we are able to create a new post and associate it with the user. And because, on line 6, we tell our createPost mutation to return a userType, we get back the corresponding user object with up-to-date posts information after we create a new post.
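The business logic inside the createPost resolver might look something like this sketch. It is a plain function with the (root, args) resolver signature, not a full graphql-js mutation type; the in-memory data and the counter-style id are assumptions for illustration.

```javascript
// In-memory data standing in for the database (assumed shape).
const users = [{ id: "user1", name: "karthik" }];
const posts = [{ id: "post1", summary: "Intro to graphql", user_id: "user1" }];

const mutationResolvers = {
  createPost: (root, { summary, user_id }) => {
    const user = users.find((candidate) => candidate.id === user_id);
    if (!user) throw new Error(`Unknown user: ${user_id}`);

    // Generate a new id. A length-based counter is only unique while
    // posts are never deleted; a real system would use a UUID or a DB sequence.
    const id = `post${posts.length + 1}`;

    // Create the post and associate it with the user via user_id.
    posts.push({ id, summary, user_id });

    // Return the user with up-to-date posts, mirroring a mutation
    // whose return type is the user type.
    return { ...user, posts: posts.filter((post) => post.user_id === user_id) };
  },
};

const updatedUser = mutationResolvers.createPost(null, {
  summary: "Intro to rest",
  user_id: "user1",
});
console.log(updatedUser.posts.length); // 2
```

Because the resolver body is ordinary code, the same pattern covers updates and deletes: only the logic inside the function changes.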

It’s important to note that Mutation encompasses Create, Update and Delete as we control what the operation does inside the resolver function.

So far, we have seen the two main root types of GraphQL, Mutation and Query. I have shown how to create GraphQL types using GraphQLObjectType and other GraphQL types, and how to create a schema using GraphQLSchema.

But this code is very tightly coupled, with the type definitions, resolvers and schema all defined together. There is another, more elegant approach to structuring a GraphQL project.

Code Structuring

To keep the schema separate and flexible, we can create a new file called schema.graphql, which looks like this,

schema.graphql

This file defines the types, Query, Post and User.
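As a rough idea of the contents, the type definitions might read as below. The field names and argument types are assumptions inferred from the examples in this post. The schema text would live in schema.graphql; it is wrapped in a JavaScript template string here only to keep all examples in one language.

```javascript
// Assumed contents of schema.graphql, written in GraphQL schema language.
const typeDefs = `
  type Post {
    id: String
    summary: String
  }

  type User {
    id: String
    name: String
    posts: [Post]
  }

  type Query {
    user(id: String): User
    post(id: String): Post
  }
`;

console.log(typeDefs.includes("type Query")); // true
```

Keeping the types in schema language makes the API contract readable on its own, without any resolver code around it.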

Now, let’s move our resolvers to a separate file too.

resolvers.js

As you can see, the query resolver we wrote initially has been refactored into the resolver that we see on line 25. Also notice that we have defined a resolver for each field of the User type. The user resolver simply passes the ‘id’ argument down to its child fields: id, name and posts. The child fields can access the ‘id’ passed down from their parent through the root argument and thereby resolve themselves separately.

This way we can resolve each field separately. Structuring resolvers this way makes the code cleaner and easier to debug, with a clear separation of concerns.
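A sketch of resolvers structured this way follows. The nested shape (type name, then field name) is the kind of resolver map that schema-building tools accept, but the data and the hand-rolled resolution at the end are assumptions for illustration; in a real server the GraphQL executor makes those calls.

```javascript
// In-memory data standing in for the database (assumed shape).
const users = [{ id: "user1", name: "karthik" }];
const posts = [{ id: "post1", summary: "Intro to graphql", user_id: "user1" }];

const resolvers = {
  Query: {
    // Only passes the id down; the User field resolvers do the lookups.
    user: (root, { id }) => id,
  },
  User: {
    // Here root is whatever Query.user returned, i.e. the user id.
    id: (root) => root,
    name: (root) => users.find((user) => user.id === root).name,
    posts: (root) => posts.filter((post) => post.user_id === root),
  },
};

// Resolving { user(id: "user1") { name posts { summary } } } by hand,
// to show how the parent's return value flows into each child resolver.
const parent = resolvers.Query.user(null, { id: "user1" });
const result = {
  name: resolvers.User.name(parent),
  posts: resolvers.User.posts(parent).map((post) => ({ summary: post.summary })),
};
console.log(JSON.stringify(result));
```

A useful side effect of this layout is that a field such as posts is only looked up when a query actually asks for it.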

Now, let’s look at how we can import the resolvers and type definitions from resolvers.js and schema.graphql into our app entry file, index.js

index.js

We make use of the “graphql-tools” (which is part of apollo-graphql) package that provides us with the makeExecutableSchema API. This API takes in typeDefs and resolvers as arguments and creates a schema that can be directly passed to the graphqlHTTP middleware.

The end result of this code restructuring gives us the exact same functionality with a code base that is much more flexible, re-usable and scalable.

Options

GraphQL has been implemented in various projects that cater to different stacks. Some of the popular ones include:

Conclusion

I hope you enjoyed reading and learning the basics of GraphQL and bootstrapping a GraphQL project from scratch. Feel free to leave comments/feedback if you have any questions. Thanks for your time!
