How to design a kick-ass GraphQL schema
Adam Hannigan
Engineering Team Lead
This article provides practical tips that will help you design an intuitive, scalable and powerful GraphQL schema.
What is a GraphQL schema?
A schema is a structural representation of a product domain. It describes the key concepts of your product, the relations between these concepts and the core actions your system supports.
GraphQL is simply a tool that lets us interact with the schemas we define through an easy-to-use syntax. It does not enforce any standards around where the data comes from or how we define our schema.
Because there are no enforced guidelines, good schema design is often forgotten about until it is too late. This leads to schemas that are hard to understand, difficult to maintain and near impossible to scale as new features are introduced.
Product Driven
A significant advantage of GraphQL is that it lets you create an API that is intuitive to engineers and product teams. A GraphQL schema should reveal the items, fields and actions that your end-users will interact with.
Databases are structured and designed in a highly technical and performant manner. GraphQL lets us simplify these structures into items and actions that more closely reflect the nature of our product.
"First of all we have to be experts at our domain ... Second of all we have to be good at GraphQL" - GraphQL Schema Design @ Scale (Marc-André Giroux)
Shopify has standardised their schema design in a friendly README. The main takeaway is that when designing a schema, the API does not need to directly model the user interface: "the implementation and the UI can both be used for inspiration and input into your API design, but the final driver of your decisions must always be the business domain."
Why?
- Easier to understand a product than a complicated data architecture
- Helps promote a common product language that engineers, designers and all stakeholders can use to discuss and iterate on complex concepts.
Example
Imagine we are creating a system that lets employees submit leave requests in their company. The main items in our product would be the Employees, the Admins and the Leave Requests. The main actions would be requesting and approving a leave request.
Below is the database table that was created to represent users in our system.
user_db_table
| Column | Type |
|---|---|
| id | INTEGER |
| is_admin | BOOLEAN |
| e_id | VARCHAR |
| full_name | VARCHAR |
Bad Schema Design
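As an illustrative sketch (the type, field and argument names below are assumptions that simply mirror the columns of user_db_table), a schema generated straight from the table might look like this:

```graphql
# Hypothetical schema that mirrors user_db_table column for column.
type User {
  id: ID!
  is_admin: Boolean!
  # Opaque column name leaks into the API; its meaning is unclear to consumers.
  e_id: String
  full_name: String!
}

type Query {
  # Callers must pass is_admin every time to separate admins from employees.
  users(is_admin: Boolean): [User!]!
}
```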
Good Schema Design
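A product-driven alternative to mirroring user_db_table could model each concept as its own type. This is only a sketch: the field names are hypothetical, and e_id is assumed to hold an employee identifier.

```graphql
# Hypothetical product-driven schema: one type per concept in the product.
type Employee {
  id: ID!
  fullName: String!
  # Assumed meaning of the e_id column; exposed only on Employee.
  employeeNumber: String!
}

type Admin {
  id: ID!
  fullName: String!
}

type Query {
  employees: [Employee!]!
  admins: [Admin!]!
}
```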
By splitting the table into two concepts, we have correctly encapsulated the behaviours and fields for each type of User. Instead of passing an is_admin argument every time we want to fetch users, we can easily interact with Employees and Admins when implementing features.
Additionally, we have abstracted away the e_id column into a descriptive field that is only associated with an Employee. This prevents confusion around what the field is and also indicates to our engineers that this field is only used for Employees.
Think carefully about what you want to expose
"It is easier to add a field than it is to remove a field" - Rule #4 Shopify Schema standards
It is always better to handcraft a schema to ensure you are creating a usable API. Tools that generate a schema from a database are tempting but should be avoided, as they act as a thin middleware that adds no product value on top of the underlying structure. Think carefully before you add a field or entity: the more we expose, the less focussed the schema becomes and the more confusing it is for our engineers.
Michael Watson suggests slowly evolving your graph based on your clients' use cases in "The Do's and Don'ts for your schema and GraphQL operations". The main takeaway here is to ensure each field and resolver has a use case before you add it to your schema.
Simplify complex structures
While designing and scaling databases we often end up with a spider-web of tables and relationships from multiple different data sources. These structures are needed for optimisation and implementation of complicated features.
However, this leads to headaches for new engineers and confused product managers. GraphQL lets us abstract the underlying architecture into a friendly API that engineers can understand, interact with and use to implement features sooner.
A good rule to follow is that a resolver should not expose the underlying data source and should reflect a single concept in the product.
Aggregate fields on the server side
Where possible, perform complex calculations server side and expose them as a value within the product. This helps us reuse the logic, avoids "client consumers having to manipulate the data" (Apollo) and reduces the cognitive load on the front-end developer.
Example
In our HR company, leave requests are considered accepted when they meet any of the following criteria: an admin has manually approved it, the type of leave is medical or the date requested is more than 6 months in the future.
Below is what a typical GraphQL response would look like.
Bad Output Design
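A sketch of a schema that produces this kind of response exposes only the raw inputs to the approval rule. All field and enum names here are hypothetical.

```graphql
# Hypothetical field names. Every client has to combine these three
# fields itself to work out whether the request counts as approved.
enum LeaveType {
  ANNUAL
  MEDICAL
}

type LeaveRequest {
  id: ID!
  approvedByAdmin: Boolean!
  type: LeaveType!
  # ISO 8601 date the leave is requested for, e.g. "2030-01-31".
  requestedDate: String!
}

type Query {
  leaveRequest(id: ID!): LeaveRequest
}
```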
The problem here is that we have to calculate whether a Leave Request is approved every time we use this in our UI. A business rule should live in a central place, not scattered throughout the code base. This adds complexity, and each engineer must understand the exact conditions under which a leave request becomes approved.
Good Output Design
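One way this could look, assuming the server resolves the approval rule itself (everything other than the isApproved field name is hypothetical):

```graphql
type LeaveRequest {
  id: ID!
  # Resolved on the server from the approval rules: admin approval,
  # medical leave, or a requested date more than 6 months away.
  isApproved: Boolean!
}

type Query {
  leaveRequest(id: ID!): LeaveRequest
}
```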
We now reuse this logic and it is much simpler to interact with a single isApproved field.
Note: if your product also exposes the individual fields to users, supply these in your schema as well.
Create a schema that can be configured easily
"Build APIs that stand the test of time" - GitHub
The key message here is to design a schema that lets you easily add features and seamlessly deprecate areas of your product. This is especially important within agile development. We need to iterate quickly around the customer needs and deliver value as soon as possible.
Concrete database structures are difficult to iterate on as they require complicated migrations. Luckily, GraphQL schemas are more malleable and, when designed correctly, are simple to configure.
ALWAYS use a single input object for a mutation
This makes it much easier to add fields and also makes it easier to deprecate fields.
For this example, we will create a mutation that lets an Employee request a LeaveRequest.
Bad Output Design
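A minimal sketch of a mutation with one argument per value might look like the fragment below. The argument names and the Boolean return type are assumptions, and the Query root is omitted for brevity.

```graphql
enum LeaveType {
  ANNUAL
  MEDICAL
}

type Mutation {
  # Four separate arguments; every new piece of data needs another one.
  requestLeave(
    employeeId: ID!
    type: LeaveType!
    date: String!
    description: String!
  ): Boolean!
}
```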
This is the approach I was guilty of when first creating mutations. The problem here is that it makes it very difficult to add functionality to the mutation or remove functionality from it.
If we wanted to add functionality such as an image attachment, the only way we could achieve this is by adding a fifth argument. This clearly does not scale well and is difficult to maintain.
Using a single argument also makes execution easier on the client side.
Good Output Design
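The same mutation with a single input object could be sketched as follows. The input type name and its fields are hypothetical, the return type is left as a plain Boolean so that only the arguments change, and the Query root is omitted for brevity.

```graphql
enum LeaveType {
  ANNUAL
  MEDICAL
}

# All arguments live on one input object, so new fields are added here
# rather than to the mutation signature.
input RequestLeaveInput {
  employeeId: ID!
  type: LeaveType!
  date: String!
  description: String!
}

type Mutation {
  requestLeave(input: RequestLeaveInput!): Boolean!
}
```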
This pattern allows us to add optional fields to the input object without introducing any breaking changes.
If we wanted to deprecate the description field, we could make the field nullable, add a deprecation reason and phase it out of our front-end code.
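As a hedged sketch of that deprecation step: the @deprecated directive can be applied to input fields in servers that implement the October 2021 GraphQL specification or later, so check your own server before relying on it. The input type and the reason text are hypothetical, and the Query root is omitted for brevity.

```graphql
enum LeaveType {
  ANNUAL
  MEDICAL
}

input RequestLeaveInput {
  employeeId: ID!
  type: LeaveType!
  date: String!
  # Now nullable, so clients can stop sending it before it is removed.
  description: String @deprecated(reason: "No longer used; omit this field.")
}

type Mutation {
  requestLeave(input: RequestLeaveInput!): Boolean!
}
```

Making the field nullable first matters because a client cannot stop sending a field that is still required.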
You will always want to return a single output object, not a value
"When working with mutations it is considered good design to return mutated records as a result of the mutation. This allows us to update the state on the frontend accordingly and keep things consistent" - Atheros
For similar reasons, returning an object in the response makes it easier to add fields to, and remove fields from, that response. Our clients can continue using these endpoints without breaking changes when we want to extend functionality and add new fields to the response.
Using the above example, we will also return the new Leave Request so we can update our UI without having to perform a second request.
Good Output Design
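A sketch of a mutation that returns an object wrapping the new Leave Request could look like this. The payload type name and all field names are assumptions, and the Query root is omitted for brevity.

```graphql
enum LeaveType {
  ANNUAL
  MEDICAL
}

type LeaveRequest {
  id: ID!
  type: LeaveType!
  date: String!
  isApproved: Boolean!
}

input RequestLeaveInput {
  employeeId: ID!
  type: LeaveType!
  date: String!
  description: String
}

# Hypothetical payload object: new fields can be added here later
# without breaking existing clients.
type RequestLeavePayload {
  leaveRequest: LeaveRequest!
}

type Mutation {
  requestLeave(input: RequestLeaveInput!): RequestLeavePayload!
}
```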
Map mutations in GraphQL to a specific user flow
Mapping our mutations to actions in our product ensures that we create smaller and more focussed requests.
"Avoid trying to build 'One Size Fits All' API that supports mobile, desktop and all features. Embrace different use cases and clients and build around that." - GitHub.
Why?
- By making smaller, intuitive actions, we make it easier to understand & reason about what a specific endpoint does
- Less code breaks: if fewer endpoints use the same generic mutation, we reduce the impact of our changes
- Steers us towards good architectural patterns, especially "Single Responsibility"
Anemic Design
Anemic Design is an anti-pattern where you design your system as purely data, without any behaviour built in. Simply put, it means that when you want to change some underlying state, you interact directly with the data layer using generic create, read, update and delete methods. In Anemic Design, business rules and behaviours exist, but they live inside the engineer's brain.
For this example, an Admin wants to approve a LeaveRequest.
Bad Output Design
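An anemic version of this action might be a generic update mutation, sketched below with hypothetical names; the Query root is omitted for brevity.

```graphql
enum LeaveType {
  ANNUAL
  MEDICAL
}

type LeaveRequest {
  id: ID!
  type: LeaveType!
  date: String!
  description: String
  isApproved: Boolean!
}

# Generic CRUD-style input: the whole record must be resent just to
# flip the approval state, and nothing says which changes are allowed.
input UpdateLeaveRequestInput {
  id: ID!
  type: LeaveType!
  date: String!
  description: String
  isApproved: Boolean!
}

type Mutation {
  updateLeaveRequest(input: UpdateLeaveRequestInput!): LeaveRequest!
}
```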
Why should we avoid anemic design?
- You have to send the entire payload of what you need to update or create.
- Engineers need to understand the underlying data structure and the side effect of changing each field
- A single mutation has to cater for lots of different use cases
Good Output Design
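A more focussed mutation for the approval flow could be sketched like this. The type names are hypothetical, the approving Admin is assumed to come from the authenticated request rather than from an argument, and the Query root is omitted for brevity.

```graphql
type LeaveRequest {
  id: ID!
  isApproved: Boolean!
}

# Only the data this one action needs.
input ApproveLeaveRequestInput {
  leaveRequestId: ID!
}

type ApproveLeaveRequestPayload {
  leaveRequest: LeaveRequest!
}

type Mutation {
  approveLeaveRequest(input: ApproveLeaveRequestInput!): ApproveLeaveRequestPayload!
}
```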
The action is now more specific, there is less room for side effects and our logic can become more focussed inside the resolver.
Use consistent naming conventions
Shopify Rule #9: Choose field names based on what makes sense, not based on the implementation or what the field is called in legacy APIs.
The main takeaway here is to use a standard that works for your team and to stick with it. Consistent naming lets your team instantly understand what a specific resolver or mutation does.
A common rule throughout the industry is to use verb first, noun second.
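Applied to the leave request product, verb-first naming might look like the fragment below. The mutation names and the simplified signatures are illustrative assumptions, and the Query root is omitted for brevity.

```graphql
type LeaveRequest {
  id: ID!
  isApproved: Boolean!
}

type Mutation {
  # Verb first, noun second: the name reads as the action a user performs.
  requestLeave(employeeId: ID!): LeaveRequest!
  approveLeaveRequest(leaveRequestId: ID!): LeaveRequest!
  # Avoid noun-first or implementation-flavoured names such as
  # leaveRequestApprove or updateLeaveRequestRow.
}
```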
Final Note
When building consumer software, we want an API that reflects our product. GraphQL was built to "give clients the power to ask for exactly what they need" and to make "it easier to evolve APIs over time" - graphql.org.
It is time to expand our knowledge beyond principles we learnt from REST- and SOAP-based endpoints. In order to create a kick-ass schema, think carefully about the entities, fields and actions you want to expose.
Take the time to ensure your schema is flexible and tightly coupled to your business domain. Early forethought about your schema design will make your front-end engineers' jobs a lot smoother, help new starters onboard faster and ensure quicker iterations on features that improve the lives of your users.
Article originally posted by Adam Hannigan on Medium.com