At the last meeting of the San Francisco Django Meetup Group, Manish Sinha gave a talk on migrating from REST to GraphQL in Django. His talk touches on what REST and GraphQL are and moves into how to migrate a Django app that serves REST to one that serves GraphQL. Specifically, he takes an existing application built using Django REST framework and converts it to use Graphene (https://github.com/graphql-python/graphene-django). How to write tests, performance concerns, and areas for improvement are also covered. Enjoy the video!
Video Transcript
Manish: Hi everybody. My name is Manish, and I will be talking about migrating in Django from a REST app to an app that supports GraphQL. And those are my contact details. Quick show of hands, who knows what GraphQL is? Okay. Quick show of hands, who has used GraphQL on the client side, regardless of whatever the backend is? Cool. Awesome.
These will be the goals for us today. I’m going to give a quick refresher on REST and GraphQL. I want to talk about how we read data using Django REST framework or using Graphene, which I’ll talk about more, and then how we write data, authentication, authorization, performance, and documentation support. And no presentation in the [inaudible 0:01:01] will be complete without a pitch, so I will put a pitch at the end.
I’m going to make this a little interactive. I do a lot of training and teaching, so I would love your help in all that. A little bit about who I am: I started my career in the Obama administration, went to Wall Street, worked there, and now I run a small consulting shop called North Star Labs where we train hires and we make software, mostly with Django, for the people on the screen.
REST. In the beginning of time there wasn’t REST. But REST came, and we built so many apps and endpoints to exchange data. We did it using the HTTP methods up here. We use those to signal whether we’re making an operation that changes data or reads data. Quick show of hands, who has used Django REST framework before to perform all of these things? Perfect! Cool!
So we know [inaudible 0:02:15 song and dance/JSON] here. We use GET methods for endpoints where we want to read data. And in this example we’re reading a movie. We might want to retrieve one particular movie, so we specify an ID in the URL path and we get something back. If we want to create a movie, we POST to our back-end. And we usually supply some sort of payload in the body of that HTTP request. The back-end does whatever it wants, usually it pushes the data into the database, and we get some sort of signal back. Same with PUT and DELETE. We know that in REST we use HTTP status codes to tell us whether the operation succeeded or not. So with REST, there is this concept of retrofitting it into HTTP: we’re leveraging status codes, we’re leveraging HTTP methods. This is a style of data transfer that was retrofitted into the HTTP protocol. That changes with GraphQL.
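The slide is not reproduced in the transcript. As a rough plain-Python sketch of the convention described above (the method says what kind of operation it is, the status code says how it went), something like the following could stand in. The `MOVIES` data, the `handle` function, and the choice of status codes are illustrative assumptions, not code from the talk or from Django REST framework.

```python
from http import HTTPStatus

# In-memory stand-in for a database table of movies (hypothetical data).
MOVIES = {1: {"id": 1, "title": "Alien"}}


def handle(method, movie_id=None, payload=None):
    """Return (status, body) for a request against a hypothetical movies resource."""
    if method == "GET":
        if movie_id is None:
            return HTTPStatus.OK, list(MOVIES.values())
        if movie_id not in MOVIES:
            return HTTPStatus.NOT_FOUND, None
        return HTTPStatus.OK, MOVIES[movie_id]
    if method == "POST":
        # Create: the payload comes from the request body.
        new_id = max(MOVIES, default=0) + 1
        MOVIES[new_id] = {"id": new_id, **payload}
        return HTTPStatus.CREATED, MOVIES[new_id]
    if method == "PUT":
        if movie_id not in MOVIES:
            return HTTPStatus.NOT_FOUND, None
        MOVIES[movie_id] = {"id": movie_id, **payload}
        return HTTPStatus.OK, MOVIES[movie_id]
    if method == "DELETE":
        if MOVIES.pop(movie_id, None) is None:
            return HTTPStatus.NOT_FOUND, None
        return HTTPStatus.NO_CONTENT, None
    return HTTPStatus.METHOD_NOT_ALLOWED, None
```

Calling `handle("GET", 99)` returns a 404 status with no body, which is the kind of signal a REST client is expected to interpret.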
Quickly on Django REST framework: super active, super popular library, you should use it if you have it. The rest of this stuff will be Graphene and GraphQL, but I do want to make a plug for that. A guy named Tom Christie runs the project; he’s personally answered questions I’ve had about Django REST framework. An awesome guy. I think everyone in this room stands on the shoulders of giants; he’s one of those giants. Please donate money to him. I’m not affiliated with the project in any way, but please support open-source.
Okay. GraphQL. So we know that there are queries, which are operations in GraphQL that read data, so they don’t change anything. We know there are mutations. These are operations that write data or change data. And then there are subscriptions, which I personally haven’t used, but my guess is it’s a way to use WebSockets over GraphQL, if I could take a guess, but I’m not sure. Can anyone in the audience give some pros and cons of GraphQL over REST? Why was GraphQL created?
AM: Too much data can come over the wire, or too little data can come over the wire, so you avoid multiple requests, and you only get the data you need back.
Manish: Over-fetching and under-fetching. In the REST world we have to create an endpoint for everything. And if we get data back that we then need to find out more about, well, guess what, that means another request over the wire to that endpoint. Awesome! Thank you! Anybody else? Why would one use GraphQL? Yes?
AM: To be able to get a snapshot of what things look like without having to look in a million places.
Manish: Yes. I will just tease that out a little bit. GraphQL is embedded in a schema, and so right now in the REST world, what we do if we're talking to frontend developers is, well, we spin up a little Swagger instance, or we do worse, we talk over Slack and email, and that’s a huge pain in the ass, because we're just talking about stuff that’s already in the code. Well, with GraphQL you tell a frontend dev, hey, here’s an endpoint. It delivers an entire schema of all the queries and mutations, and not only the parameters for those operations but the types for each of those parameters and the types of the data that comes back. So it cuts out a whole bunch of communication overhead and it kind of documents itself. Those are some reasons. You should learn more at home about why GraphQL is better than REST, or why it's better in some cases. But that’s a preview.
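To make the over-fetching and under-fetching point from this exchange concrete, here is a small plain-Python sketch of field selection: the client names the fields it wants, including nested ones, and gets back only those. This is not a GraphQL implementation; the `MOVIE` record and the `select` helper are hypothetical and only mimic what a query along the lines of `{ title director { name } }` would return.

```python
# Hypothetical record as a REST endpoint might return it in full.
MOVIE = {
    "id": 1,
    "title": "Alien",
    "year": 1979,
    "director": {"id": 7, "name": "Ridley Scott", "born": 1937},
}


def select(record, selection):
    """Keep only the requested fields; nested dicts in `selection` recurse."""
    result = {}
    for field, sub in selection.items():
        value = record[field]
        result[field] = select(value, sub) if isinstance(sub, dict) else value
    return result


# The client asks for the title and the director's name, nothing else.
wanted = {"title": True, "director": {"name": True}}
print(select(MOVIE, wanted))
# → {'title': 'Alien', 'director': {'name': 'Ridley Scott'}}
```

One request returns the nested director name (no second round trip) and leaves out `year`, `id`, and `born` (no unused data on the wire).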
A little bit more story time before we jump in. Graphene is the main, I would say de facto, library that you’ll use if you operate in the Django world and want to create a Django back-end that talks GraphQL. It was created by a guy named Syrus who just stepped down from the project. I will talk about that a little bit more at the end. He claims, I can’t validate this, but Graphene is used at Yelp, Reddit, and Mozilla. Based upon this screenshot I took this morning: 19100 stars, 220 issues, red flag number one. 36 pull requests, yellow flag number two.
Quick comparison. On the left I’m going to show how things work in Django REST framework land, and on the right I’m going to talk about how they work in Graphene Django. That doesn’t stand for God, it stands for Graphene Django. On the left, we have a model that’s called, I don’t know, Note, and a note has a title and some text. Like a [inaudible 0:07:42] do that, let’s say. And we have a serializer at the top that we hook in to our model, which I’ve not shown. And then we have an API endpoint; this comes from REST framework’s APIView. And we use it simply to retrieve an object by its key. So here we’re implementing an endpoint to retrieve a note object by its key, by its ID. And there’s a little bit of a helper method to get the object. So remember how in REST land we communicate signals and statuses by the HTTP status codes. So notice this is a 404 here. This by default passes a 200 status code. Does this make sense? Does the left side make sense to everybody in this room?
On the right is how we would do the same thing, retrieving an object by its key, in Graphene Django land. We define the type at the top because GraphQL mandates that all objects have a type. We connect it to a model, kind of like up here. I’m starting to see some similarities. We specify which fields we want to expose. Hey, that’s almost the same. And then down here, we say what we want to return and how to fetch it. Not particularly clear, to me anyway, but this is the return type. So this is saying this query will return a note, a field of type note. Not super clear to me at first glance, but that’s how they do it. A little bit of syntax that I’m not a fan of: note and resolve_note. So we have to prefix resolve underscore for any field that we want to retrieve. So I hope all of you are starting to build an intuition and an idea about some of the decisions about how Graphene Django was structured. I want to use this talk to kind of think about those decisions and whether they’re good enough compared to Django REST framework.
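The slide code is not in the transcript. The naming convention the speaker describes (a field called `note` is fetched by a method called `resolve_note`) can be sketched in plain Python as below. This is a toy dispatcher written for illustration, not how Graphene is implemented and not its API; `NOTES`, `Query`, and `execute` are hypothetical names.

```python
# Stand-in for the Note table (hypothetical data).
NOTES = {1: {"id": 1, "title": "Groceries", "text": "Milk, eggs"}}


class Query:
    """Each exposed field `x` is fetched by a method named `resolve_x`."""

    def resolve_note(self, id):
        # Return the note with this key, or None if it does not exist.
        return NOTES.get(id)


def execute(query_obj, field, **arguments):
    """Look up the resolver for `field` by naming convention and call it."""
    resolver = getattr(query_obj, f"resolve_{field}", None)
    if resolver is None:
        raise AttributeError(f"no resolver defined for field {field!r}")
    return resolver(**arguments)
```

With this sketch, `execute(Query(), "note", id=1)` returns the stored note, while asking for a field that has no matching `resolve_` method raises an error. That coupling between field name and method name is the piece of syntax being criticized.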
Mutations; we all know these are operations that change data. Same song and dance on the left, though we make use of a serializer to validate the data coming in. So remember we’re making a POST operation to save a note object. Check whether the serializer is valid, return a response. This response – I can fiddle around with this as much as I want, and the client just has to deal with it. We are in REST land. No rules apply. I can stuff a bajillion fields into those objects and the client just has to deal with it. This is not the case in GraphQL.
In Graphene Django you do the same thing with the object at the top. At the bottom, notice we make a create operation, a create mutation that takes in these two parameters. And there's a mutate function – so this function is called and it simply makes an object. And in fact there is a small typo here, it should say NoteModel.objects.create. That’s missing, sorry about that. So create the object and then return it. My return structure is hard-defined and hard-coded. I have to return one object of this type that will contain precisely those three fields; I can’t deviate from that.
Let’s talk about authentication. Trivia question especially for those of you studying for job interviews. Can someone tell me the difference between authentication and authorization?
AM: Authentication determines your identity, authorization determines what you can do.
Manish: Yes, authentication is "are you who you say you are", and authorization, to reference your words, is "can this person do this or that". So, authentication on the left side: Django REST framework. Everybody knows the settings that apply here; when you install Django REST framework, you set up some settings. I’ve shortened some words here just to fit everything on the slide. These are global settings applied to every endpoint. Django REST framework offers a way to do that. It also offers a way to be a little more [inaudible 0:13:28] with applying different sorts of checks through the authentication_classes field. Cool! Useful! I’ve used it. Super easy to follow!
Graphene Django. You can have one view that they give you; you pass your schema to this view, and it’s going to inherit from the LoginRequiredMixin. Meaning, it’s going to require the default session token, the default Django session ID. That’s it. That is all they offer. Granular settings: there is some third-party support, but you might need to handle that yourself. Not great, and a huge oversight in my opinion. Authentication should be part of any of these sorts of frameworks and libraries. And the fact that they punted on it does not look good to me. But it’s doable.
Authorization; same deal. REST framework offers a way to apply something globally. It also offers a way to create your own permissions easily: permission, message. And you can implement what determines whether something is allowed or not. And then you can create endpoints that check against a permission. Super flexible! I have written code for clients who were aware of what Django permissions were and were just blown away by its ease and flexibility. Usually I create a permission per role in the Django app: admin, super admin, general user. And then it’s a breeze to create these endpoints and apply a permission anywhere you want to. No such capabilities in Graphene Django, and again some third-party support, but not much. Again, another huge oversight in my opinion.
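The "permission per role" idea can be sketched in plain Python as a decorator that guards an operation. This is only an illustration of the pattern under assumed names (`require_role`, `PermissionDenied`, and a `user` dictionary with a `role` key); it is neither Django REST framework's permission classes nor anything Graphene Django ships.

```python
from functools import wraps


class PermissionDenied(Exception):
    """Raised when the caller's role is not allowed to run an operation."""


def require_role(*allowed_roles):
    """Decorator: run the wrapped function only for users with an allowed role."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            if user.get("role") not in allowed_roles:
                raise PermissionDenied(
                    f"{func.__name__} requires one of {allowed_roles}"
                )
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@require_role("admin", "super_admin")
def delete_note(user, note_id):
    """Hypothetical operation that only admins may perform."""
    return f"note {note_id} deleted by {user['name']}"
```

A general user calling `delete_note` gets a `PermissionDenied` error before any work is done. In a library without built-in authorization, a wrapper of this kind around each resolver or mutation is one way the check could be handled by hand.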
Performance. I wish I had a better slide here. I wish I could do an A/B test between Django REST framework and Graphene with a huge data set and a nice pretty chart to give you some proof, but I don't. But it's not a good [inaudible 0:16:19] digital big red flag. There's an open issue on GitHub. Some of this text might be hard to read, so I'll just say it out loud. I'll paraphrase it.
One person has 10,000 objects, and they are saying that it takes around 10 seconds to load. So 10 seconds to return 10,000 objects; that is laughable! Another person, a guy named Dan, is saying it takes 14 seconds to return, from a query, 35,000 objects. These are laughably small data sets, and we're talking about tens of seconds? Excuse me! That is insane!
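For a sense of scale, the two reports quoted above can be turned into a per-object cost. This is only arithmetic on the numbers as paraphrased in the talk, not a benchmark; the helper name `per_object_ms` is made up for the sketch.

```python
def per_object_ms(total_seconds, object_count):
    """Average time spent per returned object, in milliseconds."""
    return total_seconds * 1000 / object_count


# (seconds, objects) as paraphrased from the GitHub issue in the talk.
reports = {
    "10,000 objects in about 10 s": (10, 10_000),
    "35,000 objects in 14 s": (14, 35_000),
}

for label, (seconds, objects) in reports.items():
    print(f"{label}: {per_object_ms(seconds, objects):.2f} ms per object")
# → 10,000 objects in about 10 s: 1.00 ms per object
# → 35,000 objects in 14 s: 0.40 ms per object
```

Somewhere between 0.4 and 1 millisecond per object adds up to many seconds once a query returns tens of thousands of rows.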
Onboarding is in a good state. If you want to spin up a Graphene Django project, for your own personal project, for an in-house project, for an internal work project, you can get going, it's there. The docs, already, they're pretty and they have the background, which is good enough in my case. Advanced features: you’re going to be swimming in GitHub issues and on Stack Overflow, probably through all these issues. Examples: the library is light. Not a lot of other examples out there.
Support. There is decent support on Stack Overflow and good coverage on GitHub issues. If you ask a question you'll get an answer back, for sure. And there's new leadership, which brings me to the pitch. So as of a couple of weeks ago I became a contributor to Graphene. I can say that now. And you can't blame me for any of these problems yet because I’m a new contributor. And Syrus, the creator behind Graphene, just stepped down. This was about a month ago. He's working on a new start-up. He built all of this mostly himself, to pretty good success. And there are a lot of questions swirling around the internet and in this room about the future and health of Graphene. And I'm proud to say we had a three-hour meeting at Yelp, and there are a good two dozen volunteers who are down to carry the flag forward, and I'm hoping to be one of them on the Django side. I would encourage and invite all of you to get involved. I am pretty sure that this library and framework is not in a healthy state right now, it's probably obvious. But GraphQL, I strongly believe, is the future, and this library, this framework, is the de facto entry point if we’re using Django. So it's only going to get more popular and it's only going to get used more. There is a GitHub project and there's a Slack channel. There's zero pay involved and zero equity, and this is the best offer you all have ever seen. Any questions?
AM: I have a question. Is it really slow because of the addition of the framework around it or is it still slow [inaudible 0:20:08]?
Manish: A [Inaudible 0:20:11 comment] to this issue. It’s slow because of Graphene, specifically because there's a lot of [inaudible 0:20:18] form types happening. There's a lot of meta-class programming, which I don’t have a lot of experience with, but I think that sort of code is heavy.
AM: [Inaudible 0:20:35]
Manish: I would first check whether that app is used by a lot of people or not and whether it operates on a heavy data set. If the answer to both of those questions [inaudible 0:20:58], so like this is a side project of yours or an internal app that's not client facing, for example, what I would do is go look at your Swagger on your REST side, pick out the queries first, and start mapping out the types and implementing those endpoints on the GraphQL side. And I would stop once you've gotten all your read operations. And I would check the health and the state of the app to see if it's okay, and then I would move on to mutations.
AM: Is Graphene currently able to do any optimizations like pre-fetching with the Django ORM, or does it actually do the [inaudible 0:21:48]?
Manish: I'm not 100% sure, but I would lean towards it not doing any sort of caching or using the resolvers in a smart way. Any other questions?
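For context on what the pre-fetching question is about: when each object's related data is resolved one at a time, a list of N objects can cost N extra queries. The sketch below shows that pattern against an in-memory SQLite database using only the standard library. It says nothing about what Graphene actually does (the speaker was not sure either); the tables, data, and helper names are hypothetical. In Django itself, `select_related` and `prefetch_related` are the ORM tools for avoiding this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE note (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO note VALUES (1, 'a', 1), (2, 'b', 2), (3, 'c', 1);
""")

executed = []  # every SQL string we send, so round trips can be counted


def run(sql, params=()):
    """Execute one query, record it, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def notes_naive():
    """One query for the notes, then one more per note for its author."""
    notes = run("SELECT title, author_id FROM note ORDER BY id")
    return [
        (title, run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0])
        for title, author_id in notes
    ]


def notes_joined():
    """A single query that fetches notes and authors together."""
    return run(
        "SELECT note.title, author.name FROM note "
        "JOIN author ON author.id = note.author_id ORDER BY note.id"
    )
```

Both functions return the same rows, but for three notes the naive version issues four queries and the joined version issues one. The gap grows linearly with the number of objects returned.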
AM: [Inaudible 0:22:13]
Manish: No. I personally use a [inaudible 0:22:25] in my React apps and it works beautifully. And I just want to emphasize if you're building projects that don't grow at scale, using Graphene is a breeze and it's good. It can get better but it's good. I might be painting a pretty dark picture. I'm painting a dark picture for scalability purposes. If you want something rock-solid and that's going to hit a lot of people, I would still go with REST framework.
AM: One of the reasons, like you mentioned, REST framework fits in nicely with Django is because REST is based on HTTP, and Django is based around that, so a lot of concepts are similar and can be transferred. And probably with GraphQL, that’s just not going to be the case. So do you imagine that to get authentication and authorization, permissions, the [inaudible 0:23:18 low-level] or whatever, you have to reinvent the wheel a lot? Or do you think there is an opportunity to use some of the building blocks of Django?
Manish: I don't think you'll be able to reuse some of the third-party libraries or even some of the things inside Django REST framework, aside from the serializers, which I think work really well. I personally, in a lot of my projects, use JWTs. And so there's a great third-party library that works with Graphene and it handles all that for me. That being said, I'm not going to go around recommending it because it's a pretty small project, and it's not battle-tested. I have no qualms about recommending REST framework JWT to people because that's used everywhere. So that's my take on it. And there's [inaudible 0:24:12 PRs] out there and a lot of discussion about folding in authentication. So though the picture might be dark right now, there is a path to move forward.
AM: Just a comment on that. I have used Django mixins to add authentication stuff to the GraphQL interface. So you can kind of lock people out there, [inaudible 0:24:35] their permissions based on their sign-in.
Manish: Any other questions? Cool! Thank you!