At the last meeting of the San Francisco Django Meetup Group, Wes Kendall gave a talk on how to make a bulletproof Django application by testing it with pytest. He leads us through the fundamentals of testing your Django application, going from basic unit tests to more complex situations of mocking out resources and testing full page rendering. He touches on some more advanced topics as well, such as adding continuous integration (CI), parallelizing your test suite for faster run times, and adding coverage reports to your tests. Check out the video below to watch the entire talk!
My name is Wes Kendall, and I'm originally from Tennessee. Today I'm going to be talking about writing tests for Django applications, specifically using pytest, which is a tool that I've grown to love over the standard Django test runner. Along with that, I'm going to be talking about what it means to write what I'm calling a bulletproof Django application with your tests. I have an example project that I put online. There are going to be a lot of code examples, so for people who want to follow along with the slides and maybe have trouble seeing the screen, it's going to be easiest to just go to this GitHub repository (https://github.com/wesleykendall/pytest-django-tutorial). The README has a link to the slides, and I'm going to be clicking on some links in the slides that point to the code examples.
Before diving into this example app, I wanted to go over at a high level what this talk is about. First of all, I'm going to be talking about writing tests for Django applications, about the basic Django test runner, and about the differences between using that and using this tool and framework called pytest. Have people here used pytest for Django testing? I'm going to be talking about my own personal ways of testing Django applications, and then I'm going to walk through this application, go over how I thought about testing it, and go over some examples. It's going to be fairly high level for the most part. The intention is that you can go to this codebase and run it locally, or even deploy it on Heroku if you want, play around with it, and do a deep dive yourself on all the different pytest plugins, modules, and ways of testing we are going to be talking about. I'm keeping this talk to 35 minutes, so I might have to rush through some of the last examples if this goes on too long.
Just a quick glossary of terms. I'm assuming some Django knowledge, obviously, but when it comes to testing, a test is a piece of code that is written to assert that other code is working, so you're writing code to run your code and verify that it works. The test runner is what gathers all these tests, runs them, and reports successes and failures. A test suite is a full collection of tests, and there can be many types of test suites, whether they are integration tests or unit tests. I'll talk about what those mean in a little bit. Continuous integration, often abbreviated as CI, is a tool that runs your tests automatically whenever your code changes, such as when you commit or push. A fixture is a term that's used often in Django, usually to refer to loading fixtures of static data into the database. When we talk about fixtures in terms of testing it's a similar concept: a fixture sets the stage for a test, and this might be some state in your database such as a piece of dummy data, or mocking out a resource, or something like that. As I mentioned earlier, there are many different types of tests, and it seems like everyone I've ever talked to has different definitions for them. I'm going to be referencing three types of tests today. The first is unit tests. These are tests that test a single unit of executing code, so it might be a single function, a single branch within a function, or a single statement. That's typically what I mean when I say unit test, though other people might have different definitions. Integration tests are tests that exercise multiple interacting components or modules, so they run a lot of code at once down one specific path. For example, a test that renders the whole marketing side of your application just to verify that it renders is an integration test.
A smoke test is a very quick test suite that you can run when you are about to deploy, for example. You deploy, you run the smoke tests, and they exercise critical parts of your application, so they might actually log in and walk through your application as a quick check that things seem good, so you can go ahead with your QA or whatever else you need to do before you deploy to your main production site.
To start, what is the native Django testing framework, and how does it differ from pytest? Django's native testing framework is built off of Python's unittest module, which means that whenever you write a Django test case you are inheriting from Django's TestCase class, and that inherits from unittest's TestCase. You are all probably familiar with at least extending a TestCase class with test methods, and then you call self.assertEqual, self.assertIsNone, and other self.assert calls, different assertions to verify what's going on in the code. These are some very quick examples of what a test case would look like. Let's say that you're testing your add function here, which adds a and b together. You might have one test method that verifies that when you add two positive numbers together you get another positive number, and just to test the sanity of your add function you might throw in a test method that adds two negative numbers together to be sure it's calculating as you'd expect. Then a more advanced test case, again using Django's TestCase class: you might make a custom user model that is set to active at first, and then you might want to test that if I deactivate this user, the active flag is false. It's a very quick example where the setUp method is called before each one of these test methods to set up the data for them. That's a quick overview of how these test classes work at a high level.
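The class-based style described above can be sketched with the standard library alone. This is a minimal illustration, not the code from the slides: it uses `unittest.TestCase` directly (the class Django's `TestCase` ultimately inherits from), and the `add` function and the plain-Python `User` class are assumed stand-ins for the talk's function and custom user model.

```python
import unittest


def add(a, b):
    """Function under test (illustrative)."""
    return a + b


class User:
    """Plain-Python stand-in for a custom Django user model (assumption)."""

    def __init__(self, username):
        self.username = username
        self.is_active = True

    def deactivate(self):
        self.is_active = False


class AddTests(unittest.TestCase):
    def test_add_positive(self):
        self.assertEqual(add(2, 2), 4)

    def test_add_negative(self):
        self.assertEqual(add(-2, -3), -5)


class UserTests(unittest.TestCase):
    def setUp(self):
        # Runs before every test method in this class.
        self.user = User("wes")

    def test_deactivate(self):
        self.user.deactivate()
        self.assertFalse(self.user.is_active)


def run_suite():
    """Load both classes and run them, returning the TestResult."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(AddTests),
        loader.loadTestsFromTestCase(UserTests),
    ])
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` runs all three test methods. In a real Django project the classes would inherit from `django.test.TestCase` instead, and the test runner would discover them automatically.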
There's actually a lot going on underneath the hood in Django's TestCase class on top of what the unittest framework does. For example, every time you run this test case that performs basic addition, since you're using Django's default TestCase and not its SimpleTestCase, it's setting up a transaction for the database, so that if you write any data to the database it rolls it all back and you have a clean slate for each test method. So if you were to write a lot of test cases that never even accessed the database, you would be wasting a lot of cycles and resources and making your test suite run a lot longer. It does a lot of other things like that. If you write a test case class like this with a setUp method, you are basically assuming that every single method inside the class should be using this test data, because any method that doesn't make use of it is wasting cycles creating data it doesn't need, and you can get into a realm where you have a really sluggish test suite. This brings me to: what exactly is wrong with TestCase? There's nothing really wrong with it, it gets the job done. There's just a lot of boilerplate going on: you're importing TestCase, you're making a class, you're writing methods, and you have to remember all these assert methods. It can be pretty cumbersome to always use the right assertion method. Along with that, you have to use the correct test case class depending on what you are testing; SimpleTestCase, for instance, doesn't set up the database and do all that overhead. And like I mentioned earlier, especially with the setUp method, you always have to keep tests that really use the exact same data bundled together in the same test class if you want to use setUp the right way.
This is an example of our Django test converted to pytest. Again, pytest is just a library you install. It provides you a test runner and it also provides tools that you can use inside your tests, and one of its benefits is making your tests ideally smaller, more readable, more functional, and able to share a lot of setup. The Django example here is the one we saw earlier. In the pytest example we have our addition function, and instead of writing two different test methods we use parametrize, which comes built in to pytest. You mark your test as parametrized over these parameters, so we define a, b, and the expected value, and we're saying: use 2 and 2 for a and b, use 4 for the expected value, and then just run our test function and verify that the result is the expected value. This is really nice if you're just running different inputs and outputs through a test, because you can loop over all of them and each case runs individually. Along with that, you'll notice we are using functions here. Pytest promotes the idea of using simple test functions over classes. You can still use all the old-school TestCase classes and they still work in pytest, but functions are encouraged. The really nice thing about pytest is that you use just a simple assert statement, so there's no assertEqual or assertGreater; you assert whatever you're testing. We're doing an equality check here. If you're comparing dictionaries in this assert statement, for example, pytest is going to handle everything under the hood and show you a nice diff of everything that differs between them. All you have to do is write assert. It's really nice.
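As a rough sketch of the parametrized style just described (the `add` function and the specific input values are illustrative, not copied from the slides):

```python
import pytest


def add(a, b):
    """Function under test (illustrative)."""
    return a + b


# Each tuple becomes its own test case, reported separately by pytest.
@pytest.mark.parametrize("a, b, expected", [
    (2, 2, 4),     # two positive numbers
    (-2, -3, -5),  # two negative numbers
])
def test_add(a, b, expected):
    # A bare assert is enough; pytest reports the compared values on failure.
    assert add(a, b) == expected
```

Adding another input and output pair is a one-line change to the list, rather than a new test method.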
For the more advanced test case, we have our old Django test case over here that creates a test user and verifies some things, and over here in pytest we're doing the exact same thing. We're creating our custom user object, deactivating the user, and asserting they are not active anymore. You'll notice some differences here versus the first example. We're using pytest.mark.django_db. This marker is provided by the pytest-django plugin, which I'll get into later, and what it says is that this test needs to use the database, so please set up a transaction for me. You're being explicit in saying I need to use the database for this test, and if you try to access the database without the marker, it's going to complain. You'll also notice the syntax here, this user parameter being passed into the test. We saw an example before of parameters from parametrize being passed in, where we define the parameters up here. But if you ever see a parameter being passed into a test and it doesn't come from parametrize, it means that it's a pytest fixture, and as you can see we define this user fixture at the very top. Pytest fixtures are really nice because they allow you to set up state and then pass that state along to your test function. Since this test runs inside of a transaction, we don't have to clean up the database row that we created, but pytest fixtures allow you to yield something, and after the yield you can tear down state if you need to. You can also set fixtures up for different scopes, so you can have a fixture set up once at the very beginning of a test module. You can do all these really nice things with them.
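A minimal sketch of a yield-style fixture, under some loud assumptions: `User` is a plain-Python stand-in rather than a Django model, and the `CREATED` list stands in for database rows so the example runs without Django. With pytest-django, the test would additionally carry `@pytest.mark.django_db` and create a real model instance.

```python
import pytest


class User:
    """Plain-Python stand-in for a custom user model (assumption)."""

    def __init__(self, username):
        self.username = username
        self.is_active = True

    def deactivate(self):
        self.is_active = False


CREATED = []  # stands in for rows in a database


@pytest.fixture
def user():
    new_user = User("wes")
    CREATED.append(new_user)   # setup: create the state the test needs
    yield new_user             # the test body runs at this point
    CREATED.remove(new_user)   # teardown: runs after the test finishes


def test_deactivate(user):
    # pytest sees the `user` argument and injects the fixture's yielded value.
    user.deactivate()
    assert user.is_active is False
```

Everything before the `yield` is setup and everything after it is teardown, so the two halves stay next to each other in one function.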
Some of the core idioms and features of pytest: we're using simple assert statements, and pytest does all the proper diffing for you; we're writing test functions instead of trying to arbitrarily group things into test classes; and we're using reusable fixtures on top of that, so your tests can be very explicit: I'm using this state and I need this type of setup in order to do my test. Some of the other features pytest has are markers, if you need to categorize your tests. Say this is an integration test, or this is a slow-running test. You can mark all your tests explicitly so that you can then run them differently for different reasons. One of the most powerful things about pytest is the plugin system. When you install plugins like the pytest-django plugin I mentioned, they make fixtures available for you to use. What we're going to go over today are a lot of different pytest plugins and how they can make testing more fun and enjoyable, make your tests run faster, and make them clearer overall. You run tests by typing pytest, as you'll see here. You can set up Django so that typing manage.py test uses pytest as the test runner, but there's no need to do that. This is what the output looks like by default. As I mentioned before, the reason pytest works well with Django in this example is that I'm using the pytest-django plugin. By default pytest is set up for pure Python tests, but the pytest-django plugin provides pretty much the entire functionality that Django's TestCase does. Django's TestCase gives you a test client, so you can say self.client and do post, get, and all that stuff. Django's TestCase does database setup and teardown and all these other things. The pytest-django plugin effectively recreates all of that functionality by giving you fixtures to use.
I've given a very high-level overview of what pytest is and what the differences are between the normal Django test runner and pytest. What I want to do now is go over a real-world example, a toy project I made, and talk about how I think about testing and how I go about testing a Django application. "Bulletproof" is a loaded term, but what I mean by a bulletproof application is that, ideally, you have really high confidence that if you are refactoring this codebase, or a random person who has never seen the codebase is refactoring it, you're not going to introduce a bunch of random failures. You can refactor things, run the test suite, and know what's breaking. Along with that, you have really high confidence that if you do a Django upgrade or a miscellaneous package upgrade you're not going to introduce random bugs into the system. Ideally your test suite covers a lot of critical parts of the code, and you have a lot of confidence that minor things such as upgrading libraries aren't going to break everything. Another thing that I feel is important for a really well tested application is having critical end-to-end paths, like the primary user experience, totally exercised by automated tests, like logging in. Having all that covered means that if you make changes you're not going to accidentally break core features of the site. Outside of just writing tests, what I call a bulletproof application is one that is structured so that the complexity is separated and you can easily test different components. For those who have been writing tests for a while, I think a lot of people have noticed, myself included, that once you write tests for a while you eventually start writing code so that it's easier to test.
You’ll hopefully see a couple examples of that today.
Again, there is a complete Django project specifically for this talk, and it's at this URL if you want to go to it (https://pytest-django-tutorial.herokuapp.com). There are instructions for setting it up locally or even deploying it remotely. It's a meme creator app. It has an extremely basic landing page. I've already registered for an account; it doesn't do anything fancy with email validation, though in practice it should validate emails when people register. With the meme creator, you click create meme and it calls the Imgflip API. Imgflip is just this service that has an API for grabbing the top memes. It grabs the top 15 memes here, so you can choose a meme by clicking on it and then write the upper and lower text. When you click create meme, it hits the API again to create the meme, and you'll notice that it saves it in our database so you can go back and view it later. It's just a fun little app, and you can log out and do all that stuff. The project is structured into three primary Django apps. If we go to our apps folder here, you'll see there is a marketing app, and literally all it contains is that really basic homepage that you are redirected to if you are not logged in. There's the user app, which has basic user login and registration and pretty much just inherits a lot of Django's built-in user authentication views. The core app is the meme app, which has one model. It saves the meme for the user and the URL of the image that was created by the Imgflip API, and it also stores the creation time so we can order memes on the main page. You'll also notice that the meme app has a wrapper for the Imgflip API so that we're not hard-coding requests everywhere. It parses results and just makes things easier overall. That's really the core structure of this application.
Given this relatively simple application, let's back up and determine how we should go about testing it. The way I personally go about it is that first I determine some high-value integration tests that can be written. What I mean by high value is writing the least amount of test code that covers the most application code: the shortest, simplest test that exercises the most functionality and gives you the most bang for your buck. After that, I think about areas of the code that do a lot of branching and are more complex. For example, if you have a booking page in your app that calculates the cost of an order and handles all these different cases, you will probably want to write a lot of unit tests that cover the different edge cases. Those areas with a lot of branching logic are the ones I try to cover with a lot of unit tests, iterating over all the possibilities. Our API wrapper, for example, parses the API results and handles errors, so that's something I would unit test here starting out. After that I think about areas of the application that are completely out of my control and might change. One good example of this is the Imgflip API itself. It's not versioned at all, so if for whatever reason they change the schema one day, it would be nice to have some test that at least lets us know: this API is completely different, you really need to update your app. Finally, after getting the base things out of the way, I test the more nuanced stuff, such as going to the meme page for the first time and making sure that if you have no memes it gives you a useful help message that says "Create a meme here". Just basic stuff like that.
I figured I would spend the rest of the time going over the tests that are written for this app. I'm going to go over the basic ones first, and they get more complicated quickly. The first is a high-value integration test, an example of basic page rendering: let's just make sure that the marketing site renders. The example for that test is here, and this is what it looks like with pytest. We have the marketing site test, and you'll notice that it takes in client. As I mentioned earlier, if you ever see a pytest test that takes an argument and you don't see that argument defined anywhere, it means that it's a fixture. The fixture may be defined somewhere in your own project, in a thing called a conftest.py file, which allows you to share fixtures among tests, or it might be defined by a third-party plugin. In this case I know client is defined by the pytest-django plugin, and it gives you what Django's TestCase gives you when you reference self.client. We take the client fixture because we need to make a request to our website to see that the marketing page renders. We get the URL of the marketing page, we call client.get on that URL, and then we verify that it returns a successful 200 response. As an additional check, we verify that our tagline is in the content of the marketing site. With four lines we have a really high-value integration test. We're exercising so many components of the codebase just by doing this: the URL patterns that we defined, the templates being rendered, the entire request and response cycle. So this is a really short and nice high-value integration test.
Another high-value integration test is access control. This app uses something called django-stronghold to make sure that every one of your views is protected by default. You have to explicitly mark views as public if they shouldn't be restricted to authenticated users. To verify these assumptions we can write some integration tests that check our public views. For example, these are the user registration and user login views. We want to make sure that they are publicly accessible; we don't want them behind an authentication wall. We ask for the client as a fixture and take the view name as the parameter coming in from pytest's parametrize. We reverse that view name to get the URL, request it, and assert again that we can access the page and it returns a 200 success code. The private version lives in a different file that is linked in the slides. There we assert that the status is 302 and the response URL is the login page, so we know we are being redirected to the login page. That's how we know that the views that should be private, like the meme creation views, are protected.
Another quick high-value integration test is homepage redirection. If you go to the root URL of the meme creator, it should redirect you to the marketing page if you are not logged in, and to your memes page if you are logged in. This test covers redirecting to memes when you're logged in. We have a client again, plus this authenticated user, which I'll go over because it's a special thing in this app. We grab the homepage, and since we are logged in we should be redirected to the memes page. Again, since this parameter is not defined anywhere in the test file, it's a fixture, and it's a special fixture that we've defined inside our own meme creator project. I mentioned conftest.py files earlier. You can make a conftest.py file in any folder, we have ours in the root of our project, and every test under the directory where it lives has access to the fixtures defined in that file. This is an old version, but we define the authenticated user fixture here in this conftest.py file, and that means it's accessible by all of the tests under this folder. It needs to take the Django client fixture because what we're going to do is create a user and log them in. This thing that you are seeing here is a package called django-dynamic-fixture. It fills in randomized data for you except for the fields that you specify, so it makes it really easy to make quick model instances when you only care about a few fields. We make a user, we set the password, and then we log in the test client, so now we have an authenticated user. The nice thing is that we can now use this fixture for any other test that needs an authenticated user in order to make requests to parts of the site that are protected behind an authentication wall.
This test is the most complicated, and I'm going to spend the most time talking about it. This is our last high-value integration test. It tests the full meme creation flow. What it does is load the choose-meme page as an authenticated user and click on the first meme that it finds. When I say click, I mean it parses the HTML, finds the URL, and loads that URL, which takes you to the create-meme page. Then it posts the two text fields to that page and verifies that the meme is created. There is some additional setup for this because, as you remember, the meme creation flow hits the Imgflip API underneath the hood; it's calling out to a separate API. We really don't want to do that in our automated tests. We don't want every run to hit this other API, especially if that API is creating resources under the hood, which Imgflip does when it makes memes. You don't want to waste someone else's resources, and you want your test suite to be predictable, so that every time you run it, it ideally has the same result, which matters for things like continuous integration. The way we get around this is that we take the wrapper I made for the Imgflip API and use a thing called a mock. Mock is another library. In Python 2 you had to install it separately; in Python 3 it's part of the unittest module, so you can import mock from unittest. Mock allows you to patch out pieces of code, whether they are functions or methods on an object, and what that allows you to do is override the behavior of those functions or methods, or even variables.
You can patch it out and say: every single time get_memes is called here, I'm going to return this list of results, because in reality, if I were calling get_memes and hitting the API, I would expect it to return a list that looks like this. When you mock out something like this, by default anything that calls it can pass whatever parameters it wants, and the mock tracks every call and every parameter it was called with. That's kind of dangerous, because if the function signature changes one day and your code is still calling the mock with the previous parameters, the test will just pass. That's why we use autospec. Autospec inspects the arguments of get_memes, and every single time the mock is called it ensures that it is called with a valid signature; otherwise it fails. It's a nice additional safety measure on top of your mocks.
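A minimal, self-contained sketch of patching with autospec using only the standard library. The `ImgflipClient` class, its `get_memes(limit)` signature, and `first_meme_name` are hypothetical stand-ins for the talk's API wrapper and the code that calls it; the real wrapper may look quite different.

```python
from unittest import mock


class ImgflipClient:
    """Hypothetical stand-in for the API wrapper discussed in the talk."""

    def get_memes(self, limit):
        # The real method would make an HTTP request; tests should not.
        raise RuntimeError("would hit the network")


def first_meme_name(client):
    """Code under test: depends on the wrapper, not on HTTP details."""
    return client.get_memes(limit=15)[0]["name"]


def run_demo():
    fake_memes = [{"id": "1", "name": "Example Meme"}]
    with mock.patch.object(
        ImgflipClient, "get_memes", autospec=True, return_value=fake_memes
    ) as patched:
        client = ImgflipClient()
        name = first_meme_name(client)
        # With autospec on a method, `self` is recorded as the first argument.
        patched.assert_called_once_with(client, limit=15)
        try:
            # Too many arguments for get_memes(self, limit).
            client.get_memes(1, 2, 3)
        except TypeError:
            signature_enforced = True
        else:
            signature_enforced = False
    return name, signature_enforced
```

Without `autospec=True`, the call with three arguments would succeed silently, which is exactly the failure mode described above.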
We patch out these two endpoints, the one that gets the memes and the one that creates the meme, and make them return some dummy data. We grab the choose-meme URL and then we use a library called Beautiful Soup, which I'm also using here. We parse the result of the choose-meme page and use this selector to find the very first meme link. Then we load that URL from the choose-meme page. Then we post to it, so we post our upper text and our lower text, and we assert that it takes us back to the memes page where we can see all of the glorious memes we've created. At the very end we also assert that we made a meme object in the database and that it has the proper values. Additionally, you could assert that your mocks were called with the proper parameters. I typically do that too, but I left it out of this test. Again, this test goes through essentially your entire flow. It does have some stuff mocked out, like the API, but it covers a lot of different code paths, and it's another example of a high-value integration test.
Like I mentioned, we patched out the API wrapper fully that time, so if for whatever reason your API wrapper was wrong, you wouldn't catch it with that test. This is the part where we start writing unit tests for more complicated stuff like the API wrapper. If we look at just one of the functions in our API wrapper, get_memes, which is the function we patched out in the previous test, we can see that it calls requests.get on the actual API endpoint. If the status code looks good, it loads the JSON, and if the JSON has a truthy success key, which Imgflip gives you, then we return the content we care about. Otherwise we raise an Imgflip error, which is just a custom error that we made. So there are actually a lot of different branches here. The API endpoint might return a 500, which would raise an error, or it could return a 200 and Imgflip could just say an error happened. Those are different cases where an error would be raised, and we want to make sure that our API wrapper does the right thing for each response. This linked example is just one test of that get_memes function, with a successful response. It uses another library called responses, which was developed by Sentry. Responses allows you to register a fake response for an endpoint, and even to say what status it should return and things like that. Here we've added this fake response, we call our wrapper, and then we verify that it parses the response properly. We also go over different failure scenarios: we verify that an exception is raised whenever a bad status is returned, or when a 200 comes back with bad data. That is an example of unit testing our API wrapper.
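The branching described above can be sketched as follows. Note the swapped-in technique: the talk's tests fake HTTP traffic with the responses library, while this sketch passes a hand-built response object to a parsing function so it runs with no network and no extra packages beyond pytest. The names `ImgflipError`, `parse_memes`, and `fake_response`, and the exact payload shape, are assumptions for illustration.

```python
from types import SimpleNamespace

import pytest


class ImgflipError(Exception):
    """Custom error for any failed API call (name is illustrative)."""


def parse_memes(response):
    """Return the list of memes, or raise ImgflipError on any failure."""
    if response.status_code != 200:
        raise ImgflipError(f"unexpected status {response.status_code}")
    body = response.json()
    if not body.get("success"):
        raise ImgflipError(body.get("error_message", "unknown error"))
    return body["data"]["memes"]


def fake_response(status_code, body):
    """Build an object with the two attributes parse_memes relies on."""
    return SimpleNamespace(status_code=status_code, json=lambda: body)


def test_success():
    memes = [{"id": "1", "name": "Example"}]
    response = fake_response(200, {"success": True, "data": {"memes": memes}})
    assert parse_memes(response) == memes


def test_bad_status():
    # Branch 1: the endpoint itself fails.
    with pytest.raises(ImgflipError):
        parse_memes(fake_response(500, {}))


def test_unsuccessful_body():
    # Branch 2: HTTP 200, but the payload reports a failure.
    body = {"success": False, "error_message": "something went wrong"}
    with pytest.raises(ImgflipError):
        parse_memes(fake_response(200, body))
```

Each branch of the wrapper gets its own small test, which is the "iterate over all the possibilities" approach to unit testing code with a lot of branching.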
We're still patching stuff out, and that's to make sure that our unit test suite runs the exact same way each time. Oftentimes you get to the point where you really want to verify that the code that actually calls the API is doing the right thing, and this comes down to the thing I was talking about earlier called a smoke test, where we do some quick checks of the assumptions we're making. So I have an example of some smoke tests here for the question: what if the Imgflip API suddenly changes? In this test case we hit the real endpoint itself and validate the response against this schema, using another Python library for validating a dictionary structure against a schema. You'll notice that we marked this one as a special type of test, which means that we can now run just our smoke tests. I get a connection error here because I haven't provisioned my Imgflip username and password; you need a real username and password to run these tests. We have set this up so that the test suite does not run smoke tests by default.
I'm running out of time here, and I have this last example. The big question is: now that I have this test suite and all these things tested, what should I do with it, and how should I integrate it into my development workflow? A tool that I love, and that I'm sure many of you are familiar with, is Coverage. There's a plugin called pytest-cov that gives you this --cov option, and it will tell you every single line of code that your test suite covers, so if your tests do not exercise an if branch or an elif branch, the coverage report is going to tell you. As you can see here, we've written enough tests that every single branch of our code is covered, but it's a nice way to see what areas of your application are missed by your test suite. Some people like to even enforce a minimum amount of coverage. That seems to be a touchy topic for a lot of people, and I think a lot of it depends on how large your test suite is and how it's structured. Another thing is setting up continuous integration for your test suite. This repository has an example of using CircleCI for that, and what it means is that every time we push, CircleCI runs our test suite for us and gives us a pass or a fail. You can even set up CircleCI and things like it to do nightly runs, so you can have it run your smoke tests every single night, or do special things like that. Now that you have a test suite, you want to think about optimizing it. Pytest has a great plugin called pytest-xdist, and I have all of this linked. The nice thing about plugins is that they can add options to the command line, so pytest-xdist adds this -n option that lets me run my test suite in parallel.
I just ran it with four processes right there. It actually ran slower, because I have so few tests that the overhead of spinning up processes outweighs the gain. But if you run a really large test suite, you can get some pretty big wins out of just parallelizing it, and the pytest plugin does it for you out of the box. It's really great.
I mentioned test data earlier. I use django-dynamic-fixture in this example, but there is another great test data framework called factory_boy. It allows you to make much more realistic test data, so let's say you have an address field or a zip code: you can fake all of that data so it looks realistic, instead of using the randomly generated data that django-dynamic-fixture produces. Like I said, go play with the app, feel free to download it, run the tests locally or deploy it. Again, this is meant to be a high-level overview of many different tools and techniques, specifically for Django testing with pytest.