In 2012, Robert C. Martin released his article "Clean Architecture," which later became a controversial book of the same name. Engineers are still divided between loving and hating its concepts.
At the last meeting of the San Francisco Django Meetup Group, Jair Vercosa gave a talk on clean architecture with Python, in which he discusses how to implement clean architecture using one of Python's most popular frameworks. Some concepts are simplified to make them more digestible and to make the development process more enjoyable. Enjoy the video of the full talk below!
Video Transcript
Alright, thank you! Good evening, everyone. As you can see here, my name is Jair. I was made in Brazil [inaudible 0:00:16]. I am working as an Engineering Director at Carta. And just to let you know what we do, we help private and public companies, as well as investors and other types of stakeholders, manage their equity online, and manage cap tables and all the things around that, such as investments and financing, valuations, etc. What we're going to talk about today is clean architecture. How many people here actually know or have heard about clean architecture before? Can you raise your hand? Cool! And how many people have heard about the SOLID principles or something like that? Cool, great! Cool!
So that might be interesting today. One thing that I'd like to add here is that when we talk about engineering, we are always excited about infrastructure, excited about the new technology or the new tool that we can use. And sometimes we actually forget about the application itself and what we are trying to build and deliver to our customers. So clean architecture helps you to focus more on your application, your system, than on all those things that are around your system.
So, basic concepts: clean architecture is basically a group of rules that allow you to develop software that is cleaner and easier to maintain and to test, and also to switch components as you go and as you grow. So [inaudible 0:02:01], so very important as well as readability. When we take a look at the rules that we have for clean architecture, some of them might be interesting. Sorry for that picture. I lost my presentation and had to use things like that, so the picture is a little bit weird. But one of the rules that we can talk about here is the dependency rule, which means that high-level components should always avoid depending on low-level components. And by taking a look at this picture here, what we mean by low-level and high-level components is that the components in the middle are high level. Everything on the edge is basically low level.
What that means for you is that usually the components in the middle tend to change less than the components that are on the edge. So whenever you need to switch a database, whenever you need to switch a framework, or whenever you need to change a controller or an endpoint, that's the kind of change that's more likely to happen than actually changing the data model for users, for example.
So the other thing that we have here is the screaming architecture. Again, everything that is in the middle is considered a high-level component, and those things should scream the domain of your application. So as you see here, where we are talking about food for example, your application should reflect the domain, the business of your company. And that's why this is kind of decoupled from everything else.
The other thing is that databases, the web, and frameworks are simply details. It doesn't matter where you are persisting the data; what matters is how you're persisting it, and how you're guaranteeing the integrity and the validation of this data. And then cohesion and coupling. I think most of us have worked with OOP before and have heard about this. But just to be clear about what cohesion means. This is actually coupling. I don't know if you have watched this movie before, but this is very much like coupling. It's really hard to separate them. And the thing about cohesion: so I'm Brazilian, and this for me is football, because you play with the foot. And this thing actually is handball, which makes much more sense. It's more [inaudible 0:04:38] that; so just messing a little with the Americans.
Coming back to the presentation, that's actually how you differentiate them. And what you try to do in your applications is to have high cohesion, which means that the things you're grouping together make sense together and change for the same reason, and low coupling, which means that you can point to different directions as you go.
Great! So, the SOLID principles. That was why I asked you about this. Those are the principles that clean architecture is based on. So, the single responsibility principle. I think everybody has heard about that before. And the way that you heard about it is that a component should do only one thing, or a function should do only one thing, or a microservice should do one thing and do it well. I would actually say that the single responsibility principle means that things that are grouped together should change for the same reason, which means that you may have a part of the system that the CFO of your company cares about, so those things should be together, because they're going to change for the same reason.
When we take a look at the open/closed principle, your components, your systems, your applications, everything should be open for extension and less open for change. What that means is that when you are building your application, you try to keep those alternatives for extension, but you don't want to change the underlying behavior. What you want to do is actually help people to add behavior to it but not change your behavior. The next one has a name that I don't know if I can pronounce, Liskov I think: it's the Liskov substitution principle. That is interesting.
So whenever you have an abstract class or even a base class in your system, and then you have child classes inheriting from that class, you should be able to substitute one for the other anywhere in your system. So substitute the parent with the child. Let's suppose that you are running an application and you need to pass in an object that actually inherits from the main class: that should be fine, because they all have the same interface, so that shouldn't break your system.
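As a minimal sketch of that substitution idea (the class and function names below are hypothetical and are not taken from the code shown in the talk), a function written against a base class should keep working when it receives any child class:

```python
import hashlib
from abc import ABC, abstractmethod


class Encryptor(ABC):
    """Hypothetical base class: callers rely only on this interface."""

    @abstractmethod
    def encrypt(self, raw: str) -> str:
        """Return an encoded form of `raw`."""


class Sha256Encryptor(Encryptor):
    def encrypt(self, raw: str) -> str:
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class ReversingEncryptor(Encryptor):
    """Toy subclass for illustration only; this is not real encryption."""

    def encrypt(self, raw: str) -> str:
        return raw[::-1]


def protect(encryptor: Encryptor, raw: str) -> str:
    # Written against the base class, so any child can be substituted.
    return encryptor.encrypt(raw)


# Both children honor the same interface, so the caller does not break.
for implementation in (Sha256Encryptor(), ReversingEncryptor()):
    assert isinstance(protect(implementation, "secret"), str)
```

The point of the sketch is only that `protect` never checks which child it received.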
And interface segregation: that also talks about the interface of your components. If you are thinking about cohesion, that shouldn't be a problem for you, because whenever you have clients consuming your classes or using your APIs, they're going to use exactly what they need. They are not going to rely on a huge API, because they don't need to. So you're going to build smaller components that have only what makes sense for that.
The last one: the dependency inversion principle. It means that - going back to this picture here - everything in the middle shouldn't have any dependencies. And everything on the edge can depend on the middle, which means that high-level components cannot depend on low-level components. That's important because, remember, I said that low-level components change very often. If you change low-level components for some reason, and you have high-level components depending on them, that means that you may have to change the high-level components. But if it's a high-level thing, that means that you have a lot of other things that depend on that thing. When you change that, you're going to start running in circles and you're going to have to change [inaudible 0:08:25] things in your system, and you don't want to do that.
So let's take a look at code; I think that's actually what is interesting. We're going to start with the entities layer, which is the encapsulation of enterprise business rules. I'm going to show you some code here. This is a small application that I did just to present the concepts. So here I have my domain, which is auth. This is an application to authenticate credentials. So auth is my domain, and I have entities here. When I go to entities I see two entities right here. Credentials: those are like username, password, if it's active or not and… Can you all see that? Yeah, I think so. If it's active or not, and the ID of the user. That's the only thing that it has. You should notice as well that this is not a Django model. This is just a Python object. And here I have some behaviors that are expected for this credential. One of them is how to set a password. The others are how to verify a password and how to deactivate.
An important thing here - and that's why I have this double underscore here - is that I don't want to expose the internals of this credential to the rest of the system. I want you to use my interface to change what I've encapsulated. Therefore I can guarantee validation. I can make sure that the integrity of the data is intact. That's why I'm doing this.
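A minimal sketch of what such an entity could look like, assuming attribute and method names of my own choosing (the actual code on screen is not reproduced in the transcript). The double-underscore attributes are name-mangled by Python, so the rest of the system has to go through the methods:

```python
import hashlib


def _digest(raw: str) -> str:
    # Stand-in for the encryptor; unsalted SHA-256 is NOT suitable
    # for storing real passwords and is used here only to keep the sketch short.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class Credential:
    """Plain Python object, not a Django model."""

    def __init__(self, user_id: int, username: str, raw_password: str):
        self.__user_id = user_id
        self.__username = username
        self.__active = True
        self.__password_hash = ""
        self.set_password(raw_password)

    @property
    def username(self) -> str:
        return self.__username

    @property
    def is_active(self) -> bool:
        return self.__active

    def set_password(self, raw_password: str) -> None:
        # Every change goes through this method, so validation always runs.
        if not raw_password:
            raise ValueError("password must not be empty")
        self.__password_hash = _digest(raw_password)

    def verify_password(self, raw_password: str) -> bool:
        return _digest(raw_password) == self.__password_hash

    def deactivate(self) -> None:
        self.__active = False


credential = Credential(1, "jair", "s3cret-pass")
print(credential.verify_password("s3cret-pass"))  # True
print(hasattr(credential, "__password_hash"))      # False: the name is mangled
```

Name mangling is a convention rather than real access control, but it makes accidental writes from outside the class much less likely.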
This entity is also very unlikely to change. There's not much that you can add to a credential. A credential is just basically it. If you were talking about using first name, last name, that is actually a user profile, it's not a credential. When we go to the encryptor, same thing. This just encrypts whatever I pass in, and that's very unlikely to change.
There's another component that I'm going to talk about real quick here, which is password. I consider this what we call a value object, and this is all meant to be for passwords. And the reason why this is a different object, when you see the concept of value objects versus entities, is that entities actually represent your business, and when you see two different entities you can actually define whether they are equal or not. A password is actually just a value. So if you don't have the password attached to any credential, it doesn't make sense to compare them.
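One common way to express the entity versus value object distinction in Python is sketched below. This is an assumption about the general pattern, not the speaker's code: the value object is an immutable dataclass defined only by its value, while the entity is compared by its identity (here a hypothetical `user_id`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Password:
    """Value object: immutable and defined only by the value it carries."""

    hashed: str


class Credential:
    """Entity: identified by user_id, even if other attributes change."""

    def __init__(self, user_id: int, username: str, password: Password):
        self.user_id = user_id
        self.username = username
        self.password = password

    def __eq__(self, other) -> bool:
        return isinstance(other, Credential) and self.user_id == other.user_id

    def __hash__(self) -> int:
        return hash(self.user_id)


# Two credentials with the same identity are the same entity,
# even though the username and password differ.
first = Credential(1, "jair", Password("abc123"))
renamed = Credential(1, "jair.v", Password("zzz999"))
print(first == renamed)  # True
```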
So, what we have here is actually this structure. We have a credential that depends on a password, which depends on an encryptor, but they are all together at the entities level, which holds the high-level components. You're going to see now that this thing doesn't depend on anything else.
Use cases: this is actually an interesting layer that I like a lot. The use cases are basically the flows of your system. And when you think about what you are going to implement and where you want to put your business logic, I consider this the best place for you to put your business logic. And the reason why is that you can test it better. You don't need to rely on the framework. You don't need to touch the database. You can do whatever you want. And unit testing becomes really simple.
So let's take a look at the code real quick. Here we have use cases, and then I have a use case which is create credential. Create credential receives a repository, which I'm going to talk about in a minute, and it performs an action here. First, it checks if the credential already exists. If so, it raises an exception; if not, it's going to create a credential object and pass it to the repository to persist the data. The great thing about this is that I don't care about where I'm persisting data. I can, today, persist the data in Postgres, tomorrow persist in [inaudible 0:13:01], and later persist in… I don't know, any other database like MySQL or whatever. It doesn't matter. My use case is completely free of that.
So let's move back here. What I have here is the credential. And remember when we talked about the open/closed principle? This is being open. I'm providing an interface for other systems that want to talk to me to implement, and therefore I'm open for you to add behaviors as you go, but I'm not changing how I behave.
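The flow described above can be sketched roughly as follows. The interface and method names (`CredentialRepository`, `exists`, `save`, `execute`) are assumptions made for illustration; the transcript does not show the real signatures. The use case owns the interface, and anything that wants to persist credentials implements it:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Credential:
    username: str
    password: str
    active: bool = True


class CredentialAlreadyExists(Exception):
    """Business rule violation raised by the use case."""


class CredentialRepository(ABC):
    """Interface owned by the domain; outer layers implement it."""

    @abstractmethod
    def exists(self, username: str) -> bool: ...

    @abstractmethod
    def save(self, credential: Credential) -> None: ...


class CreateCredential:
    def __init__(self, repository: CredentialRepository):
        self._repository = repository

    def execute(self, username: str, password: str) -> Credential:
        # 1. Check whether the credential already exists.
        if self._repository.exists(username):
            raise CredentialAlreadyExists(username)
        # 2. Create the object and hand it to the repository to persist.
        credential = Credential(username, password)
        self._repository.save(credential)
        return credential


class DictCredentialRepository(CredentialRepository):
    """Simplest possible implementation, used here only to run the sketch."""

    def __init__(self):
        self._rows = {}

    def exists(self, username: str) -> bool:
        return username in self._rows

    def save(self, credential: Credential) -> None:
        self._rows[credential.username] = credential


use_case = CreateCredential(DictCredentialRepository())
created = use_case.execute("jair", "s3cret-pass")
print(created.username)  # jair
```

Nothing in `CreateCredential` names a database or a framework, which is what lets the storage change without touching the use case.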
So let's go back to the code. Let's take a look at the next layer, which is interface adapters. That's actually how your code, your business domain, starts to talk to the external world. That's basically how you make this transition, this bridge. So take a look at the code here. Remember the interface that we created for repositories. If we take a look at the infrastructure side, we are going to find an adapter for Django. This is a repository that's going to get the credential and persist data using the Django model. So this is an implementation of that interface I showed. If I want to implement that with SQLAlchemy, I can do that pretty easily. It doesn't matter. As long as I have those methods here in place, my use case will work perfectly. So that's part of this bridge between the framework and my business domain.
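To illustrate that "any implementation with the same methods works" point without requiring a configured Django project, here is a sketch of an adapter backed by Python's standard-library `sqlite3` module instead of a Django model. This is a deliberate substitution, and the interface and table names are assumptions:

```python
import sqlite3
from abc import ABC, abstractmethod


class CredentialRepository(ABC):
    """Hypothetical interface defined by the business domain."""

    @abstractmethod
    def exists(self, username: str) -> bool: ...

    @abstractmethod
    def save(self, username: str, password_hash: str) -> None: ...


class SqliteCredentialRepository(CredentialRepository):
    """Adapter: translates the domain's interface into SQL calls."""

    def __init__(self, connection: sqlite3.Connection):
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS credentials ("
            "username TEXT PRIMARY KEY, password_hash TEXT NOT NULL)"
        )

    def exists(self, username: str) -> bool:
        row = self._conn.execute(
            "SELECT 1 FROM credentials WHERE username = ?", (username,)
        ).fetchone()
        return row is not None

    def save(self, username: str, password_hash: str) -> None:
        self._conn.execute(
            "INSERT INTO credentials (username, password_hash) VALUES (?, ?)",
            (username, password_hash),
        )
        self._conn.commit()


repository = SqliteCredentialRepository(sqlite3.connect(":memory:"))
repository.save("jair", "not-a-real-hash")
print(repository.exists("jair"))  # True
```

A Django or SQLAlchemy adapter would follow the same shape: same methods, different body.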
The final... Yes, so here's what I have. I have a service that I'm going to… I didn't show it, but I could show it later. And I have a credential repository here that implements this interface. The last layer is infrastructure. Here is where Django lives. I'm going to show you real quick, right here, infrastructure. Here I have a Django application, as you can see right here. And in accounts, I have my models. [Inaudible 0:15:47]. I'm relying on the Django model and I'm just adding the new ID.
I have my views that are going to receive a request, and I have forms that will validate input. And the interesting thing about views and the way that I like to approach Django is that basically a model is just for you to persist the data, which implements the active [inaudible 0:16:21]. But the view in Django is basically there to validate the input, delegate the process, get the output, and return it back to the client. That's all I think a view should do, nothing else. So in this case I'm doing exactly that. I'm using the form to validate the input. Here I'm not validating if the password is strong or not, that's the business logic, but I'm validating if the password matches the confirmation password. I'm just validating the input, I'm not validating its content.
When you take a look at the view, what happens is that I get the validation from the form. If it's valid, I just perform the operation using the service, then get the output and send it back to the client. This is very decoupled because, again, remember I told you infrastructure lives on the edge, right? So the infrastructure can change. It doesn't matter if I'm using [inaudible 0:17:31], Django, or whatever. As long as I'm consuming the service and passing the data there, I'm fine. And as long as I provide the service with some kind of repository, I'm also fine.
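The "validate the input, delegate, return the output" shape of a view can be sketched without any framework. Everything below is hypothetical (function names, field names, and the stub service); in Django the first function would typically be a form and the second a view:

```python
def validate_registration(data: dict) -> list:
    """Input validation only: shape of the data, not business rules."""
    errors = []
    if not data.get("email"):
        errors.append("email is required")
    if data.get("password") != data.get("confirm_password"):
        errors.append("passwords do not match")
    return errors


class StubAuthService:
    """Stand-in for the real service; business validation would live there."""

    def register(self, email: str, password: str) -> dict:
        return {"email": email}


def register_view(data: dict, service) -> tuple:
    # 1. Validate the input.
    errors = validate_registration(data)
    if errors:
        return 400, {"errors": errors}
    # 2. Delegate the work to the service.
    output = service.register(data["email"], data["password"])
    # 3. Return the output to the client.
    return 201, output


status, body = register_view(
    {"email": "jair@example.com", "password": "a", "confirm_password": "b"},
    StubAuthService(),
)
print(status, body)  # 400 {'errors': ['passwords do not match']}
```

Because the view only knows about `service.register`, the web framework around it can be swapped without touching the business logic.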
So let's take a look at some of the… The final architecture looks like this. We have RegisterView, which is at the top. We have the UserAccount model. And as you can see, the layer in green is actually the bridge to my system at the bottom. And when you take a look, those are the layers: we have entities, use cases, interface adapters, and infrastructure at the top.
So if you go to localhost, register. If you try to put in any credentials and arrive at this, it's just going to tell me that there's a password mismatch, which is input validation. If the passwords match, it's going to tell me invalid email. That's a business validation. So we can take a look here at the use case. So you go to the credential and you're going to see that the credential already exists. So that's a business validation that lives in the use case. If they have anything here… Let me try this.
I'm creating problems for myself. Well, you got the point.
[Laughter]
Why’s it so hard to type… One second.
There we go! Finally! [Laughter]
So we had all the validations here. Let me show you one interesting thing about the validation. We had the validations for credentials here, so we are just making sure that we are not adding a duplicate credential. In the entity, in the password, we have a validation for the type of the password, like if it's strong or not. That lives in the password, which is the responsibility of the password object.
And I want to show you some benefits of this. The first thing is the separation of concerns. Each one of those layers has low coupling and high cohesion. In the entities, you only have entities validating the integrity of the data that you are trying to put in. In the use cases, you have the business logic that actually performs the operations. When you take a look at the repositories and presenters, the bridge there, that's how you talk to the external world. So you don't rely on the external world, you rely on this bridge. And lastly, you have this layer that actually represents the external world. That's what gives you better testing, and I have some tests here that I could show you. If we go to auth and use cases, for example. One interesting thing about use cases is, remember I told you I don't need to rely on the repository that comes from outside. I can write my own repository for my tests, and that's it. I just write one for my use case, run it, and that becomes really simple. That's what you need to execute, and that's it.
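A sketch of what "writing my own repository for my tests" might look like with the standard `unittest` module. The use case and the fake repository are hypothetical reconstructions, since the real test code is not in the transcript:

```python
import unittest


class CredentialAlreadyExists(Exception):
    pass


class CreateCredential:
    """Hypothetical use case; it only needs `exists` and `save`."""

    def __init__(self, repository):
        self._repository = repository

    def execute(self, username, password):
        if self._repository.exists(username):
            raise CredentialAlreadyExists(username)
        self._repository.save(username, password)


class FakeRepository:
    """Test double written for the tests: no framework, no database."""

    def __init__(self, existing=()):
        self.saved = {name: None for name in existing}

    def exists(self, username):
        return username in self.saved

    def save(self, username, password):
        self.saved[username] = password


class CreateCredentialTest(unittest.TestCase):
    def test_saves_new_credential(self):
        repository = FakeRepository()
        CreateCredential(repository).execute("jair", "s3cret-pass")
        self.assertTrue(repository.exists("jair"))

    def test_rejects_duplicate_credential(self):
        repository = FakeRepository(existing=["jair"])
        with self.assertRaises(CredentialAlreadyExists):
            CreateCredential(repository).execute("jair", "s3cret-pass")
```

Saved in a module, these tests can be run with `python -m unittest`; since nothing touches a database, they should finish almost instantly.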
When I [inaudible 0:22:34] entities, that's also easy. That's a credential here, there's no dependency. As you can see here, I only depend on myself and password. Besides that, I'm fine. I can test pretty easily and fast. And also, if you are complaining about your tests taking too long to run, when you build things that way, they're going to be like this [snapping fingers]. Cool!
Flexibility: the thing about flexibility is that, as you saw here, you build layers on top of layers. I can replace any one of those layers pretty easily. And I've talked about this here. It gives you a lot of flexibility as you scale your system. So let's suppose that now we are talking to an external service, no longer to a local module. That's not a problem as long as you pass in a client that implements the same interface. Easy.
However, for startups, we know that this is quite a lot of work. And I want to give you some advice as you do your stuff. I know that you probably will walk away from here and say, look, I'm not going to do all those classes and those weird things that look like Java. I hate Java. I understand that. I've done both and I'm not proud of it. So the thing is, okay, fine, embrace the framework but act as if it wasn't there. And what I mean by this is, try to write adapters for the things that you're using. I don't know if you have heard about the adapter pattern, but it's basically creating a thin layer on top of anything that you are consuming from the outside. Then you don't need to rely that heavily on what it's presenting to you.
And I always advise you to have methods to run your queries for the ORM. And the reason why is, I'm working in a pretty big code base, and I can tell you right now that we have queries all over the place, and that's a nightmare to maintain. Every time you need to change the way that you consume data, you have to go over the whole code base and change that. That's not easy. But when you have methods, if everyone is relying on the methods, as long as the signature is the same, fine, you change whatever you want.
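A minimal sketch of that advice, using an in-memory list as a stand-in for the ORM so the example runs on its own. The class and method names are hypothetical; with Django, the body of the method would hold the actual queryset call:

```python
class AccountQueries:
    """Hypothetical single home for the queries the code base needs."""

    def __init__(self, rows):
        # `rows` stands in for a table; with an ORM this would be a model.
        self._rows = rows

    def active_usernames(self):
        # If the storage or the query changes, only this body changes;
        # callers keep relying on the same method signature.
        return sorted(row["username"] for row in self._rows if row["active"])


queries = AccountQueries([
    {"username": "ana", "active": True},
    {"username": "bob", "active": False},
])
print(queries.active_usernames())  # ['ana']
```

Callers that use `active_usernames()` never see how the data is fetched, so the query can be rewritten in one place.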
Build independent modules. This is an interesting thing in regards to Django, because it's so easy to add a foreign key from one module, [inaudible 0:25:10]. That's so easy to do. But I'm not sure if you want to do that. So that's why you focus on high cohesion between your Django apps. If you have modules that need to depend on each other, great, they live in the same app. But as soon as you have things depending on something else outside of that app, be careful, because at some point you might want to extract that out and put it in another place. That's going to be a nightmare as well. We've done that now and I can tell you that's not fun.
And the other thing is, delegate responsibilities but don't blindly trust them. What I mean by that is, again, write adapters for whatever you use. Then you'll probably be more secure moving forward. Always revisit your code. Please do that. At Carta we are a five-year-old company. We started to revisit our code like two years ago, and that was hard. If you iterate fast, like ship, iterate again, ship, iterate again, ship, iterate again, that's going to make your life way easier. And I think that's it. Oh, one more thing: we are hiring. If you want to do things like that, please come talk to me.