Here at Yeti we love learning new things, so twice a month we open up the floor to presentations on areas of personal expertise. Topics range from the basics of inbound marketing, to design for startups, to tips for living and working out of a backpack! Our awesome developer, Kevin, recently gave this presentation on the importance of readability. Enjoy the video!
So, I'm Kevin O’Leary. I'm a developer at Yeti, and today we're going to be talking about the importance of readability, and readability more particularly in the domain of programming. So before we start, the first thing to think about when it comes to programming is that these are actual languages, languages in the same way that I'm communicating with you guys right now. And the paradigm for programming languages kind of started 100 years ago when we started creating machines that we’d be able to communicate with. And we had to communicate with them in a very different way than we communicated with each other, because it was a lot more low-level, similar to assembly languages. If you guys have heard of other low-level languages, you're kind of manipulating more hardware-focused elements of the machine rather than the higher-level things that other people have created.
So when it comes to language, especially programming languages, as I was saying, they used to be really hard to understand; they were more machine-centric. Over time though, certain people made efforts, in the 1950s especially, with the language FORTRAN, which was created to make things more readable for the actual programmers themselves, so they could communicate with the machine in a more efficient manner. And what that meant was the ability to also communicate with other developers in a more efficient manner as well, similar to how we all know English in here, so we're able to communicate with each other. But the way it used to be was you'd have to have really specific knowledge of a particular programming language or a particular system in order to actually be effective in it. So as I’ll get to later, that causes a lot of issues when it actually comes to developing software, because there's a lot of knowledge you need to gain before you're effective.
And the other [inaudible 0:02:04] that's going to be recurring throughout this whole thing is that writing code is really not that bad compared to reading code. That’s my personal belief and that's what I've heard from a lot of people as well. It’s kind of… there's writing things and writing new things. And you can kind of create solutions to problems that really make sense in your head but might not necessarily make sense to other people. So establishing that shared language between developers is paramount to effective programming.
And one of the most important things that we have to do basically every single day, multiple times a day, when we’re programming is actually naming things. There are a few different types of things that we normally name. Some of those are called variables, which kind of store data. So for example I have my name, Kevin, and that uniquely refers to myself. And then there are other things like methods or functions, which are kind of the building blocks of most programs and which allow the different parts of the code to actually communicate with each other.
And the importance of that is that if those names aren't really clear or well-known to the developer, then we need to take a lot of time to actually try to understand what they mean. And that usually means diving deep into the source code, or just really wasting a lot of time and a lot of cycles trying to understand stuff. So a good primer for why variable naming is important is if we just have the word name, which we normally get in most of our applications. We have users. We normally fetch data for them. If we just say name, we already operate under certain assumptions. And every one of us could be operating under different assumptions right now. One of those could be that the name itself is referring to just the first name, or the full name, or some combination of those. And when there's ambiguity like this in the code, it sets us up for creating bugs. It also makes it so we have to do, as I was saying before, some deeper diving to actually understand what's going on. So a better name for something like that would be firstName, so we actually know exactly what it's referring to.
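To make the name versus firstName point concrete, here is a minimal sketch in TypeScript. Only firstName comes from the talk; the User shape, lastName, and the three function names are assumptions made up for illustration.

```typescript
// Hypothetical user shape, for illustration only.
interface User {
  firstName: string;
  lastName: string;
}

// Ambiguous: is `name` the first name, the full name, or something else?
// The caller has to guess, or go read the implementation.
function greetAmbiguous(name: string): string {
  return `Hi, ${name}!`;
}

// Explicit: the field name says exactly which piece of data is used.
function greetByFirstName(user: User): string {
  return `Hi, ${user.firstName}!`;
}

// Explicit: the full name is built in one clearly named place.
function buildFullName(user: User): string {
  return `${user.firstName} ${user.lastName}`;
}

const kevin: User = { firstName: "Kevin", lastName: "O'Leary" };

console.log(greetAmbiguous(buildFullName(kevin))); // Hi, Kevin O'Leary!
console.log(greetByFirstName(kevin)); // Hi, Kevin!
```

Both greetings are valid; the difference is that with the ambiguous version the reader cannot tell which one was intended without more context.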
Another example which is a different data type is called a Boolean. Booleans are either kind of true or false, yes or no, that's kind of how we define them; so it's kind of easy to understand. And standing, for example, is something I'm doing now and you guys are sitting. So let's say on a user we have a standing variable. If you just look at that without context it's kind of hard to understand exactly what it’s referring to especially if you're in a larger ecosystem and there might be words kind of related to that. So a better way to kind of name this would be something like isStanding which kind of makes it a lot more declarative and makes it more… Basically just more like English in terms of how we actually think about things and process information and communicate with one another. And what that does is basically allows us to ramp up a lot quicker on some of the code. These are obviously very simple examples but as we’ll get to it later it applies to more complex ones as well.
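As a rough sketch of the Boolean example, here is how isStanding might read at the point of use. The User shape, the describePosture function, and the audience member "Alex" are hypothetical; only the isStanding name comes from the talk.

```typescript
// Hypothetical user shape, for illustration only.
interface User {
  firstName: string;
  // The `is` prefix signals a yes/no value before you even check the type.
  isStanding: boolean;
}

function describePosture(user: User): string {
  // Reads close to English: "if the user is standing..."
  if (user.isStanding) {
    return `${user.firstName} is standing`;
  }
  return `${user.firstName} is sitting`;
}

const speaker: User = { firstName: "Kevin", isStanding: true };
const audienceMember: User = { firstName: "Alex", isStanding: false };

console.log(describePosture(speaker)); // Kevin is standing
console.log(describePosture(audienceMember)); // Alex is sitting
```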
So what I was just talking about is kind of declaration. Declarations, as we know, just say things, very obviously. And that is something that we always try to aim for in our code, where we're saying like: I'm going to fetch this user and bring it back to [inaudible 0:05:40] state, and then display it for the front end. So when a user is using their app, they can actually see their face and they know what their name is and whether they made a typo or anything. And this kind of philosophy was spawned back in the day when people realized that, hey, this machine code really isn't cutting it. I created this whole thing, but now we need to… Now somebody else is coming onto the project. And I have a bunch of variable names and a bunch of paradigms that I have set up that don't really communicate exactly what's going on. And even though the business logic itself might be complex, the code makes it even more complex than it has to be.
So for example, this is a function name that we have, and it's called createUserAndFetchProjects. And why this is a good function name is because we know exactly what it is doing. Some critiques of this might be that it is very long, and usually you only have like 60 lines in your editor before you really kind of start losing the space on your screen. So people usually opt to create short names so that it all appears on the screen. The problem though is if I just call this function createUser instead of createUserAndFetchProjects, then all of a sudden we're introducing ambiguity to the system again.
And even though it might be fetching the projects as well, and that’s just kind of a side effect of what we're doing after we create our user and go to their profile and get their projects, we don't really know exactly what's going on. And that's hidden from the other developers that are going to come to this, and they're going to make assumptions about how this code works. And then those assumptions are obviously going to be erroneous, which causes problems. And again, this is another thing that I've found: when we make assumptions about some of these names and what exactly their functionality is, we wind up putting new things in the incorrect places, which just makes the whole code ecosystem itself a lot more complex and a lot harder to understand. Which again relates back to when you have other people that ramp up on the project: it's a lot harder for them to understand exactly what's going on.
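Here is one way the createUserAndFetchProjects idea could be sketched in TypeScript. Only the two function names createUser and createUserAndFetchProjects come from the talk; the types, the in-memory store standing in for a backend, and fetchProjects are assumptions for illustration.

```typescript
// Hypothetical shapes, for illustration only.
interface User {
  id: number;
  firstName: string;
}

interface Project {
  id: number;
  title: string;
}

interface UserWithProjects {
  user: User;
  projects: Project[];
}

// In-memory stand-in for a real backend (an assumption, not the talk's code).
const projectsByUserId: Record<number, Project[]> = {};
let nextUserId = 1;

// Does one thing, and the name says so.
function createUser(firstName: string): User {
  const user: User = { id: nextUserId, firstName };
  nextUserId += 1;
  projectsByUserId[user.id] = [];
  return user;
}

function fetchProjects(userId: number): Project[] {
  return projectsByUserId[userId] ?? [];
}

// Does two things, and the longer name admits it. If this were called
// `createUser`, the project fetch would be a hidden side effect.
function createUserAndFetchProjects(firstName: string): UserWithProjects {
  const user = createUser(firstName);
  const projects = fetchProjects(user.id);
  return { user, projects };
}

const result = createUserAndFetchProjects("Kevin");
console.log(result.user.firstName, result.projects.length); // Kevin 0
```

The longer name costs some horizontal space, but a reader no longer has to open the function to learn that projects are fetched too.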
So when we think about naming our variables, it adds intent to the code and it makes it a lot cleaner. And what this does is, when we're actually coding and we take the extra time to really think about how should I name this thing, it forces us to separate things in our own head and really understand how many different pieces do I need to separate this into and what do they connect to. And I'll touch upon this later, but obviously there are a lot of other things that we need to think about on a day-to-day basis that really impact this. And since we're always under a time crunch, we’re like, hey, we don't have time to do this. But when we don't actually think about these things, then the code bases that we create lack that same focus and vision, and it makes it a lot harder for other people to understand.
And a good example of this is one of the things that I've personally looked up in the past, and I know I've gotten questions about it. It was like: what type of loop should I be using in my code? And you could go to Mozilla, because I already knew where to look, and look at all the different performance differences. Like, oh, this one's like ten percent more efficient, I should really be using this one over this, but in this case I shouldn't. But in reality it doesn't matter at all, because the V8 engine itself really just handles all of those optimizations for us. And when we think about doing some of the more performance-centric things in our code instead of just focusing on what's going to be easier for the person after you to understand, then it's actually not saving as much time as we think.
And this is a specific example of what a compiler does to our code. So as you can see on the left, we have two statements. One of those is that we're computing b*c+g, and the other one is that we're computing b*c*e. And as you can tell, there's some shared code between the two, and that's b*c. So what we would normally have the inclination to do is be like, oh, that's actually going to be calculated twice. We're going to run b*c twice. So that's inefficient, we have to solve that. And we could do that ourselves, which is fine. But in reality, in the compiled version, which is what I was talking about, V8 actually does that for us automatically. It has a lot of heuristics like this that are really intelligent, that just kind of make all these optimizations for us.
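The slide isn't reproduced in the transcript, so here is a rough reconstruction of the idea in TypeScript. The optimization described is generally known as common subexpression elimination. The function names and the result names a and d are assumptions, and whether a particular engine applies this optimization to a particular piece of code is not something this sketch verifies.

```typescript
// Written the readable way: b * c appears in both expressions.
function computeReadable(b: number, c: number, e: number, g: number): [number, number] {
  const a = b * c + g;
  const d = b * c * e;
  return [a, d];
}

// What common subexpression elimination effectively produces:
// the shared b * c is computed once and reused.
function computeWithSharedProduct(b: number, c: number, e: number, g: number): [number, number] {
  const sharedProduct = b * c;
  const a = sharedProduct + g;
  const d = sharedProduct * e;
  return [a, d];
}

console.log(computeReadable(2, 3, 4, 5)); // a = 11, d = 24
console.log(computeWithSharedProduct(2, 3, 4, 5)); // same values
```

Both versions return the same results, so the choice between them can be made on which one reads better to the next person.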
Something else readability-wise, which is important, is types. And what types do is they actually declare exactly what data you're looking at. So with the strings and Booleans I was mentioning before, it makes it very clear, when you're looking at the code the first time, exactly what type you're looking at. And that's important because you don't have to dig too far into the code itself to have a solid understanding of what's going on. So these are kind of examples that we use. These are all pulled… These are pretty generic, but these are specifically pulled from TypeScript, which is the language that we use a lot on the frontends. And what this does is it provides you a lot more context about exactly how the application is working, which allows you to form a mental model of how things work a lot faster than normal.
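The examples from the slide aren't in the transcript, so the following is only a generic sketch of what type annotations look like in TypeScript. All of the variable names and the User interface are assumptions for illustration.

```typescript
// Each declaration states what kind of data it holds.
const firstName: string = "Kevin";
const isStanding: boolean = true;
const projectCount: number = 3;
const tags: string[] = ["design", "frontend"];

// An interface describes the shape of a larger piece of data in one place.
interface User {
  firstName: string;
  isStanding: boolean;
  projectCount: number;
  tags: string[];
}

const user: User = { firstName, isStanding, projectCount, tags };

// Assigning the wrong type is rejected at compile time, for example:
// const wrong: User = { ...user, isStanding: "yes" };

console.log(`${user.firstName} has ${user.projectCount} projects`);
// Kevin has 3 projects
```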
And what it also does is it establishes contracts between the methods and modules that we're creating. And the reason that's important is similar to what I was talking about before: when you add types to a system, it really makes you think about exactly how you're programming it. And the retort to this, sometimes, is that it slows you down, which is true obviously, since you're adding more lines of code and you're taking more time to think about these things. I've personally gone down rabbit holes where I've thought about naming things for like five minutes, which for all intents and purposes is usually very frowned upon within the industry, where people think you're wasting your time and you're not actually helping the ecosystem after all. But in my opinion, at least, that's actually a lot more medium-term and long-term oriented, which makes it so that (and I'll touch upon this later in terms of documentation) your code’s a lot more self-documenting. When your code is self-documenting, that makes it more like a living and breathing organism that you can understand.
So this is a specific example of very explicit and type-heavy code. What it allows us to do is get very quick glimpses of exactly what the code is doing. I actually modified this today because it wasn't as readable. It just said get tags that fit. So instead, this makes it a lot more explicit in nature. And it's not that helpful for you when you're writing it, because you already understand what it does, but it's always helpful for the person after you. And sometimes the person after you is yourself, looking at the code a year later when there are bugs in it.
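The actual code from the slide isn't in the transcript, so this is a hypothetical sketch of what a more explicit, type-heavy version of "get tags that fit" might look like. The Tag shape, the function name getTagsThatFitWithinWidth, and the fitting logic are all assumptions, not the code shown in the talk.

```typescript
// Hypothetical tag shape, for illustration only.
interface Tag {
  label: string;
  widthInPixels: number;
}

// The name and the parameter types together answer "fit within what?"
// without the reader opening the function body.
function getTagsThatFitWithinWidth(tags: Tag[], maxWidthInPixels: number): Tag[] {
  const tagsThatFit: Tag[] = [];
  let usedWidthInPixels = 0;

  for (const tag of tags) {
    // Stop at the first tag that would overflow the available width.
    if (usedWidthInPixels + tag.widthInPixels > maxWidthInPixels) {
      break;
    }
    tagsThatFit.push(tag);
    usedWidthInPixels += tag.widthInPixels;
  }

  return tagsThatFit;
}

const allTags: Tag[] = [
  { label: "design", widthInPixels: 60 },
  { label: "frontend", widthInPixels: 80 },
  { label: "accessibility", widthInPixels: 120 },
];

const visibleTags = getTagsThatFitWithinWidth(allTags, 150);
console.log(visibleTags.map((tag) => tag.label).join(", ")); // design, frontend
```

The signature also acts as the kind of contract mentioned earlier: callers know they must pass tags and a width, and that they get tags back.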
So how does all this actually relate to our understanding? So this, the tickler, as you guys know, is about tickling; there's like one definition we all have for that. But this is actually a method that we came across yesterday in code. And even though it's like a bit funny to look at and really not how to get to trying to figure out what it did, it’s really bad because, as I was saying, it has a single denotation, and that denotation really has absolutely nothing to do with development itself. There's no ecosystem that really exists that helps us understand exactly what this does. And because of that, we had to go actually look at the code itself and spend an extra 5-10 minutes to understand it.
So as you can tell, that extra five minutes that somebody before us didn't take to really create a good name for this cost the developers afterward another five to ten minutes. Now, one person creates the code, and normally the lifespan of code is many years. So you have more and more people look at it, and every single time they do, they lose that same amount of time. So this tickler code, and I won't shame anybody, is in a really large code base from a large organization that a lot of other developers have touched. And I am very positive that every other person has run into the same situation. So it has definitely cost hours and hours of developer time, which is silly because it's such a silly name, but it's actually true of what happens when we don’t name things correctly.
So these are lines and circles, but they correspond to basically a graph. And we can think of these abstractly as multiple touch points within our application. So obviously I’m just going to look at it by itself, I’m not going to think about anything. But when we write code and when we try to create modular ecosystems on the frontend, we have to create structures for those and adhere to those structures. And when we do that, then we actually have a unified vision of how every single developer approaches a specific problem.
So a good example of this, if you guys want to do some research, is what are called observables. And that is basically a very opinionated coding paradigm that multiple people follow. And when you do that, that knowledge is basically transferable, which means that many people share a similar mental model, and when they come across problems that are similar to this, they understand it.
So the question is, in the tickler example for instance, how much do we actually have to understand these touch points in the application? From no study whatsoever, just my own personal anecdotal experience, whenever we have problems we need to solve, we're usually touching at least six to seven parts of the application, which are usually either files or methods or functions. And all of those have their own unique sets of things that they're related to. So how deep down the rabbit hole do you normally have to go before you actually solve the problem? Ideally you'd go only as deep as you need to before you actually understand what's going on. And the easiest way to do that is to make the variable names that you have in your application really explicit, so they really correspond to exactly what is going on.
So in our example here, the tickler is its own kind of touch point in this application. And when we actually have to try to understand it, we might have to go to its nodes to really understand what’s going on. And that could have been solved a lot more easily by just changing the name. So along the same lines, the more information you have to retain at a single point in time, the more [inaudible 0:18:27] it is. I'm sure you guys have heard of the word flow, especially in relation to programming, where somebody codes for like an hour straight and they’re getting in the flow of things, and somebody taps them on the shoulder, they’ve got a question, and all of a sudden it's gone. And that's because you're thinking about complex things like these, and if they're not very obvious and not hard ends, there's a lot of them and it's hard to retain. Then you're going to lose it really easily, and that's a serious problem for efficiency.
So how do we account for this burden? One of the ways to do that is to create documentation. And creating documentation comes in a few different forms. One of those is like basically you could sell the handbook and say this is how the product works and this is our coding philosophy.
Another way you could do this is on the individual methods and functions. You could just create documentation above it that says this method takes in a name and it takes in a color and it does this thing. But the problem with documentation is that it goes stale. And when documentation goes stale, it's bad, because then we have a mismatch between what the code's actually doing and what the documentation says it's doing. And the solution to that, obviously, is okay, well, update the documentation more often, but often we don't really have much time for that. So as I was mentioning before, if we focus more on the readability of our code, then it'll be more self-documenting, which kind of solves this problem.
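As a small illustration of stale documentation versus self-documenting code, here is a hedged sketch in TypeScript. The "name and color" wording echoes the talk; the function names, the BadgeOptions shape, and the output format are all made up for the example.

```typescript
// A comment that has gone stale: it still mentions a color, but the
// function below no longer takes one. Nothing forces the two to agree.
/** Takes in a name and a color and returns a colored label. */
function makeLabel(name: string): string {
  return `Label: ${name}`;
}

// Self-documenting alternative: the parameter names and types say what
// goes in and what comes out, and the compiler keeps them honest.
interface BadgeOptions {
  firstName: string;
  backgroundColor: string;
}

function formatBadgeText(options: BadgeOptions): string {
  return `${options.firstName} (${options.backgroundColor})`;
}

console.log(makeLabel("Kevin")); // Label: Kevin
console.log(formatBadgeText({ firstName: "Kevin", backgroundColor: "blue" }));
// Kevin (blue)
```

If backgroundColor were later removed from BadgeOptions, every caller still passing it would fail to compile, whereas the stale comment above would keep being wrong silently.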
The last thing, which obviously plays into all this, is that as developers we’re constantly under a time crunch. We have a lot of deliverables we need to meet. We have a lot of considerations we need to make. And these are obviously some of them: we deal with a lot of responsive web design. There's testing, there's accessibility. We need to think about performance and managing state and managing our edge cases and whether the application is secure. So as you can tell, readability is just a single point among the numerous things we need to think about. But what readability allows us to do is prevent the necessity of tribal knowledge.
And tribal knowledge, in general, is when something is just passed down from generation to generation, and in our case, when it's just passed down from developer to developer, we have to ramp up and create a working knowledge of the ecosystem. That's why, typically, when we talk about velocity, velocity is slower for a while; when people are at companies for like four years, that's sometimes how long it takes to really get an understanding of how things work. And even though sometimes that might just be a symptom of the size and there's not much you can do about it, there are methods to really account for the complexity of systems and be really thoughtful about how we set those up and how we name them.
And that kind of relates back to… Sometimes it’s totally okay to go slower because it will actually make you go faster. And when you create more intent for code and you prioritize readability and not just the short term, then the paradigms you create and the code that you create will be a lot easier for people after you to understand. Which kind of relates to the paradigm that I really live by, which is: code for the person after you and not yourself.