The Life-Changing Magic of Tidying Your Code

First published in php[architect] May 2018

When I made the transition from hobbyist to professional programmer, I discovered that it isn’t enough for my code to work; it also has to be easy to maintain. At first I wasn’t sure how to make that happen. Then, when I read Robert Martin’s book Clean Code, I discovered that good code is written with people in mind, follows a consistent format, has short functions, uses meaningful names, and avoids comments. Applying these principles has dramatically improved my code and can change your life as well.

Introduction

I tracked the bug back to a 209-line function. At least, I thought I did. Somewhere around line 167 it calls an 89-line function, which looks like it might be the real culprit. But I’m struggling to understand what this code does, so maybe I’m wrong.

It’s overwhelming to work in code like this. The functions are so long, the variables so similar, the logic so obscured that just figuring out what is going on takes forever, and deciding how to update the logic takes even longer.

When I follow a bug into this kind of quagmire, my heart sinks, because programming doesn’t have to be this hard. There are ways to make complex code easier to understand and modify.

Clean Code

“… spending time keeping your code clean is not just cost effective; it’s a matter of professional survival.” - Robert Martin in Clean Code

I’ve written my fair share of dense, stream-of-consciousness code. The resulting programs worked, but I would cringe every time I had to modify them. The only way to change the code was to start at the top of the long function and read it line by line until I understood where I needed to be, and that was not a pleasant prospect.

I knew code like this was hard to work with, but I didn’t know how to write anything easier to support.

Fortunately, I’m not the first one who has struggled with this problem. In his book Clean Code, Robert Martin collaborated with a number of experienced developers to lay out guidelines for writing code that’s easier to maintain. The result is a book filled with practical advice from folks who have been writing software for decades.

Reading Martin’s book completely changed how I write code. It helped me see the traps I kept falling into, and charted a path for writing clean, maintainable code.

And, like any new convert, I want to share these insights with you. Learning to keep your code tidy is a bit of magic that will change your life.

Code Is For People

“The next time you write a line of code, remember you are an author, writing for readers who will judge your effort.” - Robert Martin in Clean Code

At the heart of clean code is one fundamental principle - code is written for people. To some this will be obvious, but to others it’s counter-intuitive.

When we first learn to program we focus on communicating with the computer. We aren’t concerned if other people can decipher our code, because at this point we don’t fully understand the commands ourselves. Our main goal is simply to make the program work.

Unfortunately, some of us never get beyond this perspective. We continue to see code primarily as a way of communicating with the computer. As a result, we end up writing code that’s hard for people to read and maintain. Our variables are often meaningless, we use non-standard formatting, the functions are too long, or we revel in clever code that is as sparse as possible. We have learned to communicate with the computer but not to communicate with people.

To start writing better code we have to move beyond writing just for the computer. After all, as long as your code is valid, the computer doesn’t care how many lines you used, what you named your variables, or how clever your code is. But other people will care.

Martin notes that “the ratio of time spent reading vs writing code is well over 10:1. We are constantly reading old code as part of the effort to write new code.” As I think back through my time programming, that observation rings true.

For example, yesterday I spent about three hours tracking down a pesky little bug. I read through one method after another, jumping from class to class trying to discover why we weren’t getting the results we expected. After all that time reading and debugging, I wrote only one line to fix the actual bug.

So, when we write code, our most important audience isn’t the computer, it’s the guy who is trying to understand what he needs to change in our code. He’s the one who has to decipher the program and figure out what we were thinking. He’s the one who will need to expand or modify what we have written. And, unlike the computer, he cares a lot about how easy that code is to read.

The foundation of clean code is making your code simple for other programmers to read. Anything that helps them understand your code is a good thing, and any obstacle that hinders their understanding should be avoided.

Our goal is to make our code as clear as possible to other developers.

Formatting

“Code formatting is about communication, and communication is the professional developer’s first order of business.” - Robert Martin in Clean Code

The first step in making code easier for others to read is to use a standard formatting style.

When you read something in English you expect the text to conform to certain conventions. Paragraphs are separated by an empty line. There is one space after each word. Each sentence begins with a capital letter and ends with a punctuation mark.

We take these conventions for granted, but these rules allow us to read text quickly. Ho w e v e r; IF the author broke with stAndArd conventions. you Could stillReadTheSentence but it, would? be – haRdeR

The same is true with programming. Programs are easier to read when they follow a predictable format. For example, take a moment to look through this bit of code.

public function daysSinceLastUpdate($dateCreated, $dateModified)
{$today = date_create();
if ($today === $dateCreated) return 0;
if (empty($dateModified) || $dateCreated > $dateModified) {$latestDate = $dateCreated;}
else {$latestDate = $dateModified;} return date_diff($today, $latestDate);}

It’s valid PHP, but you have to concentrate when you read it. You can’t just glance at it and see how many returns there are or whether the second if statement is nested in the first one. Instead, you have to carefully read each line and mentally parse the code as you go. Reading this code is hard work.

Now consider the same code when it follows the PSR-2 coding style.

public function daysSinceLastUpdate($dateCreated, $dateModified)
{
    $today = date_create();
    if ($today === $dateCreated) {
        return 0;
    }
    if (empty($dateModified) || $dateCreated > $dateModified) {
        $latestDate = $dateCreated;
    } else {
        $latestDate = $dateModified;
    }
    return date_diff($today, $latestDate);
}

The content of this code is exactly the same as the earlier sample; the only difference is the formatting. But since this code follows a standard style, you recognize immediately what it does. You can scan it from top to bottom in seconds, quickly identify that it has three possible return values, and clearly understand that the if statements are independent instead of nested. You don’t have to think when reading this code; you can just read it.

Using a standard format for your code is just as important as using the standard format when writing English.

I don’t always agree with my teammates about how our code should be formatted. For example, I prefer to have logical operators at the beginning of lines, like so.

if ($myWay === $isBetter
    && $theirWay !== $nearlyAsGood) {

While some of my teammates prefer to have the logical operators at the end of the lines.

if ($myWay !== $isBetter &&
    $theirWay === $soMuchBetter) {

We can both make reasonable arguments why our layout is better, but honestly it comes down to personal preference. In the end, where we agree to put the punctuation is not nearly as important as the fact that we all know where to look for it.

So, style guides aren’t about everyone agreeing on what is best; they’re about everyone agreeing to do the same thing. If your team has a style guide, even if you don’t like all of the recommendations, it’s important to use it. The consistency in style improves your code’s readability.

If your team doesn’t have a style guide, then at the very least use the PSR-1 Basic Coding Standard and the PSR-2 Coding Style Guide. These are basic formatting rules developed in the PHP community, and the more we all follow them, the easier it is for us to read each other’s work.

Even if you disagree with some of the recommendations of these guides, these standards are vital to helping us communicate with each other. By agreeing on how the code will look, we make it easier for others to read, and that’s what is most important.

Functions

“The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.” - Robert Martin in Clean Code

Once you have your code following a consistent format, the most important thing you can focus on is keeping your functions small.

If you’ve been writing code for any length of time, then you can relate to being lost in a bottomless function trying to figure out what’s going on. The problem isn’t that the original programmer didn’t think things through clearly, it’s that they made a single function do everything. So, in order to make a change, you have to walk through the code from the beginning to figure out where the logic needs to change.

With short, single-purpose functions, it’s easier to find the specific code you need. And with the logic all contained in a small function, it’s so much easier to know you are making the right changes in the right place.

So, how small should a function be? The question isn’t as much about line count as it is about purpose. A function should do only one thing, and nothing more. If the function does more than one thing, it’s too big.

Look at the function below. How many things does it do?

public function postNewMessage(User $user, $targetUserAddress, $messageText)
{
    $valid = true;

    if (empty($messageText)) {
        $valid = false;
    }

    $address = new Address();
    if (!$address->exists($targetUserAddress)) {
        $valid = false;
    }

    if ($valid) {
        $message = 'Posted By: ' . $user->getName();
        $message .= 'Posted On: ' . date('l, F j');
        $message .= $messageText;
        
        $post = new Post();
        $post->postMessage($targetUserAddress, $message);

        $log = new EventLog();
        $log->setDate(date('Y-m-d'));
        $log->setUser($user);
        $log->setAction("Posted: " . $message);
        $log->save();
    } 
}

First of all, did you actually read the code? It’s not a terribly long function, fewer than 30 lines, but my first inclination is to skim it. It seems like reading it would require too much mental energy. If you don’t even want to read a function, that’s your first sign that it’s too long.

Once you do read it, though, it appears to do one main thing - it posts some kind of message. But look closer and you’ll see that it’s actually doing four things.

  1. Validates parameters to make sure the message can be posted
  2. Composes the message to post
  3. Posts the message
  4. Logs the message posting

These actions collaborate toward a common goal, so it may not seem like a problem to lump them into a single function. But functions are like weeds. If you aren’t diligent with them, they will grow wild.

This function already feels a bit long, and you have to read it carefully to get the big picture of what it’s doing. So what happens when you have to add more validations, or sanitize the inputs, or expand the logging? Eventually you’ll end up with a 209-line function that you can barely understand.

So, let’s try moving each action into its own function. Our goal is that each function does only one thing, and that the code is easier for other programmers to read.

public function postNewMessage(User $user, $targetUserAddress, $messageText)
{
    $postIsValid = $this->validateNewPost($targetUserAddress, $messageText);
    if ($postIsValid) {
        $message = $this->composeMessage($user, $messageText);
        $this->postMessage($targetUserAddress, $message);
        $this->logPosting($user, $message);
    }
}

private function validateNewPost($targetUserAddress, $messageText)
{
    $valid = true;
   
    if (empty($messageText)) {
        $valid = false;
    }

    $address = new Address();
    if (!$address->exists($targetUserAddress)) {
        $valid = false;
    }

    return $valid;
}

private function composeMessage(User $user, $messageText)
{
    $message = 'Posted By: ' . $user->getName();
    $message .= 'Posted On: ' . date('l, F j');
    $message .= $messageText;
    return $message;
}

private function postMessage($targetUserAddress, $message)
{
    $post = new Post();
    $post->postMessage($targetUserAddress, $message);
}

private function logPosting(User $user, $message)
{
    $log = new EventLog();
    $log->setDate(date('Y-m-d'));
    $log->setUser($user);
    $log->setAction("Posted: " . $message);
    $log->save();
}

We now have the same logic broken into five small functions, and while there are a lot of functions to read, each seems trivial enough by itself. The code takes up more vertical space, but that’s a small price to pay for what we have gained.

First, notice how the first function reads like a story. We validateNewPost(), composeMessage(), postMessage(), then logPosting(). When we had all the code stuffed into a single function, it took a lot of reading to recognize the main activities it was performing. But now, with all of the details extracted into their own methods, the logic in the main function is crystal clear.

Next notice how easy it is to read each of the smaller functions. The logPosting() function only deals with the logging, while composeMessage() encapsulates all of the message composition details. Each function does only one thing, and each handles its own particular level of abstraction.

Finally, think about having to update this code. How hard would it be to add a new validation requiring that the message be less than 140 characters? With the validation code all contained in validateNewPost(), we know right where it goes. Similarly, if the message composition needed to be updated, you know right where to find it.
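
As a rough sketch of that first change, the length rule might slot into the validation function like this. The Address class below is a hypothetical stand-in with a hard-coded list so the example runs on its own, and the PostValidator wrapper class and MAX_MESSAGE_LENGTH constant are assumptions made for illustration. The method is public here only so it can be called directly.

```php
<?php
// Hypothetical stand-in for the article's Address class.
class Address
{
    private $knownAddresses = ['alice@example.com', 'bob@example.com'];

    public function exists($address)
    {
        return in_array($address, $this->knownAddresses, true);
    }
}

// Hypothetical wrapper class so the sketch is self-contained.
class PostValidator
{
    const MAX_MESSAGE_LENGTH = 140;

    public function validateNewPost($targetUserAddress, $messageText)
    {
        $valid = true;

        if (empty($messageText)) {
            $valid = false;
        }

        // The new rule: the message must be less than 140 characters.
        // strlen() counts bytes, which matches characters for plain ASCII text.
        if (strlen($messageText) >= self::MAX_MESSAGE_LENGTH) {
            $valid = false;
        }

        $address = new Address();
        if (!$address->exists($targetUserAddress)) {
            $valid = false;
        }

        return $valid;
    }
}

$validator = new PostValidator();
$shortMessageIsValid = $validator->validateNewPost('alice@example.com', 'Hello!');
$longMessageIsValid = $validator->validateNewPost('alice@example.com', str_repeat('x', 140));
```

Nothing outside the validation function had to change, which is the point of keeping each function focused on one job.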

This is what it looks like to break your code into small functions. It takes a little while to get the hang of it, but once you do it’s a powerful tool.

For example, I once had to update a 600-line method that was doing dozens of different things. Without making significant modifications to the code, I broke the method into a number of small functions. After that it wasn’t only easier to update the method, but I also found some duplications and misplaced logic that I could move out of the function.

Using small functions makes a huge difference.

How do I know if my function is too big?

It takes a while to figure out how small a function should be. Some folks suggest it should only have as many lines as you can count on your fingers, but it’s a bit trickier than that. So, here are a few other indications that the function has gotten out of hand.

  • If you’re wondering if the function is too big, it probably is. Err on the side of having smaller functions.

  • If you’re deeply nested in if or for statements, your function is too big.

public function tooBigForYourOwnGood()
{
    if ($somethingIsTrue) {
        ...
        if ($somethingElseIsTrue) {
            ...
            foreach ($gettingTooDeep as $whoaThere) {
                ...
                if ($wayPastDeep) {
                    // Your function is way too big
                }
            }
        }
    }
}

  • If your parameters include booleans, your function probably does more than one thing and can be broken into two separate functions. For example, the function below could be separated into updateNiceUsers() and updateAnnoyingUsers().

public function updateUser($userIsNice = false)
{
    if ($userIsNice) {
        // do one thing
    } else {
        // do something different
    }
}

  • If you are getting ready to copy and paste a bit of code from one method to another, what you’ve copied should be a separate function. After all, you’ve already realized you need the logic in more than one place.
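
To make the boolean-parameter point concrete, here is a minimal sketch of what the split might look like. The UserUpdater class, the list-of-names parameter, and the status strings are all hypothetical; only the two function names come from the text above.

```php
<?php
// Hypothetical class; the statuses are invented for illustration.
class UserUpdater
{
    // Handles only the "nice" case, so no flag is needed.
    public function updateNiceUsers(array $userNames)
    {
        return array_fill_keys($userNames, 'thanked');
    }

    // Handles only the "annoying" case.
    public function updateAnnoyingUsers(array $userNames)
    {
        return array_fill_keys($userNames, 'warned');
    }
}

$updater = new UserUpdater();
$thanked = $updater->updateNiceUsers(['ann', 'raj']);
$warned = $updater->updateAnnoyingUsers(['bob']);
```

The caller now states its intent through the function name instead of passing true or false and hoping the reader remembers which is which.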

It takes a little while to get the hang of writing in small, single-purpose functions, but once you do, your code becomes dramatically easier to read and maintain.
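
Deep nesting can often be flattened the same way. The sketch below shows one possible approach, assuming a made-up list of orders with 'paid', 'items', and 'quantity' fields: each level of nesting moves into its own small function.

```php
<?php
// Nested version: every condition pushes the real work further to the right.
function totalPaidItemsNested(array $orders)
{
    $total = 0;
    if (!empty($orders)) {
        foreach ($orders as $order) {
            if ($order['paid']) {
                foreach ($order['items'] as $item) {
                    if ($item['quantity'] > 0) {
                        $total += $item['quantity'];
                    }
                }
            }
        }
    }
    return $total;
}

// Flattened version: the top-level function reads as a summary.
function totalPaidItems(array $orders)
{
    $total = 0;
    foreach ($orders as $order) {
        $total += countItemsIfPaid($order);
    }
    return $total;
}

// Unpaid orders contribute nothing.
function countItemsIfPaid(array $order)
{
    if (!$order['paid']) {
        return 0;
    }
    return countItems($order['items']);
}

// Adds up the positive quantities in one order.
function countItems(array $items)
{
    $count = 0;
    foreach ($items as $item) {
        $count += max(0, $item['quantity']);
    }
    return $count;
}

// Hypothetical sample data.
$orders = [
    ['paid' => true, 'items' => [['quantity' => 2], ['quantity' => 0], ['quantity' => 3]]],
    ['paid' => false, 'items' => [['quantity' => 10]]],
];
$nestedTotal = totalPaidItemsNested($orders);
$flatTotal = totalPaidItems($orders);
```

Both versions produce the same total, but no function in the second version is nested more than one level deep.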

Meaningful Names

“Choosing good names takes time but saves more than it takes. So take care with your names and change them when you find better ones.” - Robert Martin in Clean Code

In the code below, see if you can tell what the function does and what values should be passed in as parameters:

public function tci($p, $r, $n, $t) 
{
    $a = $p * (1 + ($r/$n)) ** ($n*$t);
    $ci = $a - $p;
    return $ci;
}

This function is short, does only one thing, and is correctly formatted. But aside from recognizing that it does some math calculation, it’s hard to understand its purpose.

The problem with this function is that the developer chose variable names that work for the computer, instead of selecting names that would communicate with other people. If we are going to write code for other programmers to read, then we must choose names that make our intentions obvious.

A name should clearly describe whatever it is naming. If it’s a function, then the name should be a verb that tells the programmer what the function does. If it’s a class, make the name a noun that identifies the object. If it’s a variable, then the name should make clear what value that variable holds.

When selecting names, give enough details in the name that the value or function becomes obvious. If, after naming your variable, you feel like you need a comment to explain it, then you’ve picked a bad name. Keep trying until you’ve gotten a name that describes your intent without needing a comment.

And don’t be afraid to use long names. It’s much better to know that a variable holds $timeInSeconds than just knowing that it has some form of $time.

Now, let’s look at the same function again, but this time use variable names that have some meaning.

public function getCompoundInterest(float $principal, float $rate, int $compoundingPeriod, $time)
{
    $accruedInterest = $principal * (1 + ($rate / $compoundingPeriod)) ** ($compoundingPeriod * $time);
    $compoundInterest = $accruedInterest - $principal;
    return $compoundInterest;
}

This function does exactly the same calculation as the first. It took a little longer to write, takes up a bit more screen space, but is now completely understandable. If our goal is to communicate clearly with other developers, then that little bit of extra time and space is well worth the investment.
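
To try the calculation, here is a minimal standalone sketch. It uses PHP’s ** exponentiation operator (in PHP, ^ is bitwise XOR), declares the function outside a class so it can be called directly, and assumes a float type for $time. The sample figures are made up.

```php
<?php
// Standalone version of the calculation, for experimenting.
function getCompoundInterest(float $principal, float $rate, int $compoundingPeriod, float $time): float
{
    $accruedInterest = $principal * (1 + ($rate / $compoundingPeriod)) ** ($compoundingPeriod * $time);
    return $accruedInterest - $principal;
}

// 1,000 at 10% a year, compounded once a year, for 2 years:
// 1000 * 1.1 * 1.1 = 1210, so the interest is 210.
$interest = getCompoundInterest(1000.0, 0.10, 1, 2.0);
echo round($interest, 2), PHP_EOL;
```

With descriptive parameter names, the call site can be checked against the formula by hand, which would be much harder with tci($p, $r, $n, $t).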

For example, I was recently trying to decipher a long function that was doing something with a $doc, a $file, and a $fileDoc. I didn’t understand what the method did because I couldn’t distinguish between these variables. However, once I renamed the variables it became clear that the $originalFile was being replaced by an $updatedFile under certain circumstances, and then registered in the database as a $documentEntry.

The names you choose determine how easy it is for the next programmer to maintain and update your code. Be kind, and make your names as descriptive and obvious as possible.

Comments

“… the only truly good comment is the comment you found a way not to write.” - Robert Martin in Clean Code

If our primary goal as professional programmers is to communicate clearly, then it may seem like we should have a lot of comments in our code. After all, aren’t comments the epitome of developer-to-developer communication?

Not exactly. Comments are failures. Comments are lies. Comments are distractions littering our clean code. Comments are superfluous and vapid. Comments are rarely helpful.

How many times have you seen, or written, something like the sample below?

/**
* @param string $emailAddress
* @param int $userId
*/
public function notifyUser(User $user) {

According to the comments, this function takes an $emailAddress and a $userId as parameters. But, according to the code, it actually needs an instance of the User class.

Which one is true? The code is true. And if this comment cannot be trusted, then which comments can be? Comments may lie, but the code always tells the truth.

And that’s the idea behind clean code. Your code should communicate so clearly that comments are unnecessary. The functions should be short enough to read easily, the functions and variables named clearly, and the flow should be obvious. Invest in making your code readable instead of relying on comments to cover for your failings.

And comments often do try to cover our failures. We failed to give the variable a descriptive name, so we wrote a comment explaining what the name should have made obvious. We wrote a function that was too long to digest, so we added comments showing each place where we should have broken the code into a smaller function. We were too scared to delete unneeded code so we commented it out.
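
As a small illustration of the first two failures, compare the following before and after. The variable names, timestamps, and the sessionHasExpired() function are hypothetical examples, not code from a real project.

```php
<?php
// Before: vague names need comments to explain them.
$t = 1800; // session timeout in seconds
$l = 100;  // timestamp when the user was last seen
$n = 2000; // current timestamp
// check whether the session has expired
$expiredBefore = ($n - $l) > $t;

// After: the names and a small function carry the same information.
function sessionHasExpired(int $lastSeenTimestamp, int $currentTimestamp, int $timeoutInSeconds): bool
{
    return ($currentTimestamp - $lastSeenTimestamp) > $timeoutInSeconds;
}

$sessionTimeoutInSeconds = 1800;
$expiredAfter = sessionHasExpired(100, 2000, $sessionTimeoutInSeconds);
```

The second version gives the same answer, and there is nothing left for a comment to explain.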

And then there are comments like this one:

/**
* Calculates Total Tax
*/
public function calculateTotalTax($price, $taxRate)

There is no reason to say in a comment what you have just as clearly expressed through the function name. But for some reason programmers still put comments like this in their code, not as annotations in a doc block, but as worthless explanatory notes.

If your code is littered with meaningless, outdated, excuse-laden comments, what benefit are they?

Are there ever situations where a comment is a good idea? Well, yes, there are some useful comments. For example, I recently went to update the permissions in a bit of code and discovered a comment pointing out why certain users were prohibited from performing this action. When I mentioned that to the project owner she said, “Oh, yes, good catch!”

So, comments that capture information which cannot be expressed in the code can be useful. But these comments should be rare.
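
A comment of that kind might look something like the sketch below. The function, the role names, and the business rule in the comment are all invented for illustration.

```php
<?php
// Hypothetical permission check.
function canApproveRefund(string $role): bool
{
    // Contractors are excluded on purpose: in this made-up scenario, company
    // policy requires that refunds be approved by a permanent employee.
    if ($role === 'contractor') {
        return false;
    }

    return in_array($role, ['manager', 'support_lead'], true);
}

$managerCanApprove = canApproveRefund('manager');
$contractorCanApprove = canApproveRefund('contractor');
```

The code already shows what is checked. The comment records why, which is the part a future maintainer could not recover from the code alone.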

If you must include a comment, be sure it adds value to your code. Otherwise, make your code expressive enough that comments are not necessary.

Keeping It Clean

“We’ve all said we’d go back and clean it up later. Of course, in those days we didn’t know LeBlanc’s law: Later equals never.” - Robert Martin in Clean Code

Finally, make a point of committing only clean code.

We don’t always write clean code. Sometimes we write long stream-of-consciousness functions so we can work out what needs to happen. Sometimes we are in a rush to fix a bug, deploy a feature, or placate a frantic manager beating a drum until we get the code updated. In each case we slap out an ugly bit of code and think, “I’ll clean that up later.”

But later never comes. The drum never stops beating, and folks don’t stop expecting new code fast.

It’s okay to write a draft version of the code to figure out what needs to happen, but don’t commit your draft to the project.

When we put dirty code into our projects, the dirt stays. We aren’t going to clean it up any more than we’re going to fix the creaky floor board that’s been squeaking for the past ten years. The problems don’t get fixed, they just accumulate.

So don’t commit any code to your project until it’s clean. Take the time to leave your code better than you found it.

Conclusion

“Learning to write clean code is hard work. It requires more than just the knowledge of principles and patterns.” - Robert Martin in Clean Code

Complex code doesn’t have to be hard to work in.

When it is, it’s because our focus has been in the wrong place. We have thought too much about writing commands for the computer and not enough about writing for the people who will read the code.

If we organize our code into small, single-purpose functions, follow standard code formats, use meaningful names, and try to make our code so expressive that it doesn’t need comments, then we have taken a huge step toward making life easier for ourselves and our fellow programmers.

There’s a lot more advice about writing good code in Robert Martin’s Clean Code, and I would highly recommend it. If you take the time to learn and practice the advice from these veteran programmers, you too may discover the life-changing effects of tidying your code.