Code Quality II: Metrics
Key indicators to measure the health state of your codebase
Previously we covered the practices we can adopt to make our working process more robust and trustworthy. This time we are going to discuss what we can keep track of and actually measure in our codebase to gain a more meaningful understanding of its health.
When a developer modifies the same code over and over again in a short period of time, which is usually called code churn, it might be a sign that something is not going well in the workflow. Refactoring is totally fine, recommended and sometimes inevitable. But having to rewrite the same piece of code time and again might indicate unclear requirements, poor communication, unqualified developers or even under-challenged ones.
Also, files with a high level of change are more prone to be sources of bugs, especially if the quality of their code is rather poor. So keep an eye on churn, and if you want to know more, check out this great post about it.
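To make churn a bit more tangible, here is a minimal sketch of how it could be measured. It assumes churn is approximated as lines added plus lines deleted per file, and that the input is the text printed by `git log --numstat --format=` (optionally limited with `--since`). The function name `churn_by_file` and the sample data are made up for illustration.

```python
from collections import Counter


def churn_by_file(numstat_text: str) -> Counter:
    """Sum added plus deleted lines per file from `git log --numstat` output."""
    churn = Counter()
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separator lines or anything unexpected
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-<TAB>-<TAB>path"
        churn[path] += int(added) + int(deleted)
    return churn


# Hypothetical sample of what `git log --numstat --format=` prints.
sample = (
    "10\t2\tsrc/cart.py\n"
    "\n"
    "3\t3\tsrc/cart.py\n"
    "1\t0\tREADME.md\n"
    "-\t-\tlogo.png\n"
)

for path, lines_changed in churn_by_file(sample).most_common(2):
    print(path, lines_changed)
```

The files at the top of that ranking are the ones worth a closer look. Note that this sketch does not follow renames, so a renamed file is counted under each of its names.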
* Images inspired by this great post
Reading code should be as simple as reading a recipe to bake a cake. Both are a set of statements to be executed in a certain manner, taking into account some constraints. Somewhere along the line in a cake recipe you might see things like “beat the eggs until…” or “if you have a slow oven cook it for 30 more minutes”. These are flow control structures, what we know in code as “loops” and “ifs”.
What if the recipe read something like “beat the eggs for 5 minutes but before that, turn on the oven and if you have a small oven and a big mold or if your mold is medium size but still fits in the fridge, and the dough is ready, then put the dough in the mold, if not, use two molds unless one has the wrong shape”? Hmmm… that is quite confusing. The effort needed to read and understand it is pretty high, which can lead you to make many mistakes and potentially have some serious accidents. The same goes for your code: it should be written to be ordered and meaningful.
Cognitive complexity measures how difficult a unit of code is to understand intuitively. Whatever makes you stop and read again represents cognitive complexity and should be minimized. Some of the elements that add complexity are loops, nested structures and recursive functions.
This does not mean that loops, nested structures or recursive functions are forbidden; that would make no sense. But we should keep in mind that, used the wrong way, they make our code unclear.
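As an illustration of how nesting alone can hurt readability, here is a small sketch with two functions that behave the same. The order structure, the country code and the prices are all invented for the example. The first version makes you keep four open conditions in your head, while the second one deals with each case and moves on.

```python
def shipping_cost_nested(order):
    # Every condition opens a new level the reader has to remember.
    if order is not None:
        if order["items"]:
            if order["country"] == "ES":
                if order["total"] >= 50:
                    return 0.0
                else:
                    return 4.99
            else:
                return 14.99
        else:
            raise ValueError("empty order")
    else:
        raise ValueError("no order")


def shipping_cost_flat(order):
    # Guard clauses handle the special cases first and keep the flow linear.
    if order is None:
        raise ValueError("no order")
    if not order["items"]:
        raise ValueError("empty order")
    if order["country"] != "ES":
        return 14.99
    return 0.0 if order["total"] >= 50 else 4.99
```

Both versions make the same number of decisions, so what changed is not how much the code does but how much the reader has to remember at each step.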
Cyclomatic complexity is defined as the number of linearly independent paths through the code's execution flow or, put more loosely, the number of decisions a given block of code needs to make, plus one. There is a well-established mathematical model behind it, developed in the 1970s by Thomas J. McCabe, which we are not going to cover. Rather, we are going to give two reasons why it is important to keep this number down.
First, a high cyclomatic complexity makes your code difficult to follow because of all the different flows you need to understand, which in turn raises its cognitive complexity. And we have already seen why that is a bad thing.
Secondly, it determines how difficult your code will be to test, since each linearly independent path needs to be tested. The higher the cyclomatic complexity, the bigger the test suite needs to be to cover the same code. Moreover, modifying that piece of code can potentially break a lot of tests, making their maintenance harder.
The cyclomatic complexity index on its own won’t give you enough information to determine the health of your codebase, but by neglecting it you might end up with entangled code and a large set of tests that is difficult to maintain.
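To get a feeling for where the number comes from, here is a rough sketch that approximates cyclomatic complexity for a piece of Python source by counting decision points and adding one. Which syntax nodes count as a decision is a simplifying assumption of this example, and the whole snippet is analyzed as a single unit rather than per function, so dedicated tools may report different values.

```python
import ast

# Node types treated as one decision point each (a simplifying assumption).
_DECISIONS = (ast.If, ast.IfExp, ast.For, ast.While, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe's metric as: decision points + 1."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISIONS):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds one extra branch per additional operand.
            complexity += len(node.values) - 1
    return complexity


example = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in (2, 3):
        if n % d == 0 and n != d:
            return "multiple"
    return "other"
'''

# Two ifs, one for loop and one `and`, plus one: 5
print(cyclomatic_complexity(example))
```

A result of 5 suggests that about five test cases are needed to exercise every independent path of `classify`, which is the link between this number and the size of the test suite.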
There are a lot of things you can measure in your codebase, and none of these metrics comes without criticism. Cyclomatic complexity is an old and well-established concept, but it often fails to give a proper understanding of what is going on beyond the scope of a method. Cognitive complexity lacks a universally accepted mathematical model to back it up, so it is more of a soft concept. Nonetheless, being aware of these topics helps in writing and maintaining a cleaner codebase.
Many more metrics are worth investigating, and which ones matter may very much depend on your project and even on the technology stack you are working with. Measure whatever is most important to you.