Revelry

The Code Comment Revolution Will Not Be Streamed

I’m a little preoccupied with code comments. I find the concepts, strong opinions, and philosophical underpinnings fascinating. Over my career I have had many arguments (let’s call them conversations) with other developers about code comments, but until recently those discussions always hit the same dead end: too much overhead for a nebulous benefit. Now, however, AI gives us a new opportunity to spark the code comment debate and view code comments not as cruft or mess, but as a DX enhancement with first-class support.

We can define a code comment in a couple of ways. The most basic is to say that code comments are natural language text, embedded in a program file, that is ignored during compilation and/or execution.

// This is a comment in natural language, it will be ignored during execution
console.log("This is also natural language, but will be logged to the console")

Most programming languages (except the esoteric or specialized ones) have the ability to comment code. Some allow inline comments, some don’t. Some use # and some use //. There are small differences, but the ability to insert natural language text that is ignored during compilation or execution is nearly universal. Even though most languages have them and they do not affect code execution, code comments are a bit of a taboo in my experience. I’ve commonly seen PRs marked up with “unnecessary comment” feedback. I bristle at that type of feedback because “unnecessary” is relative.


Here is where we come to the debate surrounding code comments: when and why a developer should write them. There’s a spectrum of argument, with one end being “you should never ever write code comments” and the other, far less popular end being “you should comment every single line of code”. Let’s examine both perspectives, starting with the extreme affirmative position.

Why you should comment every single line of code

In a talk I heard years ago, Andy Harris said something that shook my young(ish) developer mind: “Code is there to explain the comments to the computer”. Unpacking that a little: programs are logical instructions that we humans think of in natural language and then translate into something the computer can execute. So entire programs can be written in plain English, because programs are just instructions. For a computer to execute them, they need to be translated into something machines can understand, which means our job as programmers is to take natural language and translate it.

With that in mind, doesn’t it make sense to write the full program in English first, and translate it once we’ve figured out what we actually want to do? This approach is called pseudo coding: writing the algorithm in natural language, and then translating it into code.

// declare a function which takes an array to search through, and an id to search by 
    // Loop over array, start at the front, stop at the end, increment by 1
        // if the id of the array at the current index matches our target
            // break out of the loop and return the matching element
    // no match found, return null object

Above: Pseudo Code.

Here is an example of pseudo code intermingled with the implementation.

// declare a function which takes an array to search through, and an id to search by 
function findById(array, id) {
    // Loop over array, start at the front, stop at the end, increment by 1
    for (let i = 0; i < array.length; i++) {
        // if the id of the array at the current index matches our target
        if (array[i].id === id) {
            // break out of the loop and return the matching element
            return array[i];
        }
    }
    // no match found, return null object
    return null;
}

The benefits of something like this are essentially the same as those of something like TypeScript (or at least they could be). We can compare the algorithm that was implemented (the code) against the one we planned (the comments) and look for discrepancies. This can expose bugs or needless complexity, because humans process things in natural language, not machine code. Using pseudo code doesn’t require keeping the comments that were used to figure out an algorithm, but keeping them preserves the trail of intention: anyone can come along later and read what the plan was, and maybe why.

There’s also the question of readability that comes up in programming. “Is this readable?” is hard to answer in general, but comments can act as a readability smell test. If we’re having trouble writing a comment that explains our solution and how it works, then maybe the program is more complicated than it needs to be. Whether I’m writing the code first or the comments first (dare I say, Comment Driven Development™®©), there are benefits in giving the logic a second pass and reading it another way.

Comments can provide a snapshot of the business requirements, intention, or thought process behind a given area of code. This enhances readability because now you have logic to compare the code against. Code may seem obviously wrong if you’re not sure what logic it was meant to implement.
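
As a small illustration of comparing intent against implementation, here is a sketch built around a made-up business rule. The function names, the threshold, and the rule itself are hypothetical and not taken from any real codebase; the only point is that the comment gives a reviewer something to check the code against.

```javascript
// Hypothetical business rule, invented for this example:
// orders of $50 or more qualify for free shipping.
function qualifiesForFreeShipping(orderTotal) {
  // The rule above says "50 or more", but this comparison excludes
  // exactly 50. Without the stated intent, `>` looks like a valid choice.
  return orderTotal > 50;
}

// Corrected version: the comparison now matches the stated rule.
function qualifiesForFreeShippingFixed(orderTotal) {
  return orderTotal >= 50;
}

console.log(qualifiesForFreeShipping(50));      // false
console.log(qualifiesForFreeShippingFixed(50)); // true
```

Read on its own, neither comparison is obviously wrong. Only the recorded intent tells us which one is the bug.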

What’s more, anyone who can read can read a comment, regardless of their skill level as a developer. Less experienced developers can come along and understand what is being done (or intended) in a given program without any special knowledge of the language the code is written in.

So the benefits of commenting every single line of code are:

  1. More closely models human thought
  2. Can point out flaws in intention vs implementation
  3. Can document intention, which might otherwise be lost
  4. Assumes no programming skill level (theoretically)

Why you should never comment your code

Now let’s move to the other end of the spectrum, where you should never* comment your code. That sentence has an asterisk because I don’t think there are developers who would argue that you should never comment your code, without exception. There will always be that tricky little hack that had to happen because of a quirk of some third-party library, and it needs a good comment. That being said, the “seldom use” position is enough of an extreme for me to highlight why developers may feel this way.
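
For example, here is the kind of comment that even a comment skeptic might accept. This sketch uses a quirk of JavaScript’s built-in Array.prototype.sort as a stand-in for a third-party library quirk, and the function name is made up for illustration.

```javascript
// Hypothetical helper, named for this example only.
function sortNumerically(values) {
  // Array.prototype.sort compares elements as strings by default, so
  // [10, 9, 1].sort() yields [1, 10, 9]. The comparator below is required,
  // not decorative, so it should not be "simplified" away.
  return [...values].sort((a, b) => a - b);
}

console.log(sortNumerically([10, 9, 1])); // numeric order: 1, 9, 10
```

The code alone says what happens. The comment says why the obvious shorter version would be wrong, which is information no amount of careful naming can carry.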

A person who seldom uses code comments might say that the code should be readable enough, in syntax, flow, and variable/function naming, such that you could read it like English.

const addTwoNumbers = (number1, number2) => number1 + number2
const sumOfTwoNumbers = addTwoNumbers(2, 2)

This is perfectly readable to me. If we wanted, we could enhance this with TypeScript as well.

const addTwoNumbers = (number1: number, number2: number) => number1 + number2
const sumOfTwoNumbers = addTwoNumbers(2, 2)

This gives us some type safety without going overboard and typing every possible value in the above expression. I bring up TypeScript because one of the reasons comments get pooh-poohed is the additional overhead they require. Comments must be written, read, and maintained. Writing a comment for every line of code means you’d be writing twice as many lines per ticket. How can developers justify that when there is no clear benefit? Not to mention, COMMENTS LIE.

Comments fall out of sync all the time. A bug gets fixed or a refactor occurs, and now the comment is worse than useless: it is actively deceptive about what is happening.

We may start with this example:

// Function to determine the full shipping cost.
// NOTE: We only check the 'country' property, as shipping is free within the EU.
function calculateShipping(order) {
  if (order.country === 'USA') {
    return 15.00;
  } else if (order.country === 'Canada') {
    return 10.00;
  } else {
    // If not US or Canada, assume EU for simplicity (free shipping).
    return 0.00;
  }
}

And wind up with this:

// Function to determine the full shipping cost.
// NOTE: We only check the 'country' property, as shipping is free within the EU. <-- MISLEADING COMMENT
function calculateShipping(order) {
  if (order.isInternational === true) {
    // International orders now use a flat rate.
    return 20.00;
  } else {
    // Domestic orders (not international) are free.
    return 0.00;
  }
}

This type of thing is fairly common, and, after all, who cares? Your comments can consist exclusively of praise for yourself (You’re so smart, I love you), and they will have no effect on your code at compile time or run time. So a person who eschews comments would argue that there’s no point, and that comments can be worse than pointless if you actually try to rely on out-of-date ones to understand code.

They may even argue that needing to comment tricky code is itself a code smell. Needing to explain yourself in natural language likely means your solution should be made more straightforward or simple. I understand that idea, not to mention that natural language communication is its own kind of hard. Who wants to read a bunch of probably unclear or meandering comments about bizarre, way too complicated code?

These arguments about maintainability, deception, and noise are very reasonable and valid. In the past I’ve argued that those are all tradeoffs you also accept when using TypeScript. You have to add and maintain types, types may be lying (any), and static typing can add a lot of noise. In point of fact, TypeScript provides no runtime safety, so it is as useful in running code as comments are. It’s just a developer experience enhancement.

That’s where the argument for code comments being as useful as TypeScript really falls apart, though. TypeScript has a compiler that points out issues, and usually a language server that provides feedback directly in the editor as a program is being written. There’s no such thing for comments. No red squiggly line under a comment saying “hey, this doesn’t match the implementation”.

So the drawbacks to commenting every single line of code are:

  1. Overhead! 100% Overhead!
  2. Code becomes less readable.
  3. Comments might not reflect the reality of the code.
  4. Dubious (if any) benefit.

Up until this point, this is essentially where all conversations I’ve had about code comments have ended. Now, however, we have a new opportunity with AI. 

AI and the code comment debate

My experience working with AI to develop code has been largely positive. It’s not perfect, but neither am I, and the ability of LLMs to write code based on plain English instructions has impressed me. This has me thinking about what I described above with pseudo code.

I write the natural language, and that gets converted into machine-readable instructions. This raises an important question: if no programs had comments, would LLMs even be able to translate natural language instructions into code? I suspect not, which would mean that even if a developer finds a comment like “initialize x to 0” not particularly helpful, keeping it in with the code can help train models and improve them. Developers now have an added incentive to keep comments and code aligned, and for the first time this is possible in an easy, DX-first way.

AI could read your comments as you write code and remind you that the corresponding comment needs to be updated. It could provide sentiment analysis on the comments you’re writing and let you know if you’re being incoherent or if your solution sounds overly complex. AI could address all of the problems people have with comments.

Maybe AI and comments could prevent us from having to learn programming languages at all. Maybe we could use natural language comments as instructions embedded in a code file and compile them to working code without knowing the implementation details. There’s a wide spectrum of ways in which AI paired with comments could enhance the developer experience, ways we’ve only imagined up until now.

Since the introduction of AI agents and chatbots in code editors and terminals, we can finally have real-time, first-class support for comparing comments to the algorithm they describe and showing feedback as developers are coding and commenting. My dream of red squigglies under inaccurate or misleading code comments can finally be realized! And this will slot in nicely with AI-generated code, which is famous for commenting every line already.
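
To make the red squiggly idea concrete, here is a deliberately naive sketch that uses no AI at all. It only checks whether identifiers quoted in a comment still appear in the code the comment describes. The function name findStaleReferences and the single-quote convention are assumptions made for this example; an AI-backed tool would presumably compare meaning rather than match strings.

```javascript
// Hypothetical helper, not part of any real linter.
// Collects identifiers wrapped in single quotes inside a comment
// (e.g. 'country') and returns the ones that never appear in the code.
function findStaleReferences(comment, code) {
  const quotedNames = [...comment.matchAll(/'([A-Za-z_]\w*)'/g)].map(
    (match) => match[1]
  );
  // A name counts as present only if it appears as a whole word in the code.
  return quotedNames.filter(
    (name) => !new RegExp(`\\b${name}\\b`).test(code)
  );
}

// The comment and conditions below come from the shipping example earlier.
const note =
  "// NOTE: We only check the 'country' property, as shipping is free within the EU.";
const originalCode = "if (order.country === 'USA') { return 15.00; }";
const refactoredCode = "if (order.isInternational === true) { return 20.00; }";

console.log(findStaleReferences(note, originalCode));   // no stale names
console.log(findStaleReferences(note, refactoredCode)); // reports 'country'
```

Even this crude check flags the misleading shipping comment after the refactor. It would miss any drift that does not involve a quoted name, which is exactly the gap a model that reads both the comment and the code could fill.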

Some side benefits I can think of include enhancing the technical writing and communication skills of software developers, keeping continuously updated documentation on every line in every file, letting non-technical people read and review algorithms that have been checked by the code comment linting step of a build pipeline, and getting better comments out of future models. If developers stop fighting writing and embrace it, they can provide better data to train future models and improve their own use of AI in the process, because when we use AI, we use natural language, not code.