August 2021, Epsilon Clue

Hacking Things I've Learned

Readable Code: Variable Overload

It’s well known that reading code is a lot harder than writing it. But I recently got some insight as to why that is.

I was debugging someone else’s sh script. This one seemed harder to read than most. There was a section that involved figuring out a bunch of dates associated with a particular object. The author was careful to show their work, and not just have a bunch of opaque calculations in the code. I won’t quote their code, but imagine something like:

NOW=$(date +%s)
THING_CREATION_EPOCH=$(<get $THING creation time, in Unix time format>)
THING_AGE_EPOCH=$(( $NOW - $THING_CREATION_EPOCH ))
THING_AGE_DAYS=$(( $THING_AGE_EPOCH / 86400 ))

Now imagine this for three or four aspects of $THING, like last modification time, which other elements use $THING, things like that. The precise details don’t matter.

Each variable makes sense. But there are four of them, just for the thing’s age since creation. And if you’re not as intimately familiar with the script as the person who just wrote it, you have to keep track of all four in your head, and that gets very difficult very quickly.

Part of the problem is that it’s unclear which variables will be needed further down (or even above, in functions that use global variables), so you have to hold on to all of them; you can’t mentally let them go. Compare this to something like:

<Initialize some stuff>
for i in $LIST_OF_THINGS; do
    ProcessStuff "$i"
done
<Finalize some stuff>

Here, you can be reasonably sure that $i won’t be used outside the loop. Once you get past the closing done, you can let it go. Yes, the variable still exists, and it’s not illegal to use $i in the finalization code, but a well-meaning author won’t do that.

This leads to the lesson: limit the number of variables that are live in any given chunk of code. You don’t want your reader to have to remember some variable from five pages back. To do this, try to break your code down into independent modules. These can be functions, classes, even just paragraph-sized chunks in some function. Ideally, you want your reader to be able to close the chapter, forget most of the details, and move on to the next bit.
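A minimal sketch of what that can look like in sh, reusing the age calculation from earlier. This is not the script I was debugging: the function name age_days and its optional second argument are made up for illustration, and the timestamp is just an example value. The trick is that the function body is wrapped in parentheses rather than braces, so it runs in a subshell and its intermediate variables disappear when it returns.

```sh
#!/bin/sh
# Hypothetical helper: print the age, in whole days, of something created
# at the Unix timestamp given in $1. The optional $2 overrides "now",
# which is mostly useful for testing.
# The body is in ( ) instead of { }, so it runs in a subshell and
# created, now, and age_secs never leak into the caller.
age_days() (
    created=$1
    now=${2:-$(date +%s)}
    age_secs=$(( now - created ))
    echo $(( age_secs / 86400 ))
)

# Example value only: 2021-08-01 00:00:00 UTC.
THING_CREATION_EPOCH=1627776000

# The one variable the rest of the script needs to know about.
THING_AGE_DAYS=$(age_days "$THING_CREATION_EPOCH")
```

After the call, THING_AGE_DAYS is the only new name the reader has to carry forward; the intermediate steps are still spelled out, but they stay inside the function. In shells that support it, declaring variables with local inside a regular function achieves much the same thing.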

In the same vein, this illustrates one reason global variables are frowned upon: they can potentially be accessed from anywhere in the code, which means that they never really disappear and can’t be completely ignored.
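Plain sh makes this easy to see, because any variable assigned inside an ordinary function is global unless you take steps to prevent it. A small sketch, with a hypothetical set_age function and made-up values:

```sh
#!/bin/sh
# Hypothetical function: computes an age in seconds from two timestamps.
# The body uses { } and no 'local', so age_secs is a global variable.
set_age() {
    age_secs=$(( $2 - $1 ))
}

age_secs=42          # the caller's own variable, which happens to share the name
set_age 0 86400
echo "$age_secs"     # prints 86400: the function overwrote the caller's value
```

Nothing at the call site hints that age_secs is about to change, which is exactly why the reader can never safely forget about it.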

Andrew Arensburger

Aug, Thu, 2021
