Code Hazards and How to Avoid Them

When you work on a large, constantly changing codebase with lots of other engineers, there’s a lot you don’t know about the person who will next change your code. They could be a new hire who isn’t yet familiar with the code standards. They could be an experienced programmer with a different opinion about code design. They could just be your future self, having a long day and prone to making errors. At their lowest common denominator, each of these people is a very fallible human being with a base level of programming understanding and a heap of cognitive biases- I suggest that you write your code for this person.

If you think of each class or set of functions you write as an API, you can anticipate what ways it could be misused or misunderstood and take preventative measures. This is so much quicker than it is for someone else to dig through the source files trying to understand why your function call isn’t returning what they expected. Imagine that the hazards in your code like a dereferenced pointer or a non intuitive return type are like physical hazards in the real world. If you owned public property with a big hole in the middle, you’d put up warning signs and maybe a fence to prevent people from falling in. Code has the additional challenge that people then have the authority (and maybe good reason) to come along and knock your fence down or fill the hole with sulphuric acid, but at least your fence says “people probably shouldn’t be allowed to fall in here”.

At Rare we tend to avoid using comments in the codebase, and very few files have any. This may be a matter of taste, but it definitely has the effect of encouraging you to learn to use all the tools at your disposal to make your code speak for itself. Here I’ll break down different types of “code hazard”, and how we can guard or eliminate them without relying on code comments or documentation.

Expecting particular inputs

Many C++ engineers are very much in the habit of checking the validity of any pointers before dereferencing. The worst case of failing to do so is that some well meaning soul makes a change that calls it with a pointer that is only null 1 out of 1000 times. Voila; you have a low repro crash bug, and no error log to tell you what happened without having to dig through the call stack.

It’s not just null pointers that are the danger here. Assume any sort of unusable input can end up in your functions; out of range indexes, unsupported types from the same inheritance chain or duplicate map entries that will overwrite runtime data.

Argument names

Your first line of defense here is being as clear as possible in your argument names. Names like “DistanceInCm” or “ClampedValue” can preempt some typical errors.

Compile time checks

Compile time protections to actively prevent unexpected inputs can include checks like static casts or asserts where appropriate to signal “something is wrong here”. While a warning message can be easily passed by, a compile error is a much louder noise.

Argument typing

Something as simple as typing your input arguments appropriately can both communicate your intent and prevent misuse. Using an unsigned integer type to represent an array index, for example.

If your ensure that the function can only receive valid inputs, it will force the caller to do any sanitising at the calling site. This will raise any errors in the code the engineer in question is working on right now, rather than deep in your less familiar code.

It may be worth checking the documentation of your language or library for any helpful types that can force clients to give you valid inputs, like F#’s dimension unit types or writing your own use case specific struct that can run checks on itself, like whether an input is within range.

Runtime input validity checks

At runtime, all your inputs should be checked for validity where possible, and you can judge what kind of “nope” a failure here warrants, from a graceful failure of “oh well”, though various log verbosities, all the way up to the crash of “oh heavens no”.

Returning unexpected outputs

If you call a search function on a map and it returns a null pointer, does that mean that it didn’t find an entry for the key you gave it, or did it mean that the process failed? If you call a size function with an integer return type on an uninitialised array will it return -1, or will it simply throw an error? Is it safe to assume the size is accurate and use it to access an entry on the array directly? When the possible outputs and safe usage of a returned variable aren’t explicit, it can be an opportunity for serious errors to sneak into the codebase.

Lifetime management

If a supermarket sold you a box of rotten eggs and some naive soul went ahead and ate one, it’d be considered to be the supermarket’s fault they got ill. Avoid giving the consumers of your API rotten eggs, instead of assuming they’re going to know not to eat them. While some situations can necessitate the use of types of ambiguous lifetime like raw pointers, returning references or lifetime managed structures like smart pointers can ensure that the information that’s requested will always be there.

Return types

The return type of your functions is often an overlooked means of communicating with customers of your API. In the size function case, an int return may say “I am able to return results that don’t make sense”, unsigned int or size_t return says “Don’t call this if I’m not going to be able to give you a valid response”, while a templated type like Unreal’s TOptional<int> would say “This function may not be able to return a valid output as part of normal operation, you should account for this in the calling code”.

Using custom structs

Typedefs or custom local structs are also handy here to be able to return multiple, named outputs from complex operations where it may be ambiguous what the return means. A pointer may be able to separately show validity and a value, but a struct containing the function’s result and an IsValid() function is a little more helpful.

Performing unexpected functions

At a basic level, having a descriptive function name that describes of everything the function does is already a step towards preventing misuse. Using const functions wherever it’s possible is also a good habit to get into, because it makes it more obvious what functions are liable to cause state changes.

It’s a good idea to think about the definitions of the words you’re using and how consistent they are with the codebase or with other common libraries like STL. If the concept is vague, do verbs like “Register” or “Reset” mean the same thing each time they’re used? Getters are a frequent offender here- the common definition of “Get” usually represents a const function returning only a copy of a member variable or data entry and performing no other logic. Deviating from this expectation can set up future code to have unintended side effects.

Requiring unspecified prerequisites

In most languages, there’s no way to indicate that one function should be called before another, mostly because this situation is a fairly avoidable fallacy. By having your functions be reliant on being called in a particular order, a pattern known as temporal coupling, you’re setting consumers of your API up for bugs caused by missing dependencies or unordered processes.

Injecting dependencies on construction

By injecting dependencies on construction, you’re making sure that your objects are always in a valid state. Initialiser functions are a common pattern, but you’re then forcing anyone using your class to remember to initialise first, which is a usually unnecessary opportunity for error.

Checking member variables before use

In Unreal engine, Initialise() functions to inject dependencies are sometimes a necessary evil, due to the way UObject construction works. The simplest safety net in these cases where temporal coupling is unavoidable is giving member variables a sensible default that makes it obvious when it’s uninitialised. Check theses variables before they’re used and on failure throw a well worded error message that tells the client “This variable is invalid because you didn’t initalise first”.

Designing by contract

Designing your API for specific use cases rather than internal functionality is a particular point of evangelism for me. Avoiding this sort of temporal coupling is another great reason to do this. I wrote a system recently that set player background music, usually triggered by players entering and exiting different concentric regions. Originally I had a function that set the background music state directly via a SetPlayerMusicState function, which made it increasingly hard to control as the design for use cases became more elaborate. This had no safety regarding players exiting regions they weren’t already in, or entering inner regions without already being in the outer region- on respawning after death for example. Making this function private and having it only operated by use case functions like PlayerEnteredRegion or DisableMusicForPlayer made it a lot harder to make mistakes in the calling code.

It’s easy to start thinking of foolproofing as a defensive measure, and the future users of your code as antagonistic in some way. Of course, a truly antagonistic engineer would be able to do whatever they could get past review. If someone’s out to deliberately hurt your system, designing your code defensively would be the least of your worries!

Foolproofing is rather a preventative method of reducing inefficiencies in development further down the line by empathising with the next person to work in your system, even if it’s your future self. The methods I’ve outlined are just some of the ways you can anticipate and highlight future errors, or use all the tools available to you to your code communicate. Naming variables well is just the tip of the iceberg for the ways you can make your code helpful for your whole team.

Many thanks to my lovely colleagues Topher Winward, Chantelle Porritt, Sarah Noonan and Dave Allmond for proofreading and suggestions!

MJ Kuchenbrod August 22, 2018 / 11:48 pm

Awesome article (that I just stumbled upon by reading your latest one). Mind if I ask an opinion of you? I find it REALLY interesting that you guys don’t comment your code. Very interesting concept to think about. So do you find it makes things harder? More bugs? Maybe in terms of deliverables (timelines), or having to always take extra time to “figure out the code”? To the points you make, you might come upon three great developers that all think slightly different on how to accomplish a task. So their code could be widely different, which adds to the difficulty of the next person touching it. My team ran into that once with basically a math major writing an algorithm that had our head spinning (he didn’t comment anything)! But it was powerful and worked for what we needed. Appreciate the transparency on that, as it really makes me think about documenting processes or having the code “speak” what it’s doing in a way all can understand. Love the comment about thinking about your code like APIs as well. Very cool analogy.

LikeLike

- Jessica Baker August 23, 2018 / 1:55 pm
  
  Of course there’s exceptions to every rule, and much of it comes down to familiarity, but I find it much harder to follow code with comments. There are definitely times when they are needed, but I find it easier to spot the “flow” and shape of the code when it’s not broken up by comments. I think by not falling back on them, you’re forced to work on making the actual code clearer, a pretty good bug repellent, especially given that comments can become obsolete or inaccurate much more easily than code. I tweeted this thread alongside the article about alternatives to comments that you might find interesting : https://twitter.com/JessBoneBreaker/status/996756330844127238
  
  LikeLike

Jessica Baker

Software Engineer