Unit Tests are Parity Bits for Code

I was recently asked by a skilled programmer:

I have never understood the rationale for unit tests. It seems like yet more code that can have bugs in it, so you would need unit tests of your unit tests.

I think a good analogy for understanding unit tests is to think of them as parity bits for code. In data communication, parity bits are added to a message to detect transmission errors caused by noise.

Just as with unit tests, you might ask: “Doesn’t adding a parity bit just increase the chance of error? If I’m sending an 8-bit message, won’t an extra bit increase the chance of error by ~12%?” It is true that a parity bit slightly increases the message’s “surface area” for raw errors, but it automatically rules out half of all possible 9-bit messages as incorrect. In practical terms, this means the parity bit can detect all 1-bit errors, and more generally any error that flips an odd number of bits. That extra parity bit easily pays for itself many times over by detecting damaged messages.
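To make the arithmetic concrete, here is a minimal sketch of an even-parity scheme for 8-bit messages. The helper names (countOnes, encode, isValid) and the choice of even parity are my own assumptions for illustration, not part of any particular protocol.

```javascript
// Hypothetical helpers for illustration; even parity is assumed.

// Count the set bits in a non-negative integer.
function countOnes(n) {
  let ones = 0;
  while (n > 0) {
    ones += n & 1;
    n >>>= 1;
  }
  return ones;
}

// Append an even-parity bit to an 8-bit message, producing a 9-bit codeword.
function encode(byte) {
  const parityBit = countOnes(byte) % 2; // 1 if the byte has an odd number of ones
  return (byte << 1) | parityBit;
}

// A 9-bit codeword is valid when its total number of ones is even.
function isValid(codeword) {
  return countOnes(codeword) % 2 === 0;
}

// Count how many of the 512 possible 9-bit words pass the check.
let validWords = 0;
for (let w = 0; w < 512; w++) {
  if (isValid(w)) validWords++;
}

const sent = encode(0b01001000); // the letter "H"

// Flip each of the 9 bits in turn and count how many flips are caught.
let detectedSingle = 0;
for (let bit = 0; bit < 9; bit++) {
  if (!isValid(sent ^ (1 << bit))) detectedSingle++;
}

// Flip two bits at once: the parity is restored, so this error slips through.
const doubleError = sent ^ 0b000000011;

console.log(validWords);           // 256
console.log(detectedSingle);       // 9
console.log(isValid(doubleError)); // true (undetected)
```

Half of the 9-bit words are rejected outright, every single-bit flip is caught, and a two-bit flip goes unnoticed. That last case is the price of using only one check bit.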

Unit tests serve the same function as parity bits. Unit tests are not designed to prove the code correct (extremely difficult), nor to exhaustively test the code for all inputs (combinatorially impractical). A unit test is simply there to detect the most common “1-bit errors” in code caused by “noise.” Of course, unit tests themselves can contain errors, but just like the parity bit, they can detect many times more errors than they might introduce.

So what is the source of “noise” in code? In data communication, noise is caused by physical things like electrical interference, cosmic rays, patchy fog, etc. that produce random errors in messages.

In code, noise is caused by programmers mutating code. Of course programmers are not random mutators, and ideally most changes are benign and beneficial. However, the complexity of computing systems almost guarantees that some changes will cause unintentional, nearly random effects, i.e. code noise.

Code noise can come from changes outside or inside the code being tested. External noise includes changes to library versions, compiler options, platform, or architecture. Internal noise takes the form of ill-conceived optimizations, half-baked refactorings, out-of-control search-and-replaces, etc.

In this view of the world, when you check in a piece of code, it is really an act of communication. You are transmitting the code to a programmer in the future (often yourself 6 months from now). During transmission, the code will face many mutations, some of them damaging code noise. To help ensure that the code’s message arrives un-garbled and usable, a unit test is added to automatically detect simple “1-bit” errors.

Parity as a unit test:

function message() { return "Hello World!"; }

function testMessage() { assertEqual( parity(message()), EVEN ); }
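The snippet above leaves parity, EVEN, and assertEqual undefined. As a rough, runnable sketch, assume parity() counts the set bits across the string's UTF-16 code units and that assertEqual simply throws on a mismatch. Both are hypothetical stand-ins. Under that particular definition "Hello World!" has 45 set bits, so this sketch records ODD as the expected value; a different parity function could well come out EVEN.

```javascript
// Hypothetical parity(): counts set bits across the string's UTF-16 code units.
const EVEN = "EVEN";
const ODD = "ODD";

function parity(text) {
  let ones = 0;
  for (let i = 0; i < text.length; i++) {
    let code = text.charCodeAt(i);
    while (code > 0) {
      ones += code & 1;
      code >>>= 1;
    }
  }
  return ones % 2 === 0 ? EVEN : ODD;
}

function message() { return "Hello World!"; }

// Minimal stand-in for a test framework's assertion (an assumption).
function assertEqual(actual, expected) {
  if (actual !== expected) {
    throw new Error("expected " + expected + ", got " + actual);
  }
}

// The parity recorded at check-in time; ODD under the definition above.
function testMessage() { assertEqual(parity(message()), ODD); }

testMessage(); // passes silently

// A "1-bit error": "H" (0x48) becomes "I" (0x49), which flips the parity.
const garbled = "Iello World!";
console.log(parity(garbled)); // EVEN
```

Like a real parity bit, this check catches any mutation that flips an odd number of bits and misses the rest. Real unit tests assert on behavior rather than bit counts, but the trade-off is the same.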

Want more about parity bits and unit testing? Stay tuned for the follow-up: “Unit Testing is for Farmers”

Wanted: Comment Redaction Plugin

Don’t trust code comments over 30 hours old. Inevitably the comment isn’t updated after code changes, resulting in confusion and bugs. In fact, I wish every IDE (Xcode, Eclipse, etc.) had a feature/plugin that would redact comments whenever code is changed, in order to force comment revision.

Here is how it would work. Given some commented code from a game:

 /* Add a fixed bonus */
 score += 100;

Any change to the code would immediately redact the associated comment with X’s:

 /* XXX X XXXXX XXXXX */
 score += time_left * 2;

Alternatively, it could reduce the comment to a Mad Libs-style fill-in-the-blank exercise:

 /* _verb_ a _adjective_ _noun_ */
 score += time_left * 2;

Either way, the coder would be forced to rewrite the comment to match the new code. This would encourage short comments, or better yet, no comments.
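The X-style redaction is simple enough to sketch. The function below is a hypothetical helper, not an API from Xcode, Eclipse, or any other IDE; it assumes C-style /* ... */ comments and leaves the detection of "the code changed" to the editor.

```javascript
// Hypothetical helper: replace every letter or digit inside a /* ... */
// comment with "X", leaving spacing, punctuation, and code untouched.
function redactComment(line) {
  return line.replace(/\/\*([\s\S]*?)\*\//g, (match, body) =>
    "/*" + body.replace(/[A-Za-z0-9]/g, "X") + "*/"
  );
}

console.log(redactComment(" /* Add a fixed bonus */"));
// prints:  /* XXX X XXXXX XXXXX */

console.log(redactComment(" score += time_left * 2;"));
// prints:  score += time_left * 2;
```

A real plugin would also need to decide which comment belongs to which changed line, which is the hard part and is not attempted here.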