Just as I was trying to write a good intro to this post, Jimmy Bogard tweeted:
I've felt that frustration myself many times. I work on large software systems and often have to troubleshoot hard-to-replicate, data-specific defects given only an error message and limited access to the production environment. Turning this limited data into an actionable bug report can be very, very difficult.
This experience has shown me that there are two types of programmers: those who intentionally craft code that is easy to debug, and those who don't. Programmers who don't are, unfortunately, incredibly common and incredibly costly to an organization. Don't be that guy/gal whose code everyone hates to debug!
This post explains some coding techniques that will make your systems easier to troubleshoot and less costly to maintain. Use them; your team will love you for it!
What does "defensive programming" look like?
"Defensive Programming" refers to a collection of coding techniques that decrease maintenance costs by surfacing defects as early as possible, and by making them easy to troubleshoot. There are many articles on this topic, some arguing for and against it, and I encourage you to read them for additional insight.
Specifically, defensive programming means that you:
Write clean, simple, intent-revealing code
This is a universal requirement; I don't care if you're coding defensively, offensively or somewhere in the middle. The easiest defect to fix is the one that never occurs, and simple code is less likely to contain defects than complex code, so keep your designs as simple as possible.
(If you don't agree with this statement, stop reading and go play in traffic... your team will thank you!)
Assume inputs are tainted until proven otherwise
Most applications need data to function and many programmers make assumptions about their data, such as "this string will never be empty" or "this value will always be positive".
Unfortunately, that string can be empty in some cases, and that value will be zero at some point in time. If you don't validate your assumptions before using the data then you risk intermittent, hard-to-troubleshoot errors.
Therefore, do sanity checks on your input BEFORE you use it. Use a "design by contract" tool like Code Contracts for .NET if you can, or do it manually if you must. In any case, validate your input before you use it and report a helpful error message if validation fails. (See below for more on helpful exceptions.)
In addition to making these errors easier to diagnose, treating all input as potentially hostile is also a security best practice. Sanity check your data and make both your teammates AND your security team a little happier!
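Here's a minimal sketch of what the "do it manually" option might look like. It's written in Python rather than .NET, and the `OrderLine` type, the `line_total` function and the specific rules are all made up for illustration; the point is only that every assumption gets checked before the data is used.

```python
from dataclasses import dataclass


@dataclass
class OrderLine:
    """Hypothetical input record, used only for this example."""
    sku: str
    quantity: int


def line_total(line: OrderLine, unit_price: float) -> float:
    """Return the price of one order line, rejecting bad input up front."""
    # Check every assumption BEFORE doing any work with the data.
    if line is None:
        raise ValueError("Cannot price order line; line was None")
    if not line.sku or not line.sku.strip():
        raise ValueError("Cannot price order line; SKU was empty")
    if line.quantity <= 0:
        raise ValueError(
            f"Cannot price order line for SKU '{line.sku}'; "
            f"quantity must be positive but was {line.quantity}"
        )
    if unit_price < 0:
        raise ValueError(
            f"Cannot price order line for SKU '{line.sku}'; "
            f"unit price must not be negative but was {unit_price}"
        )
    # Only now is it safe to use the values.
    return line.quantity * unit_price
```

Each check names the value that broke the rule, so whoever reads the error doesn't have to guess which assumption failed.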
Fail early, with useful messages
This is as important as it gets.
Imagine you get an error report that says "Sequence contains no elements". What do you do next? If you're lucky enough to get a stack trace then you can trudge through the code looking for the offending line, but what happens if the offending line contains multiple statements chained together?
Now imagine the error report says "Could not obtain order items for order 1234; sequence contains no elements". You haven't looked at a single line of code yet, and you already have way more information about the problem!
Same goes for null reference exceptions: Would you rather see "Object reference not set to an instance of an object" or "Cannot calculate sales tax for order 1234; Tax Calculator object was null"?
The key principle here is that you should anticipate errors that might occur and throw exceptions that provide key debugging info directly in the error message:
- Help the programmer locate the statement that failed and understand WHY it failed.
- Include key pieces of data needed to reproduce it: order ID, customer ID, etc. (Obviously, be careful not to expose identifiers that could compromise the security of your system!)
Ask yourself, "if this occurs in production 6 months from now, what pointers would I need to zero in on the problem?" and then include those pointers in the exception.
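As a rough sketch of the idea (in Python, with a made-up `OrderProcessingError` type and `first_order_item` function), you can catch the low-level failure at the point where you still know the order ID and re-raise it with that context attached:

```python
class OrderProcessingError(Exception):
    """Hypothetical application-level error that carries debugging context."""


def first_order_item(order_id: int, items_by_order: dict) -> str:
    """Return the first item of an order, failing with a descriptive message."""
    items = items_by_order.get(order_id)
    if items is None:
        raise OrderProcessingError(
            f"Could not obtain order items for order {order_id}; "
            "order was not found"
        )
    try:
        # next() on an empty sequence raises a bare StopIteration,
        # which says nothing about WHICH order was affected.
        return next(iter(items))
    except StopIteration as exc:
        # Re-raise with the order ID, keeping the original error as the cause.
        raise OrderProcessingError(
            f"Could not obtain order items for order {order_id}; "
            "sequence contains no elements"
        ) from exc
```

Chaining with `from exc` keeps the original exception and its stack trace available, so you add context without throwing away anything you already had.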
Use "fail safe" default values, where appropriate
In many cases, invalid data may not necessarily require an exception. For example, ask yourself these questions about each variable or statement you write:
- Can I treat null strings the same as empty strings?
- Can I treat null sequences (lists, arrays, etc) the same as empty sequences?
- If a string parsing fails, can I substitute a default value instead of throwing an exception?
If the answer to any of these questions is "yes" then use the null coalescing operator or conversion helpers to convert null or invalid values into something less "exception prone". I rarely need to differentiate between null and empty sequences so I've written a .ToEmptyIfNull() extension method that I use whenever I need to iterate over a collection. Major reduction in null reference exceptions for negligible effort.
Of course, sometimes you DO care about differentiating between null and empty, or ensuring a parse succeeds. In those cases just throw an exception with a helpful message (see above) as soon as you detect the problem.
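To give a feel for how small these helpers can be, here is a Python approximation. It is not the .ToEmptyIfNull() extension method mentioned above, just a sketch of the same idea; `to_empty_if_none` and `parse_int_or_default` are names invented for this example.

```python
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def to_empty_if_none(items: Optional[Iterable[T]]) -> Iterable[T]:
    """Treat a missing sequence as an empty one so it is always safe to iterate."""
    return items if items is not None else ()


def parse_int_or_default(text: Optional[str], default: int = 0) -> int:
    """Parse an integer, substituting a default when the text is missing or invalid."""
    if text is None:
        return default
    try:
        return int(text.strip())
    except ValueError:
        # Invalid input is expected here, so fall back instead of failing.
        return default


# Usage: the loop body simply never runs when the input is missing.
for item in to_empty_if_none(None):
    print(item)
```

Only reach for helpers like these when a default really is acceptable; if a bad value should stop processing, a descriptive exception is still the better choice.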
"Future proof" your program flow
I've seen a lot of defects occur when business conditions change, and something that "could never happen" when the code was written suddenly becomes possible.
- When you write a switch statement, always include a default branch. It's better to have the default branch throw an exception like "not implemented condition 'FOO'" than silently fall through and cause a potentially harder-to-debug error. (Of course, you do your best to avoid switch statements, don't you?)
- When you have a chain of if/else-ifs, always include an else branch. If it should never be reached, throw an exception that explains the conditions that occurred and why you expected them to never happen.
- If you're dealing with combinations of different states or variables, and certain combinations "should never occur", go ahead and handle those combinations anyway. It's better to throw an exception you can control than to let the system fail on its own. (For example, "Order 123 has status SHIPPED, but IS_CANCELLED was true; is the update service malfunctioning?")
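The points above can be sketched together in a few lines. This is a Python illustration only; the status names, the `shipping_label` function and the choice of exception types are assumptions made for the example.

```python
def shipping_label(order_id: int, status: str, is_cancelled: bool) -> str:
    """Describe an order's state, refusing to guess about unexpected input."""
    # Handle the combination that "should never occur" explicitly.
    if status == "SHIPPED" and is_cancelled:
        raise RuntimeError(
            f"Order {order_id} has status SHIPPED, but IS_CANCELLED was true; "
            "is the update service malfunctioning?"
        )
    if is_cancelled:
        return "Cancelled"

    if status == "NEW":
        return "Awaiting payment"
    elif status == "PAID":
        return "Ready to ship"
    elif status == "SHIPPED":
        return "On its way"
    else:
        # A status added after this code was written lands here, loudly,
        # instead of silently falling through.
        raise NotImplementedError(
            f"Order {order_id}: not implemented condition '{status}'"
        )
```

When someone adds a new status a year from now, the first order that carries it fails with a message naming both the order and the unhandled value.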
Go, make the world a brighter place!
Using these techniques can help you avoid errors in production and can make it easier to resolve errors that do occur. Using them will bring joy to the hearts of men and will make you beloved amongst your teammates. Use them; do it for the children.