I think of Joel Spolsky as a fount of wisdom and sage advice. But his latest article on the topic of test driven development (TDD) strikes me as a bit clueless. Thus sayeth Joel, “I feel like automated testing of everything, a lot of times, is just not going to help you.”
Joel has some astute analysis on the more “out there” aspects of the “agile” and TDD communities. But he also observes, “I don’t know, I’m going to get such flame mail for this because I’m not expressing it that well.”
His analysis of some of what has arisen as part of the whole “agile” fad is spot on. In practice, the so-called “agile” method is just business as usual with new names, different make-work project management tasks, and an environment in which developers get away with taking a long time to deliver underperforming code. So I can see why he’s concerned.
In software development the core goal is to deliver software that adds more value than it took to create. But software, by its very nature, is so good at delivering more value than it takes to build that the industry is only now getting to the point where we need to weigh value added against cost more carefully. It used to be much simpler.
Example: Form letters. Computers are ridiculously good at personalizing form letters. They’re so good that it’s useless to attempt to quantify the cost of developing a basic text editor to edit letter templates in, an address and demographics database to store personalization data in, and some application to marry the two.
Using the old typewriter approach it would take days for an army of typists to produce one-off letters for every person in the list. They solved the problem by lowering their standards, copying hundreds of letter pages with blank spots where the personalization would go. This resulted in ugly, obvious form letters. And it would still take a minute or two per page to put the blank in a typewriter, line up the page, and then type the address or whatever off of some list (a list that also had to be painstakingly copied out by hand).
By contrast, even older computers can generate enormous quantities of such letters in minutes (plus some time to run the printer). They can generate envelopes at the same time. The computer will also fully justify the text to the right margin beautifully. As long as the original text is typo free, every single letter will be typo free. You get the idea. The problem is so easy that the computer will simply make it go away.
And with this sort of programming task, testing of software was not even really necessary, at least not the way we think of testing today. The requirements were pretty simple and the ultimate test (does this software do what we want?) is easy to evaluate. At the same time, computers, and by extension programming languages, didn’t really have the capabilities (memory, speed, whatever) to make it easy to test the software via things like unit tests. So a couple of generations of computer programmers learned to master the art without ever writing anything remotely resembling the modern unit test.
So what is this modern unit test? Wikipedia says unit testing is:
… a method of testing that verifies the individual units of source code are working properly. A unit is the smallest testable part of an application. In procedural programming a unit may be an individual program, function, procedure, etc., while in object-oriented programming, the smallest unit is a method, which may belong to a base/super class, abstract class or derived/child class.
Now if we follow the test-driven development (TDD) approach of writing such unit tests, running them to ensure that they fail, and then writing the minimum amount of code to make the test succeed, we will easily achieve 100% code coverage. It’s easy. I’ve done it. Lots of people have done it.
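To make that cycle concrete, here is a minimal sketch in Python using the standard unittest module. The function names (personalize_stub, personalize) and the form-letter requirement are my own illustration, not anything from Joel's article. The same test is run twice: first against a stub so we can watch it fail, then against the least code that makes it pass.

```python
import unittest


def personalize_stub(template, person):
    # Step 1 (red): the test exists, the implementation does not.
    raise NotImplementedError


def personalize(template, person):
    # Step 2 (green): the minimum code needed to satisfy the test.
    return template.replace("{name}", person["name"])


def make_test_case(func):
    """Build a test case that exercises whichever implementation is passed in."""

    class PersonalizeTest(unittest.TestCase):
        def test_fills_in_name(self):
            letter = func("Dear {name},", {"name": "Ada"})
            self.assertEqual(letter, "Dear Ada,")

    return PersonalizeTest


def run_test_against(func):
    """Run the test case against func and report whether everything passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_test_case(func))
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

Calling run_test_against(personalize_stub) reports a failure, and run_test_against(personalize) reports success. In day-to-day work you would not keep the stub around; it is here only so both halves of the cycle fit in one listing.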
However, 100% coverage via unit testing of this type is just not that valuable. And I think this is what Joel is objecting to.
First, 100% test coverage of code under test doesn’t mean you’ve correctly and completely captured the business requirements.
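A small sketch of how that happens, assuming a made-up requirement that orders of 100 or more ship free. The function and the numbers are hypothetical. The two tests execute every line and both branches, so a coverage tool would report 100%, yet the boundary case the business actually cares about is wrong.

```python
def shipping_cost(order_total):
    # Hypothetical requirement: orders of 100 or more ship free.
    # Bug: the comparison should be >=, so an order of exactly 100 is charged.
    if order_total > 100:
        return 0
    return 5


def test_free_shipping_for_large_order():
    assert shipping_cost(150) == 0


def test_flat_fee_for_small_order():
    assert shipping_cost(50) == 5
```

Both tests pass and every line is covered, but shipping_cost(100) returns 5 instead of 0. Coverage only tells you the code ran, not that it matches the requirement.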
Second, code generators produce code, but not meaningful unit tests. Example: with .NET’s datasets, you easily end up with a bunch of code you have to write tests for to get to 100% coverage. All it takes is referencing one table or stored procedure as the base for your dataset that has fields your code will never access but which are in the table or proc for someone else’s benefit. The autogenerated code will contain a bunch of functions related to those fields and your coverage will drop dramatically. Now instead of getting the benefits of rapid application development via the IDE you have to waste your time going back in and deleting auto-generated code to get coverage back to 100%. Or writing meaningless tests to exercise that code. Bleah.
A third trouble spot for 100% coverage is stored procedures or triggers or other code that lives in the database. As it stands, there aren’t really tools to help you ensure that both executable code and database code are covered by your tests. My guess is that Microsoft will get there first, if they aren’t already close.
A fourth and particularly pernicious issue with coverage is dynamic programming and metaprogramming. If part of your code is expressed as a big text string that your program then evals or compiles and runs, your code coverage tool isn’t going to be much help deciding whether the code in the string was actually executed.
These are the biggest ways I can think of to get less value from code coverage measurements. You either get false negatives, where code doesn’t get executed because it really never needed to be there in the first place, or you get false positives, where there’s more going on behind the scenes that the coverage tool can’t detect.
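Here is a minimal Python sketch of the problem. The discount function and its rule are invented for illustration. The logic lives inside a string, so from the point of view of the source file there are only two interesting statements: the string assignment and the exec call. A line-based coverage tool will typically mark both as executed no matter which branch inside the string ever runs.

```python
# The real logic is data, not source lines in this file.
SOURCE = """
def discount(total):
    if total >= 100:
        return total - 10
    return total
"""

namespace = {}
exec(SOURCE, namespace)  # compiles and runs the string at import time
discount = namespace["discount"]

# The compiled function does not belong to any file on disk.
origin = discount.__code__.co_filename
```

Here origin is "<string>" rather than a path, which is why a tool that maps executed lines back to source files has nothing to attach the two branches of discount to.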
So, I’ll agree with Joel. Something is wrong with the religious approach to 100% unit test code coverage. And Joel actually gets it right when he says, “I might do more black-box tests, sort of like unit tests…”
Bingo! The problem for Joel is that he’s already equated TDD with unit tests and 100% coverage. And I guess that’s not totally incorrect. After all, that’s why Dan North stepped up to promote the idea of Behavior Driven Development. Personally, I’ll stick with the TDD acronym, because the “Test” in Test Driven Development doesn’t specify unit tests. And I think it’s very important to emphasize automated testing as integral to the process.
In Joel’s article he gives a hypothetical situation where 10% of the tests break when you move a menu or something. He uses this hypothetical to seemingly argue against TDD. But I disagree. You don’t need to stop unit testing or using TDD. If 10% of your tests break because you move a menu, you need to learn to write better tests and probably better code.
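As a rough illustration of what "better tests" can mean here, consider this Python sketch. The App class, its menus, and the check functions are all hypothetical. One check reaches the command through its current position in the menu tree, the other exercises the behavior directly. Moving the menu item is a pure layout change, and only the first check notices.

```python
class App:
    """Hypothetical application: commands are behavior, menus are layout."""

    def __init__(self):
        self.documents = []
        self.menus = {"File": {"New": self.new_document}}

    def new_document(self):
        self.documents.append("untitled")


def brittle_check(app):
    # Coupled to layout: assumes "New" lives under "File".
    app.menus["File"]["New"]()
    return len(app.documents) == 1


def robust_check(app):
    # Coupled to behavior only: creating a document adds one document.
    app.new_document()
    return len(app.documents) == 1


def move_new_to_document_menu(app):
    # A pure layout change: same command, different menu.
    app.menus["Document"] = {"New": app.menus["File"].pop("New")}
```

After move_new_to_document_menu(app), brittle_check raises a KeyError while robust_check still passes. If a tenth of a suite looks like brittle_check, the fix is in the tests and the design, not in abandoning testing.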
One way to improve the value of your automated tests is to stop thinking in terms of functions, subroutines, and methods. Remember, the goal of writing software is to get a computer to add more value than it takes to create the software.
So in real life, unless you are one of those rarified library designers, you are probably writing code that is supposed to add value in a very specific way or ways. Whether agreed on by committee over a period of months or being decided one-by-one by your on-site Agile-loving customer, the specific value adds are requirements. Each and every requirement is something you can write one or more automated tests for.
So, of course, you should start by writing test(s). They look and feel just like unit tests. They have setup and teardown. They have calls to your code’s classes, objects, functions, and procedures. Once you understand a chunk of the requirements, you write some tests to express that understanding. Then you write the least amount of code necessary to pass the tests. Your code coverage tool will presumably give you back a 100% rating. If it doesn’t, you probably wrote some extra code. Or you used a code generator or something else. But a quick analysis will show which areas of code are not covered. If it’s code you wrote by hand, you have two options: figure out a way to test the behavior you coded for, because it is part of the requirements, or remove the code, because it contributes no value at this point.
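To show what such tests might look like, here is a sketch built on the form letter example from earlier, written with Python's unittest. The mail_merge function, the template, and the three requirements are assumptions I made up for the illustration. Note that each test is named after a requirement, not after a function or method.

```python
import unittest


def mail_merge(template, people):
    """Return one personalized letter per person (hypothetical function)."""
    return [template.format(**person) for person in people]


class MailMergeRequirements(unittest.TestCase):
    def setUp(self):
        # Shared fixture: one template and a small mailing list.
        self.template = "Dear {name},\nYour account at the {city} branch is ready."
        self.people = [
            {"name": "Ada", "city": "London"},
            {"name": "Grace", "city": "New York"},
        ]

    def tearDown(self):
        # Nothing external to clean up here; shown only for completeness.
        self.people = None

    def test_every_person_gets_exactly_one_letter(self):
        letters = mail_merge(self.template, self.people)
        self.assertEqual(len(letters), len(self.people))

    def test_each_letter_is_addressed_by_name(self):
        letters = mail_merge(self.template, self.people)
        for person, letter in zip(self.people, letters):
            self.assertTrue(letter.startswith("Dear " + person["name"] + ","))

    def test_no_placeholder_is_left_unfilled(self):
        for letter in mail_merge(self.template, self.people):
            self.assertNotIn("{", letter)
```

Saved to a file, this runs with python -m unittest. The one-line mail_merge is all the code these three requirements call for, so anything beyond it would show up as uncovered.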
Once you start this way, you quickly learn not to write code in advance. You also start to think about ways to keep the code as flexible as possible overall. Requirements change, and when they do your tests will change. And they will fail. Which means you need to change your application code as well.
So do you need to write tests for all the requirements at once? Of course not. Do you really need to have 100% code coverage? Of course not. Does Joel have a valid point? Of course he does. But should you walk away from what he’s saying and think “oh yeah, test driven development isn’t worth the effort”? Absolutely not.
An above average developer is already doing test driven development, even if they don’t automate the testing process. The ultimate test is the user acceptance test. When someone runs the code does it do what they asked you to make it do? Ultimately, you are developing against that test. Good developers keep in mind what is necessary to pass the ultimate “user” test, and tend not to worry about the other stuff.
Of course, one of the best ways to ensure that your code passes the user acceptance test is to bake automated testing into your development process. If you are doing test driven development, you have to really make sure you understand the requirements so you can write your tests. The automated tests you get with this process provide a high level of comfort that new requirements, changes in requirements, refactoring existing code for performance or other reasons, and adding on new code won’t cause your application to fail the ultimate test.
Going in to work on code that has no unit tests, without writing any tests in the process, is basically performing acrobatics without a net. It might look great, but the falls can be dangerous. Test driven development doesn’t guarantee perfect code. But when you shift the thinking from 100% code coverage to 100% requirements coverage, the tests you write will improve. And setting an arbitrary level of code coverage required at that point is mostly make-work. It’s a good feedback tool for the developers, but not a meaningful milestone in and of itself.
The current evolution of tools that take automated test suites and actually generate requirements documentation from them is very exciting. I doubt it’s useful to try to build tools that write tests from requirements docs (although maybe someday…). But having some way for business analysts and managers to review the automated tests that are in place, in terms of requirements coverage, is going to provide a lot more value than giving them a graph that says, “We have a suite of 95 tests that cover 96% of our code.”
In my mind, if you don’t have an automated test strategy, and don’t start by writing tests using that strategy, you are putting the cart before the horse. By not writing tests first, you’re saying, “I don’t really care if I meet the requirements, just so long as it’s close enough.” But we’re entering a time where “close enough” is not going to sustain software development investment. We need to nail the requirements and waste as little time/code doing it as possible. Time is money and coding takes time. Using a test driven development strategy means focusing on the business requirements that add the most value. And expecting a high level of code coverage for those tests means less time wasted writing code that doesn’t actually help meet the requirements.