Entity Framework and SQL Server Compact Edition

Microsoft has really hit a home run with Entity Framework (EF). Getting it to work well with SQL Server Compact Edition (SqlCE) was a bit of a challenge for me, though.

What I love about Entity Framework is that once you get used to a few good data model design principles, building a set of useful .NET objects to work with that data model is as simple as saying “yes, please.” You go into your data project, add a new item of the ADO.NET Entity Data Model type, tell it to build the model from a database, and you’re done.

If the database is constructed well, the Entity model will be very straightforward.

There are a couple of gotchas that I spent some time on.

The first gotcha took me a while, although I guess it didn’t need to. Identity columns don’t work with EF and SqlCE. This might be getting fixed in a future version, but until then we have to use a workaround. One commonly mentioned approach is to use uniqueidentifiers and GUIDs generated in .NET code and inserted into the primary key field in the database. But think about this from a performance perspective and you’ll see it is a nightmare. SqlCE doesn’t support clustered indexes, so the only way to get similar performance is to use a primary key that works like a clustered index.

Clustered indexes, for those who aren’t familiar with how they work, sort the physical database records according to the index field. So if the index key is not sequential as the records are inserted, records in the table will need to be shuffled frequently to keep them in index order. This is a whole lot of work. A non-clustered index has the same problem, just less of it. Think about an index as an internal database-use-only table that helps look up rows (by row ID) in a table based on key values. If the key values arrive in increasing order as the rows are inserted, the index can simply grow at the end with very little maintenance. But if the key values are out of order, the database will have to continuously maintain the index. This is just useless work for something arbitrary like primary key values that would normally be identity columns.

So probably the worst substitute for an identity column is a GUID. Why? Because GUIDs are not remotely sequential. GUIDs solve the problem of not being able to assign a unique ID at object creation time. GUIDs are pretty much certain to be unique. But GUIDs make a mess when used as primary keys because they are so random.
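To make the shuffling cost concrete, here is a toy model in C#. It is only a sketch: a sorted list stands in for an index kept in key order, which is not how a real database engine stores its indexes, and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration (not how any real database engine is implemented):
// a sorted list stands in for an index kept in key order.
public static class IndexShuffleDemo
{
    // Inserts each key at its sorted position and returns how many existing
    // entries had to move aside in total.
    public static long CountMoves(IEnumerable<long> keys)
    {
        var index = new List<long>();
        long moves = 0;
        foreach (long key in keys)
        {
            int pos = index.BinarySearch(key);
            if (pos < 0) pos = ~pos;        // bitwise complement gives the insertion point
            moves += index.Count - pos;     // everything from pos onward shifts by one slot
            index.Insert(pos, key);
        }
        return moves;
    }

    // Keys 1..count in increasing order, like an identity column would hand out.
    public static List<long> SequentialKeys(int count)
    {
        var keys = new List<long>(count);
        for (long i = 1; i <= count; i++) keys.Add(i);
        return keys;
    }

    // Pseudo-random keys, standing in for values with no useful order (such as GUIDs).
    public static List<long> RandomKeys(int count, int seed)
    {
        var rng = new Random(seed);
        var keys = new List<long>(count);
        for (int i = 0; i < count; i++) keys.Add(rng.Next());
        return keys;
    }
}
```

Feeding CountMoves the sequential keys reports zero moves, because every key lands at the end. Feeding it the random keys reports a large number that grows much faster than the number of rows. A real engine works with pages and trees rather than one flat list, so treat this as an illustration of the trend rather than a measurement.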

My solution is to use a long/bigint numeral that is essentially a timestamp. The .NET code to generate one is dead simple:

    var SortOfGuid = long.Parse(DateTime.Now.ToString("yyMMddHHmmssffffff"));

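This only uses two digits of the year. There is room for one more (century) digit, but not for the full four-digit year: that would make a 20-digit number, and a long holds at most 19 digits. And as long as two of these are not created in the same millionth of a second, they will be unique. Keep in mind that the system clock may not actually tick that often, so keys generated in a tight loop can come out identical. Using a long generated in this way performs very well in my meager testing so far. Much better than GUID was faring.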
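If collisions are a worry, one possible hardening of the one-liner is sketched below. The TimestampKey class and its Next method are hypothetical names, not part of EF or SqlCE. The sketch also assumes you are happy to switch to UTC (so the clock falling back for daylight saving cannot repeat a value) and to the invariant culture (so the digits do not depend on the machine's regional settings). It only guarantees uniqueness within a single process.

```csharp
using System;
using System.Globalization;
using System.Threading;

// Hypothetical helper (not from any library): timestamp-style keys that
// never repeat and never go backwards within one process.
public static class TimestampKey
{
    private static long _last;

    public static long Next()
    {
        // Same 18-digit layout as the one-liner, but taken from UTC and
        // formatted with the invariant culture (both are assumptions of this sketch).
        long candidate = long.Parse(
            DateTime.UtcNow.ToString("yyMMddHHmmssffffff", CultureInfo.InvariantCulture),
            CultureInfo.InvariantCulture);

        while (true)
        {
            long last = Interlocked.Read(ref _last);
            // If the clock has not moved past the previous key, bump by one instead.
            long next = candidate > last ? candidate : last + 1;
            if (Interlocked.CompareExchange(ref _last, next, last) == last)
                return next;
        }
    }
}
```

Calling TimestampKey.Next() in a tight loop yields strictly increasing values even when the clock returns the same reading twice. The trade-off is that a bumped key no longer decodes to the exact time it was created, which only matters if you planned to read the key as a timestamp.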

The other gotcha is fairly straightforward. Commonly DBAs will make many-to-many tables in a properly normalized database that have an identity as a primary key. This is not helpful in Entity Framework because it gums up the object relational map. Omit the identity and EF can tell what you want. However, you need to make sure your associative table is basically one big unique index by setting a composite key as the primary key on the table. Here is the SQL that does this for one of my tables:

    create table PictureTag (
        PictureId bigint not null references Picture(Id),
        TagId bigint not null references Tag(Id),
        primary key (PictureId, TagId)
    );

So with this little create table we set up the two columns as foreign keys and set a primary key on both of them. Works well with EF because now Picture and Tag will be able to have Picture.Tags and Tag.Pictures and we don’t have to think much about it.

Anyway, not that these were the worst problems in the world, but because of them I almost gave up on EF and SqlCE together. I wasn’t going to give up SqlCE in favor of SQL Server for my project, and after having these issues with EF I spent some time hand-coding my domain model and data access code. And wanted to tear my hair out. In EF all I had to do was point the code generator at my database. I wanted that kind of ease and assurance that the resulting code wouldn’t have trivial bugs in it. Hope this helps.

Eye Opening Experience

Let’s try a little thought experiment. Close your eyes. Now surf the web. In your mind’s eye, imagine the dulcet tones of a screen reader helping you navigate your favorite sites. A little clunky getting used to browsing by keyboard (like in the old days), but once that hurdle is past, it’s smooth sailing. Point your browser to your own web site (that’s ctrl-L and type the URL and press Enter, for those of you who have your eyes closed but don’t remember the accelerator). I’m sure your web site is accessible because you followed all the guidelines, right? Isn’t this fun!

That’s what I thought, too, until I tried this experiment for real using a screen reader called NVDA and my own web site, Super Fun Brain Busters. The experience was… eye opening.

I thought I’d been doing pretty well by employing “best practices” for accessibility. I’ve been testing my website in the Firefox web browser with the NoScript add-on and style sheets turned off (View -> Page Style -> No Style) and it felt like I was on the right track.

I did a good job of getting alt tags on images, not depending on JavaScript, not relying on CSS, and so on. I got my content in a useful order so that the user wouldn’t have to listen to a bunch of repetitive text on every page in order to get to the good stuff.

However, my site is a games site. The one game I’ve got so far should be easy enough to play without looking at it. But it’s not. At all. The problem is that the game relies on using a series of images and image buttons to display cards in a hand. With decent alt tags these are “accessible”, but as a way to play this game they are not accessible in any meaningful sense of that word.

So now I’m going to have to get creative and figure out ways to improve the interface. I’m also going to have to spend more time with my eyes closed and the screen reader on. No amount of imagining you know what it’s going to be like for a blind user can replace actually using your application the way they will.

Have Ubuntu Linux Lost Their Damn Minds?

Yes. Ubuntu Linux have lost their minds. Thankfully I had already stopped using Ubuntu as my primary desktop operating system before they proved how crazy they really are.

I have moved to Windows 7. For the first time I feel that Microsoft has delivered on a solid user interface coupled with the other system level improvements (like security) that they started pushing out with Vista. Microsoft has also started to do some really sane things, like provide free of charge versions of their software development and server tools. Visual Studio and SQL Server express versions are actually a decent substitute for the Enterprise versions in most respects. Certainly they are as good as or better than most free tools available for development of hobby software. With certain add-ons, like XUnit and Git, a developer seeking to avoid paying a fortune to write software outside the office can get a fully “best practices” compliant setup.

But I came here to talk about Ubuntu and their particular brand of crazy. Only crazy people would do something like move the window management buttons in a window title bar from the right side to the left side (yes, the buttons to minimize, maximize, and close a window are now on the other side of the title bar). But that is exactly what Ubuntu has done with their 10.04 release. This is crazy! This demolishes one of the most important aspects of user interface usability: memory.

How many of us “power user” types can actually visualize the display on our monitors at any given point in our daily routine with the computer? How many of us can actually walk through a troubleshooting session over the phone without a computer in front of us? The answer: lots of us. The reason: our memories. We have a powerful ability to recall to mind the way things looked and felt to us at some point in the past. We also have a powerful ability to imagine how things might look by forming consistent mental models of an imagined interaction based on our memories.

A software developer who changes the way an interface works, especially something as important as moving window control buttons to the other side of the title bar, better have a damn good reason for making the change.

Recently I rewrote my Wordzy game (a cross between Yahtzee and Scrabble). As part of switching my home systems to Windows, I also decided to move from Ruby on Rails to ASP.NET for my web development efforts. As part of the rewrite process, I changed the UI in some significant ways. In the original version I went a little AJAX crazy and set it up so that a user drags cards from their hand to the playing field to build words. This was pretty slick and I liked the way it worked (although it was buggy at times). The user had a lot of immediate control over where letters got played in the word he was building.

But when I rewrote the game I decided not to keep this feature. Instead, the user clicks the cards in the hand and they pop down to the word he is building in the order they are clicked. This wasn’t an arbitrary decision, though. I had a very good reason for making the change. I wanted the game to be more accessible and mobile friendly. So I feel that any adjustment my users have to make is worth being able to expand my game’s audience. At some point I may be able to layer the old drag-and-drop behavior back on top of the current base, as well.

The science here was simple. The game simply did not work without a graphical interface and the ability on the client to drag and drop elements on the page. No additional research or user focus groups or anything like that was necessary to determine that a change was needed to adjust the software to a new set of requirements.

But in the case of Ubuntu moving window management buttons I can’t imagine what the reasoning is. The new location is more like Apple’s OS X interface I guess, but it’s nothing like the default interface Ubuntu has provided since day one. So if their goal is to provide a more comfortable environment for people who are switching from Macintosh computers, I suppose this might work. Or maybe they are doing this so that people who work on both Ubuntu and Mac don’t have to spend quite as much energy context switching. But to push the change on long-time users of Ubuntu and those whose other OS is Windows… not nice. Crazy, actually.

But this is standard operating procedure at Ubuntu. Every six months they radically alter the look and feel of their default desktop install. Sometimes it’s just a coloring change, but more often icons get a “facelift” so they no longer look familiar. All of this is slightly jarring, but serves as a good visual cue that something has really changed here. But to move window management buttons isn’t like that. It means that months and years of practice sending a mouse pointer to a specific location on the screen is now useless. In fact, worse than useless, because it’s the wrong spot entirely.

And so while I sympathize with a desire to appeal to a wider or different audience, I can’t say I like this change. I definitely like the idea of being able to choose which side of the title bar the buttons are on. People do have different needs and expectations based on what other systems they might use during the day. But don’t just change it on your existing users! At the very least, give me the option during the upgrade process to weigh in on it. Otherwise I have to hunt down the right options for changing this back to the way I’m used to. And I probably will be stuck doing that with every future upgrade, since there’s no system preference for this setting.

Have they really lost their minds? Probably not. I can find rational reasons for the switch. But it is still jarring and uncomfortable to those of us who were not expecting it and for whom it is not a welcome change.

My point here isn’t really to complain about Ubuntu. They provide a very good free operating system to the world at large. If my own professional career as a programmer weren’t focused on Microsoft and .NET, I doubt I would have felt as much of a desire to try Windows 7. If I hadn’t already switched from Ubuntu to Windows for other reasons, the cost of switching operating systems would have been much, much higher than the cost of switching desktop themes. Or I could have used the new button placement until my memories adjusted.

Rather my point here is to remind myself, and my readers, that changes to the user interface can be very uncomfortable for users and should always be offset by some sort of long-term gain for the users. In my case, users who like my game will hopefully be able to play it on a wider array of browsers, platforms and devices. And users with accessibility issues will also be able to join the fun. I may be able to bring back the drag-and-drop interface someday, too. But these kinds of changes have to be done with the user in mind. In the case of the change Ubuntu made, it seems like they made the change to increase their appeal to a slightly different user base. Or maybe they had other reasons. Perhaps it was just the lead UI designer’s whim. My hope is that it was intentional and supported by rational decision-making.

The old saying, “If it ain’t broke, don’t fix it” is a good one to keep in mind in software development. We have a tendency to want to “fix” things that are working just fine because they weren’t done “correctly” (meaning “my way”). And behind the scenes, that is fine, as long as the change isn’t noticed by users at all. But the interface is a different story. Some things are broken from the start. For example: using radio buttons labeled “yes” and “no” is definitely broken by design. The proper interface control for yes/no questions is a checkbox. This sort of thing should be changed immediately because the user has memories of radio buttons and checkboxes from other applications and those memories will help users (especially new users) be more comfortable with your application faster.

More on this topic tomorrow… I myself am guilty of far worse interface changes than Ubuntu. You might say I’ve lost my own mind, too.

Some Updates

I am in the process of moving all of my old Above Average blogs from a custom Ruby on Rails app I wrote to this here WordPress thing instead.

The fun part is being forced to look back at some of the stuff I wrote just twelve to fifteen months ago.

Probably the most significant change for me over the past year is that I’ve gone completely over to the Dark Side. These days my day-to-day home OS is Windows 7. I program full-time in C#, ASP.NET, and Transact-SQL, both in and out of the office. What hasn’t changed is my passion for software craftsmanship.

The Goldilocks Principle

There are some pretty loaded words flying around the programmer blogosphere these days. I can understand why. We’ve probably all had the thought “if that other programmer would just do it my way, I would be able to meet my own programming goals a lot more easily.”

If you’re a minimalist, you probably got furious at some “architecture astronaut”. If you are a big design kind of person you probably got angry at the test-driven, iterative guy. And it’s easy to resort to name-calling and insults and blame to preserve your self-esteem. For every left there’s a right. Sooner or later it’s bound to be the case that there will be some differences of opinion.

More than likely what’s really going on in these cases is that the other guy was under too much pressure or not enough pressure. How many of us have had jobs where the code never made it anywhere near the real world? I’ve written more than one killer app that will never get used by anyone… it was so gold-plated it wasn’t even funny. The pressure to produce real value was unbelievably low. In that kind of environment the worst sort of feeping creaturism is allowed to flourish, both inside and outside the code. The architecture astronauts can take over: since the code never really has to do anything, you’ll never know if it doesn’t.

At the other end of the spectrum I’ve written code that was needed in production, like, yesterday. And 90% of the existing code base was written by people who were either too pressed for time, or not pressed enough, so you get this terrible mix of spaghetti “do it now” code and overblown super frameworks that only ever got used to solve one specific business problem, but require you to jump through insanely complicated hoops to code up a bug fix or small feature addition.

You can’t really apply the world’s best design principles in this situation because you’d have to raze the existing code to the ground. You know that’s not going to happen. In these cases you won’t have much ability to do a good job of test-driven either. The system is next to impossible to test. Either you’ve got interdependent spaghetti that can only be tested as a HUGE black box, or you’ve got a framework so messy that setting up the test takes 64x more code than you can type in the two hours you have to get the fix in before the Senior VP of Destroying Your Career hears from the customer that there’s a problem.

So you wing it, compile it, drop it on the server and pray to the gods. It’s either that or go home. They’ll find someone else to do it if you don’t and you need the money.

So far, the only thing I’ve learned from all this discussion is that some programmers figure out how to work in places where they share a development philosophy with everyone else who works there. The mythical 20% top programmers are not all in their ideal world jobs or can’t afford to say no to clients on general principle, though, so even those of us who write above average software are going to find ourselves in this pickle.

And frankly, I like the way the debate has polarized, because I find myself looking at both sides and thinking, “you guys are so wrong”. It’s strange for me, because I’m usually off in left field, but this time I’m right in the middle, thinking pragmatically that there is a happy medium. It’s not just about hiring smart people that get things done, because most of us don’t have any control over that. Plus, I’m sure we all think we’re smart and that we get things done.

So what to do? At the end of the day, follow your gut. In my case, the gut tells me that the more test-driven I can make my approach, the more likely I’m going to get the right answer sooner and keep my software working right once it’s built. So that’s where I part ways with Joel and Jeff, because they don’t seem to really understand how this works. They might pull back a bit from the “don’t test” argument here and there, but it’s clear that they have never really tried test-driven in a way that keeps clear of “Agile” baggage.

To my mind, suggesting that developers work without doing test-driven development is like suggesting we abandon version control and simple build processes. These last two are on The Joel Test (at #1 and #2 respectively). The list also includes at #10: “Do you have testers?”

If we have human testers, shouldn’t we try to make their jobs easier by writing as many automated tests as possible? Up front and perhaps even before we write the code that will be tested? Maybe the testing staff and the programming staff could work together on establishing what the test data and test acceptance criteria will be. Then, once unit tests and application code have been written, the testing staff can focus on being more than button clickers. Most testers are really smart, talented professionals who deserve to do more with their lives than run tests that programmers could automate in minutes as part of their programming work.

On the other hand, some of the design principles folks scare me because they might have tests, but they also have these elaborate frameworks. I don’t care much if the framework has 100% code coverage unless the tests we’re talking about also relate directly to business value.

Read the requirements. Write a test that matches one of those requirements. Test fails. Write the code. Test passes. Simplest possible thing that can work. Allow the framework to evolve from working code. Build code in chunks of work that can be finished in less than four hours each. I see the patterns stuff as a major distraction, since usually it ends up getting baked in way too soon. You end up with all this code that is there to support what you might need. You ain’t gonna need it.
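Here is roughly what one turn of that loop can look like in C#. The requirement is invented for illustration (a word scores one point per letter, and words shorter than three letters score nothing), and so are the WordScorer and WordScorerTests names. A real project would use a proper test framework. The hand-rolled Expect helper just keeps the sketch self-contained.

```csharp
using System;

// Hypothetical requirement, invented for illustration: a word scores one point
// per letter, and words shorter than three letters score nothing.
public static class WordScorer
{
    // The simplest thing that makes the test below pass.
    public static int Score(string word)
    {
        if (string.IsNullOrEmpty(word) || word.Length < 3) return 0;
        return word.Length;
    }
}

public static class WordScorerTests
{
    // Written first, straight from the requirement. It fails until Score
    // exists and behaves as required.
    public static void ShortWordsScoreZero_LongerWordsScoreLength()
    {
        Expect(WordScorer.Score("at") == 0, "two-letter word should score 0");
        Expect(WordScorer.Score("cat") == 3, "three-letter word should score 3");
    }

    // Minimal stand-in for a test framework's assertion.
    private static void Expect(bool condition, string message)
    {
        if (!condition) throw new InvalidOperationException(message);
    }
}
```

Nothing here anticipates bonus tiles, dictionaries, or scoring plug-ins. Those get added when a requirement, and a failing test for it, says they are needed.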

I also think it’s important to find ways to work with the tools that you love. I always load Cygwin on my Windows systems at work, because I can’t take the constant system shock I experience when moving from my all Linux home systems back to the office. I’ll gladly toss Ruby on there, too, so I have a programming language that I’m more comfortable with for certain systems tasks. The key advantage being irb, the interactive Ruby console. It’s not about overriding local technology decisions, it’s about my avoiding having to figure out a bunch of new tools all the time.

Finally, we have to participate. In the larger programming culture and the one in our office. It won’t always be comfortable, and occasionally there will be more loaded words flying around than we can handle calmly. Handling disagreements and conflict is a separate topic, one I probably shouldn’t pretend to know anything about.

In the larger community, I guess part of me doesn’t care if you want to spend your time building overblown architecture frameworks, or if you’re just a cowboy coder whipping out code faster than… um, fast stuff. I believe the shortest distance between two points is a straight line, and that if you take the time to graph those two points out, and draw the line with a ruler, you’ll get the best results in the least amount of time. So part of me is happy to see my competitors (that’s you guys) do things the “wrong” way. :)

Of course, there are lots of times that these conflicts make their way down to the individual workplace. And if you think tempers run hot in the larger community, just wait until actual jobs, promotions, and raises are on the line. However, at this level it is no longer a problem for the programmers to solve themselves, it’s a management issue. Techniques for building strong, powerful teams are way outside the scope of this article, so I’m going to close with a pointer to a really good book on building teams that I just finished reading.

In the meantime, what can we do as programmers? I’m going to suggest something called the Goldilocks Principle: It’s not too hot. It’s not too cold. It’s just right.

The trick isn’t to hold fast to the Agile™ religion (or any other process/methodology). Nor to proceed with wild abandon. Nor to build elaborate frameworks around single use cases. But to keep the good design principles in mind, test first as much as you can, get things done, try to leave time to refactor and reconsider later, and keep your eye on the prize: working software that delights the customers.

SQL: The Dys-Functional Programming Language

There’s nothing quite so good as SQL when it comes to tripping up otherwise above average developers. First, a lot of programmers don’t take SQL seriously as a programming language. Second, SQL is probably the worst programming language that almost everyone uses.

I’m guessing that the pure awful-ness of SQL as a language is part of the reason that a lot of programmers don’t invest any more time in SQL than absolutely necessary. But the bigger part of the problem is that it’s not the same kind of programming language most programmers are used to.

While SQL is technically a declarative programming language, I prefer to think of it as a functional language. It also has a strong procedural element, especially when you consider that pure SQL is rarely used. More likely the SQL one is working with is a dialect specific to the relational database management system (RDBMS) being used, like Transact-SQL on Microsoft SQL Server or PL/SQL for Oracle. Those dialects were created specifically to extend SQL in procedural ways.

The biggest difference between SQL and other common programming languages is that SQL isn’t intended to compile down to machine code or byte code that would be distributed separately from the RDBMS. So it’s easy for programmers to minimize SQL, which is exactly what happens. This trend is furthered by the existence in most organizations of a separate database administrator (DBA), either one wizard-ish individual or a team of database guys. The DBA handles all the heavy-lifting around the database and in some organizations the database team even writes all the stored procedures… basically giving all the other programmers an API into a data system, but relieving them of any pressure to understand what’s going on behind the curtain.

So SQL really isn’t like other programming languages in a lot of ways, but almost every programmer has to use SQL at some point. Hopefully the average programmer took at least one database class in college, so they know what normalization is, and what the fundamental ideas of the relational database are about. But it seems that the average programmer doesn’t have a computer science degree (I don’t). If they do have a degree they probably didn’t get as excited about the database class as they did about the class on compilers or 3D graphics programming.

Given that the average developer will work with SQL at some point, it is important for those seeking to be above average to take SQL to heart at some point and really learn about the wonderful world of set-based, declarative programming and the procedural extensions to SQL that are available on whatever RDBMS they happen to have handy.

I put this in the same category of professional development as the Pragmatic Programmers’ advice to learn one new programming language a year. The Seven Habits guy calls this sharpening the saw. SQL, and the type of thinking about data and data processing that using SQL well requires, is definitely good for expanding your ability to get the computer to do your bidding.

The problem with learning SQL is that SQL is one of the most dysfunctional functional programming languages ever. Look at the state of the database world. Every RDBMS has its own very distinct dialect of SQL that corresponds to features only available in its engine. Which is fine, feature competition is how software evolves. But at the core of that problem is that the original language was not very strong, and has now been extended. You don’t build a house on sand, I think the saying goes.

Additionally, SQL long ago stopped being a direct expression of anything the RDBMS was going to use to perform a query or other DB operation. And you can see this easily by looking at the execution or query plan for your SQL code, something the common RDBMS allows. In my mind, the fact that none of the major RDBMSs out there provide any alternative languages to SQL is a problem.

If you’re going to be compiling what I write in SQL down to some other byte code or AST type of thing, how about giving me some more options than just the local flavor of SQL? You know the saying about lipsticks and pigs? Whether you’re using Oracle, MS SQL Server, or MySQL, you’re still basically getting the SQL pig. It’s right there in the product name in most cases!

Why do I harbor such negativity about SQL? I am going to give you one really good example, OK, two.

First: it’s bass ackwards. Declaring all the stuff you want before you declare where to get it from is backwards. It obscures the relationship of input to output in the worst way.

 SELECT *
 FROM my_table

Should be:

 FROM my_table
 SELECT *

LINQ follows this convention, in part because MS wanted Intellisense to do its thing well (which it can because when you type a collection name first, the IDE can figure out what fields will be available and help you type them later). But also in part because it made writing the LINQ libraries easier. And if you look at a typical query or execution plan for some SQL code, you’ll see it does it this way under the covers… so why not make it explicit?
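For anyone who hasn’t seen the LINQ form, here is a small sketch using LINQ to Objects. The Row class is a hypothetical stand-in for my_table, and the method name is made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FromFirstDemo
{
    // Hypothetical row type standing in for my_table.
    public sealed class Row
    {
        public int Id { get; set; }
        public string This { get; set; }
        public string That { get; set; }
    }

    public static List<string> ThisValuesWithIdAbove(IEnumerable<Row> myTable, int minId)
    {
        // The source comes first, so by the time we reach "select"
        // the compiler (and the IDE) already know what a row looks like.
        var query = from row in myTable
                    where row.Id > minId
                    select row.This;
        return query.ToList();
    }
}
```

The query reads in the order the data flows: where the rows come from, which ones to keep, and what to hand back.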

The second problem is that the syntax of even the most core SQL language that you will find on every RDBMS in existence is just plain stupid and inconsistent. Compare the following two statements:

 INSERT INTO my_table (id, this, that)
 VALUES (1, 'foo', 'bar')

 UPDATE my_table
 SET    this = 'foo',
        that = 'bar'
 WHERE  id = 1

That’s insane. They’re as different as night and day. But is an insert really that different from an update? I don’t think so. In one case we create a new record with new values. In the other we update an existing record with new values. There’s absolutely no reason the syntax needs to be so different.

The other problem with insert is that the destination field name and the source value are really far apart in the code. If the insert statement is adding something like twelve values to a row, it can be difficult to tell whether the value list and the field name list match up. Also, if you’re like me and you like to keep code clean looking, a clean looking insert takes up twice as many lines as a clean looking update does because you need one line for a field name and one for the value.

This is an easy fix:

 INSERT my_table
 SET    id = 1,
        this = 'foo',
        that = 'bar'

This is a massive improvement I think. With this style of insert I don’t have to guess which value in my VALUES list corresponds with which name in the parenthesized list of field names to insert to. I can also envision a MERGE command that looks just like this, and combines the best of both worlds.

 MERGE my_table
 ON    id
 SET   id = 1,
       this = 'foo',
       that = 'bar'

This command would look at the SET values for matches on id within my_table. If it finds a match, set this and that to the values in the expression. If it doesn’t find a match, make a new record with that id and set this and that to the values in the expression.
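The behavior I have in mind can be sketched in a few lines of C#. This is not real SQL and not any database’s MERGE implementation: a dictionary keyed by id stands in for my_table, and the MergeSketch and Row names are hypothetical.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory stand-in for my_table, keyed by id.
public static class MergeSketch
{
    public sealed class Row
    {
        public string This { get; set; }
        public string That { get; set; }
    }

    // Returns true when a new row was created, false when an existing one was updated.
    public static bool Merge(IDictionary<int, Row> myTable, int id, string thisValue, string thatValue)
    {
        Row existing;
        if (myTable.TryGetValue(id, out existing))
        {
            // Match on id: behave like UPDATE.
            existing.This = thisValue;
            existing.That = thatValue;
            return false;
        }

        // No match: behave like INSERT.
        myTable[id] = new Row { This = thisValue, That = thatValue };
        return true;
    }
}
```

Calling Merge twice with the same id creates the row the first time and overwrites its values the second time, which is the whole point: the caller writes one statement and doesn’t care which case applies.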

There are some kinks here, like how to handle id fields that are autogenerated. But it only takes about a second to see that you either default to ignoring the value of id if id is an autogenerated field that is not matched, or you add a keyword to make that behavior explicit and raise an error when someone’s code tries to manually set a value on an autogenerated field.

 MERGE my_table
 ON    id
 SET   id = 1 AUTOIGNORE,
       this = 'foo',
       that = 'bar'

So, I think it’s pretty obvious by now why SQL is treated as a tool of last resort for the average programmer. The language is outside the comfort zone in the first place, and once you do use it, it’s not terribly friendly. I’m not even taking into account here the lack of top-notch IDEs for SQL, which is a whole ‘nother article in itself.

Unfortunately, SQL is a necessary evil. Data, and therefore a database, is at the heart of many, many applications these days and SQL is the tool we’ve got to interact with that data and the database. Hopefully, soon, there will be a serious alternative.

I Can’t Believe We’re Still Having This Discussion

I think of Joel Spolsky as a fount of wisdom and sage advice. But his latest article on the topic of test driven development (TDD) strikes me as a bit clueless. Thus sayeth Joel, “I feel like automated testing of everything, a lot of times, is just not going to help you.”

Joel has some astute analysis on the more “out there” aspects of the “agile” and TDD communities. But he also observes, “I don’t know, I’m going to get such flame mail for this because I’m not expressing it that well.”

His analysis of some of what has arisen as part of the whole “agile” fad is spot on. In practice, the so-called “agile” method is just business as usual with new names, different make-work project management tasks, and an environment in which developers get away with taking a long time to deliver underperforming code. So I can see why he’s concerned.

In software development the core goal is to deliver software that adds more value than it took to create. But software, by its very nature, is so good at delivering more value than it takes to build that the industry is only now getting to the point where we need to weigh value added against cost more carefully. It used to be much simpler.

Example: Form letters. Computers are ridiculously good at personalizing form letters. They’re so good that it’s useless to attempt to quantify the cost of developing a basic text editor to edit letter templates in, an address and demographics database to store personalization data in, and some application to marry the two.

Using the old typewriter approach it would take days for an army of typists to produce one-off letters for every person in the list. They solved the problem by lowering their standards, copying hundreds of letter pages with blank spots where the personalization would go. This resulted in ugly, obvious form letters. And it would still take a minute or two per page to put the blank in a typewriter, line up the page, and then type the address or whatever off of some list (which list also had to be painstakingly copied out by hand).

By contrast, even older computers can generate enormous quantities of such letters in minutes (and then some time to run the printer). They can generate envelopes at the same time. The computer will also fully justify the text to the right margin beautifully. As long as the original text is typo free, every single letter will be typo free. You get the idea. The problem is so easy that the computer will simply make it go away.

And with this sort of programming task, testing of software was not even really necessary– not the way we think of testing today. The requirements were pretty simple and the ultimate test– does this software do what we want– is easy to evaluate. At the same time, computers, and by extension programming languages, didn’t really have the capabilities (memory, speed, whatever) to make it easy to test the software via things like unit tests. So a couple of generations of computer programmers learned to master the art without ever writing anything remotely resembling the modern unit test.

So what is this modern unit test? Wikipedia says unit testing is:

… a method of testing that verifies the individual units of source code are working properly. A unit is the smallest testable part of an application. In procedural programming a unit may be an individual program, function, procedure, etc., while in object-oriented programming, the smallest unit is a method, which may belong to a base/super class, abstract class or derived/child class.

Now if we follow the test-driven development (TDD) approach of writing such unit tests, running them to ensure that they fail, and then writing the minimum amount of code to make the test succeed, we will easily achieve 100% code coverage. It’s easy. I’ve done it. Lots of people have done it.
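
For anyone who hasn't seen the cycle, here is a minimal sketch using Python's built-in `unittest` module. The `greeting` function is a made-up example borrowed from the form-letter scenario, not code from any real project. You write the test first, watch it fail because `greeting` doesn't exist yet, then add just enough code to make it pass.

```python
import unittest


def greeting(name):
    # Written second: the minimum code needed to make the test below pass.
    return f"Dear {name},"


class GreetingTest(unittest.TestCase):
    # Written first: this fails until greeting() exists and behaves.
    def test_greeting_addresses_recipient_by_name(self):
        self.assertEqual(greeting("Ada"), "Dear Ada,")
```

Saved in a file, this runs with `python -m unittest`. Every line of `greeting` is exercised by the one test, which is how the 100% figure falls out almost for free.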

However, 100% coverage via unit testing of this type is just not that valuable. And I think this is what Joel is objecting to.

First, 100% test coverage of code under test doesn’t mean you’ve correctly and completely captured the business requirements.
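
A small Python sketch shows how this happens. The requirement here is invented for illustration: orders of 100 or more get 10% off. The function has an off-by-one mistake in its comparison, and two checks still execute every line and both branches.

```python
def discounted_total(total):
    # Hypothetical requirement: orders of 100 or more get 10% off.
    # Bug: the comparison should be >=, so an order of exactly 100 is missed.
    if total > 100:
        return total - total // 10
    return total


# These two checks execute every line and both branches of the function...
assert discounted_total(50) == 50
assert discounted_total(200) == 180

# ...yet the boundary case the requirement cares about is still wrong.
boundary = discounted_total(100)   # gives 100, but the requirement says 90
```

A coverage report would show nothing missing here. The only way to catch the bug is a test written from the requirement itself.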

Second, code generators produce code, but not meaningful unit tests. Example: with .NET’s datasets, you easily end up with a bunch of code you have to write tests for to get to 100% coverage. All it takes is referencing one table or stored procedure as the base for your dataset that has fields your code will never access but which are in the table or proc for someone else’s benefit. The autogenerated code will contain a bunch of functions related to those fields and your coverage will drop dramatically. Now instead of getting the benefits of rapid application development via the IDE you have to waste your time going back in and deleting auto-generated code to get coverage back to 100%. Or writing meaningless tests to exercise that code. Bleah.

A third trouble spot for 100% coverage is stored procedures or triggers or other code that lives in the database. As it stands, there aren’t really tools to help you ensure that both executable code and database code are covered by your tests. My guess is that Microsoft will get there first, if they aren’t already close.

A fourth and particularly pernicious issue with coverage is dynamic programming and metaprogramming. If part of your code is expressed as a big text string in code that then evals/compiles that code and runs it… Your code coverage tool isn’t going to be much help deciding whether the code in the string was actually executed.
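
Here is a tiny Python sketch of the problem, assuming a line-based coverage tool that tracks lines in your source files. The `shipping` rule and its numbers are made up for illustration.

```python
# The real logic lives in a string, not in lines of this file.
RULE_SOURCE = """
def shipping(weight):
    if weight > 10:
        return 25   # never exercised by the call below
    return 5
"""

namespace = {}
exec(RULE_SOURCE, namespace)       # a single line in this file
cost = namespace["shipping"](2)    # only the light-parcel branch runs
```

From the file's point of view, every line ran. The heavy-parcel branch was never touched, and a tool that only counts lines in this file would have no line to flag for it.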

These are the biggest ways I can think of to get less value from code coverage measurements. You get either false negatives, where code doesn’t get executed because it never needed to be there in the first place, or false positives, where there’s more going on behind the scenes than the coverage tool can detect.

So, I’ll agree with Joel. Something is wrong with the religious approach to 100% unit test code coverage. And Joel actually gets it right when he says, “I might do more black-box tests, sort of like unit tests…”

Bingo! The problem for Joel is that he’s already equated TDD with unit tests and 100% coverage. And I guess that’s not totally incorrect. After all, that’s why Dan North stepped up to promote the idea of Behavior Driven Development. Personally, I’ll stick with the TDD acronym, because the “Test” in Test Driven Development doesn’t specify unit tests. And I think it’s very important to emphasize automated testing as integral to the process.

In Joel’s article he gives a hypothetical situation where 10% of the tests break when you move a menu or something. He uses this hypothetical to seemingly argue against TDD. But I disagree. You don’t need to stop unit testing or using TDD. If 10% of your tests break because you move a menu, you need to learn to write better tests and probably better code.

One way to improve the value of your automated tests is to stop thinking in terms of functions, subroutines, and methods. Remember, the goal of writing software is to get a computer to add more value than it takes to create the software.

So in real life, unless you are one of those rarified library designers, you are probably writing code that is supposed to add value in a very specific way or ways. Whether agreed on by committee over a period of months or being decided one-by-one by your on-site Agile-loving customer, the specific value adds are requirements. Each and every requirement is something you can write one or more automated tests for.

So, of course, you should start by writing test(s). They look and feel just like unit tests. They have setup and teardown. They have calls to your code’s classes, objects, functions, and procedures. Once you understand a chunk of the requirements, you write some tests to express that understanding. Then you write the least amount of code necessary to pass the tests. Your code coverage tool will presumably give you back a 100% rating. If it doesn’t, you probably wrote some extra code. Or you used a code generator or something else. But a quick analysis will show what areas of code are not covered. If it’s code you wrote by hand, you should either figure out a way to test the behavior you coded for, because it is part of the requirements, or remove the code, because it contributes no value at this point.

Once you start this way, you quickly learn not to write code in advance. You also start to think about ways to keep the code as flexible as possible overall. Requirements change, and when they do your tests will change. And they will fail. Which means you need to change your application code as well.

So do you need to write tests for all the requirements at once? Of course not. Do you really need to have 100% code coverage? Of course not. Does Joel have a valid point? Of course he does. But should you walk away from what he’s saying and think “oh yeah, test driven development isn’t worth the effort”? Absolutely not.

An above average developer is already doing test driven development, even if they don’t automate the testing process. The ultimate test is the user acceptance test. When someone runs the code does it do what they asked you to make it do? Ultimately, you are developing against that test. Good developers keep in mind what is necessary to pass the ultimate “user” test, and tend not to worry about the other stuff.

Of course, one of the best ways to ensure that your code passes the user acceptance test is to bake automated testing into your development process. If you are doing test driven development, you have to really make sure you understand the requirements so you can write your tests. The automated tests you get with this process provide a high level of comfort that new requirements, changes in requirements, refactoring existing code for performance or other reasons, and adding on new code won’t cause your application to fail the ultimate test.

Going in to work on code that has no unit tests, without writing any tests in the process, is basically performing acrobatics without a net. It might look great, but the falls can be dangerous. Test driven development doesn’t guarantee perfect code. But when you shift the thinking from 100% code coverage to 100% requirements coverage, the tests you write will improve. And setting an arbitrary level of code coverage required at that point is mostly make-work. It’s a good feedback tool for the developers, but not a meaningful milestone in and of itself.

The current evolution of tools that take automated test suites and actually generate requirements documentation from them is very exciting. I doubt it’s useful to try and build tools that write tests from requirements docs (although maybe someday…). But having some way for business analysts and managers to review the automated tests that are in place, in terms of requirements coverage, is going to provide a lot more value than giving them a graph that says, we have a suite of 95 tests that cover 96% of our code.

In my mind, if you don’t have an automated test strategy and start by writing tests using that strategy, you are putting the cart before the horse. By not writing tests first, you’re saying, “I don’t really care if I meet the requirements, just so long as it’s close enough.” But we’re entering a time where “close enough” is not going to sustain software development investment. We need to nail the requirements and waste as little time/code doing it as possible. Time is money and coding takes time. Using a test driven development strategy means focusing on the business requirements that add the most value. And expecting a high level of code coverage for those tests means less time wasted writing code that doesn’t actually help meet the requirements.

Delightenment

People like the Pragmatic Programmers and Steve McConnell have put a lot of thought into the question of how programmers can do well. Books like The Pragmatic Programmer and Code Complete are considered required reading among programmers who have a passion for the craft.

What I like about these two books, aside from the fact that they fit together extremely well as a prescriptive overview of best practices for coders, is their focus on quality. Code Complete is geared towards the inward, team-facing aspects of software quality, while Pragmatic Programmer is more focused on overall process, especially the parts involving users and customers. So the former offers advice on writing code that will delight your fellow programmers, and the latter is filled with tips for delighting your users. There is some overlap. And also some glaring omissions.

Following on the heels of this sort of “best practices” writing, some top-notch fellows (including the Pragmatic Programmers themselves) got together and wrote up a manifesto. But even before the ink on the Agile Manifesto was dry, there were groups turning “Agile” into The Next Great Methodology™. In the process the key values were lost. Perhaps the reason is that the writers of the Agile Manifesto did not set the bar high enough in the first place.

The Agile Principles begin with “We follow these principles: Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.” At its core, this can be summed up as “Highest priority: satisfy customer”. Frankly, that’s not good enough. Sure, satisfied customers are better than unsatisfied customers. Even unsatisfied customers are better than no customers at all, I guess. But I don’t think a word like “satisfaction” is helpful.

Ever take a customer satisfaction survey? Rate your satisfaction with this product on a scale of 1 to 5 where 1 is totally dissatisfied and 5 is completely satisfied. On this scale, a 3 is smack in the middle and means the customer was, in fact, satisfied. But they were not delighted. Being willing to accept a 3 here… is a bit like electing a C student president. Maybe we can do better. It’s no wonder Agile is the newest buzzword for “business as usual” in most places. The manifesto is a bit too self-effacing. Not to mention long.

There are some great parts in the manifesto, but they are all too easily ignored. Part of the reason for this is the name of the manifesto: “Agile”. That makes it sound like we’re nimble, maybe running sweaty through a jungle while being chased past a waterfall by a snarling Requirements Beast. Does that really sound like a fun, rewarding activity focused on the best way to delight users?

I propose a much simpler manifesto. One that significantly raises the signal to noise ratio.

Delightenment: The customer must be delighted. This requires a project team who are on the journey to delightenment.

It’s simple. The more delighted the project team are to be working on the project, the more likely they are to delight the customer.

There have been oodles of pages written about how to get software development teams to perform well, but they all boil down to a single core principle (which is buried mid-way through the Agile Manifesto and not reflected even remotely in the name). The core principle is: Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done. So simple. Yet so difficult.

There are a lot of choices to be made along the way, but ultimately the only choice for success is: make work a delight and the work will delight others.

This focus on delight is critical, and it needs to pervade the project and development organization. Miserable programmers who can’t afford to make even minor mistakes will not delight the customer. A scared or unhappy developer will do the minimum required to meet the project plan milestones. They will not take chances and they will not care much about whether the customer is satisfied, let alone delighted. Then, when the customer is not even satisfied, the programmer will put all his effort into demonstrating that the fault is not his (the easiest way to do this is to interpret all requirements documentation as literally as possible– “See? Works as specified!”). So what will it take to delight the customer? Delighted programmers.

How do you delight programmers? I’m not going to go into specifics. It’s easy to find examples of places most programmers would love to work: Google, 37Signals, FogCreek, even Microsoft (especially Research). Easy enough to find out what those guys do, how they do it… and I daresay that the success of the companies I’ve listed is due in large part to their focus on their development staff.

Those of us who don’t work at one of those prominent companies can still benefit from paying attention to how they go about the business of developing great software. As developers we can’t always wait for our managers to figure out the best way to delight us, either. If the company doesn’t seem interested in group delightenment, are there things we can do to make work more delightful? If we ever want it to get better, we have to start it ourselves.

And it’s not like we need to achieve total delightenment overnight. It’s a journey. If we think about delightenment as a competitive advantage, it doesn’t take much:

Two sorority sisters were out camping. One evening while they were warming their toes next to a campfire, a bear wandered into their campsite. One of the coeds reached for her hiking shoes and started putting them on. The other young lady said to her, “What are you doing? You can’t outrun a bear.”

“I know,” the first girl replied, “but I don’t need to outrun the bear– just you.”

Participate

Like hundreds of others, I have been watching the creation of Stack Overflow since the first announcement of the project. Being a long-time fan of both Jeff Atwood and Joel Spolsky I was thrilled to find them pairing up and doing something like a programming Q&A web site. I must admit that I did not attempt to get into the private beta, and for a long while I stopped listening to their podcast. But I did venture onto Stack Overflow the day it opened, and yesterday, after some time away, decided to return for a look. The site is pretty sweet.

On Dec 14, 2000, I answered my first programming question on a programming Q&A site. In this case the site was PerlMonks and the question had to do with what a regex did. Since then, I have on and off spent quite a bit of time on various sites or mailing lists devoted to programming Q&A, both asking and answering questions.

I learned over time that the process of setting up and writing my question usually helped me answer my question, either because I wanted to demonstrate due diligence or because writing the question up is a form of talking out the problem. Attempting to explain the problem makes us clarify the situation in our own minds first. So over time I have found myself asking fewer questions directly. What I have found, however, is that I often find the answers to my questions when I Google with some fairly exact text related to my question. For instance, if I get an obscure message, I copy the message directly into the search box. This almost always works.

The reason it works is that there is a community of programmers all engaged in the act of Q&A. And usually the question I have has already been asked and answered. So even if I never ask a question directly, I may be helping those who helped me by answering others’ questions. Obviously it’s not a direct return of the favor. But when the information is shared freely in a way that can be indexed by Google, every programmer benefits in the long term.

At some point I stopped spending time on PerlMonks. I did subscribe to ruby-talk, though. The mailing list format is not nearly as satisfying as the community web site approach found at PerlMonks, but it was good enough. After a few good years with Ruby, I eventually made the mistake of taking a job as a .NET developer (which is a story for another day).

After being involved with the Perl and Ruby communities, I know I did not feel motivated to participate in any VB or C# community web sites or mailing lists. The programming Q&A sites that I found where .NET developers went were all abysmal in some way: they were ugly, or hard to use, or the answers didn’t have any quality control, and so on. So while I might have Googled for stuff once in a while, I certainly didn’t answer questions or participate in any virtual groups. There is also the fact that .NET comes from a huge company that rakes in a fortune from its customers. What could possibly motivate me to help them make sure their customers are happy? Plus, I didn’t really want to be working with Microsoft technology. Even after about four years of working with it, I’m still not that interested in it.

And with some additional thought, I’m actually thinking that it’s a good thing I’ve mostly stayed away from the C#/.NET community. After all, I’m what is known as a Free Software zealot. At this point I think I’ve proven that I’m very pragmatic. After all, I’ve been doing .NET professionally for a while. But I look at what it means to do Q&A stuff… and there’s more to it than this nebulous sense that I’m giving back. It’s also very selfish. I just didn’t know it.

Every Ruby user that has a question that I can help answer is a Ruby user whose Ruby project will be more successful. Which means there’s one more person out there touting the wonders of Ruby– and probably mentioning how great the community is, too. Sooner or later, people who make decisions about technology will have heard of Ruby often enough and usually in a positive light, which means they’ll be more likely to go with Ruby in the future. That’s good for those of us who might be looking for work as Ruby developers. I know the number of Ruby jobs (especially outside of the Rails world) is small… but if helping someone else can expand Ruby’s reach, I’ve done myself a favor, as well.

Finally, all those answers I’ve handed out over the years, many of which can be found with a Google search on my name and keywords like “Ruby”, make what I think will someday prove to be an excellent reference. In fact, I know people have Googled me in the past when I’ve been looking for jobs, and the biggest problem was that I hadn’t made it clear enough that it was me they were finding with those Google searches.

So. One conclusion I’ve reached is that it’s important to participate in your community. That may not mean doing local user groups or conferences or anything social in that way, but it most certainly does mean being part of the online world devoted to the technologies you care about and the wider world of computer programmers who give a damn. I’m starting this web site as a way for me to participate.