Where automated testing should (and shouldn’t) fit in your testing strategy


Having worked with automated testing in various forms for a few years now, I’ve come to a few conclusions about where it fits in the mindset of a software developer—and about how it can become a crutch in programmer behavior if we fit it improperly.

Testing vs. checking

For a while now, something has felt strange to me about the way automated tests are often marketed to the development community. For one, I think the term “test” itself is quite misleading. It conjures up the idea that we’ve found a viable replacement for manual testing; that, if all tests pass, we are guaranteed bug-free software. But automated tests aren’t really testing a system at large so much as they are checking specific behavior within a system.

For example, suppose I want to test out the security around our billing page. I could write a couple of integration tests, by using a tool like Selenium, to confirm the following assumptions:

  • “If I log in as an account owner, I should be able to access the billing page.”
  • “If I log in as a normal user, I should not have access to the billing page.”

These tests confirm that what we expect to happen actually happens, but they don’t confirm that what we don’t expect to happen (because we aren’t testing for it) won’t happen. For instance, neither test would prove that going to the billing page doesn’t, say, inadvertently kick off a process that deletes your user account, or trigger any of the countless other things that could go wrong. We aren’t testing that a system actually works; we are merely checking that specific assumptions we made aren’t broken.
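To make this concrete, here is a minimal Python sketch of the two billing checks. It is only an illustration: rather than driving a real browser through Selenium, it uses an in-memory stand-in, and the names `FakeApp` and `can_access_billing` are hypothetical. The stand-in contains a deliberate bug of exactly the kind described above, and both checks still pass.

```python
class FakeApp:
    """Hypothetical in-memory stand-in for the application (not a real API)."""

    def __init__(self):
        # Maps a login name to its role.
        self.accounts = {"owner": "account_owner", "user": "normal_user"}

    def can_access_billing(self, name):
        role = self.accounts[name]
        if role != "account_owner":
            # Deliberate bug: denying access also deletes the account.
            del self.accounts[name]
            return False
        return True


def check_owner_can_access_billing(app):
    return app.can_access_billing("owner") is True


def check_normal_user_cannot_access_billing(app):
    return app.can_access_billing("user") is False


app = FakeApp()
results = [
    check_owner_can_access_billing(app),
    check_normal_user_cannot_access_billing(app),
]

# Both assumptions hold, so the "suite" is green...
all_checks_pass = all(results)
# ...but nothing asked whether the normal user's account still exists.
account_survived = "user" in app.accounts

print("all checks pass:", all_checks_pass)
print("normal user's account survived:", account_survived)
```

The suite reports success because it only asks the two questions we thought to ask. Noticing the missing account takes either a third check that someone had to think of, or a person actually using the software.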

The same is true for unit tests. While they test very narrow and well-defined functionality, they aren’t going to test for, as Uncle Bob states, “the stuff out at the boundaries of the system.” Also, with unit tests, there is often talk of “100% code coverage.” James O. Coplien argues that this is pragmatically impossible to achieve if we define it as:

…having examined all possible combinations of all possible paths through all methods of a class, having reproduced every possible configuration of data bits accessible to those methods, at every machine language instruction along the paths of execution.

If we truly achieved this, he argues, we’d have on the order of trillions of scenarios to test for even modest software.
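A small sketch shows how far the coverage number reported by typical tools can be from Coplien's definition. The `average` function and the check below are made up for illustration. One check executes every line of the function, which a line-coverage tool would report as fully covered, yet an obvious input still crashes it.

```python
def average(values):
    # The only line of logic; the check below executes it,
    # so line coverage for this function would be 100%.
    return sum(values) / len(values)


def check_average_of_three_numbers():
    return average([2, 4, 6]) == 4


def crashes_on_empty_input():
    # A scenario the check above never exercises.
    try:
        average([])
    except ZeroDivisionError:
        return True
    return False


check_passed = check_average_of_three_numbers()
empty_input_crashes = crashes_on_empty_input()

print("check passed:", check_passed)
print("empty list crashes:", empty_input_crashes)
```

Covering every line is not the same as covering every configuration of data those lines can see, which is the distinction Coplien is drawing.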

So, where does this get us? I often visualize automated tests like building a frame (the tests we write) around a constantly morphing structure (the true behavior we want). We can nail lots of posts together to build a frame that begins to resemble the behavior we want, but we’ll never quite get there.


Writing automated tests provides basic boundaries for bug-free code, but it is by no means an end-all solution.

So, what’s a better way to describe these things? The term “automated checks” feels much more appropriate to me, and others agree. As James Bach writes, there is a big difference between what humans can do and what automated tools can do.

In the Rapid Software Testing methodology, we distinguish between aspects of the testing process that machines can do versus those that only skilled humans can do. We have done this linguistically by adapting the ordinary English word “checking” to refer to what tools can do.

With that said, Bach draws a distinction between software testing and software checking.

Testing is the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, etc. … Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.

Automated tests are really automated checks. And checking is just one small part of a complete testing strategy. It’s one kind of insurance policy among a whole suite of policies designed to make bugs less likely (but certainly not impossible) to be introduced into a system.

Maintaining your own human vigilance against bugs

So, what’s the big deal about calling a “check” a “test”? When we become overly confident in our ability to write bug-free code simply because a series of automated tests passed, the returns begin to diminish rapidly. Our minds start to let go of some of the natural deliberateness we once put into code before we had the feeble safety net of automated tests. At its worst, it means we become narrowly focused developers, using one relatively brittle measure of success as false justification that our testing is now complete.

If we were to truly test every possible scenario in even a modest system, there would quickly be orders of magnitude more tests to write than would be practical. We are better off relying on a series of other strategies, in addition to automated checks, to cover all the other permutations.
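Some back-of-the-envelope arithmetic shows how quickly this gets out of hand. The numbers here are assumptions picked for illustration: a system whose behavior depends on 40 independent yes/no conditions, and a suite that can run 1,000 checks per second.

```python
def scenario_count(independent_flags):
    # Each independent yes/no condition doubles the number of combinations.
    return 2 ** independent_flags


flags = 40                 # assumed number of independent conditions
checks_per_second = 1_000  # assumed speed of the check suite
seconds_per_year = 365 * 24 * 60 * 60

scenarios = scenario_count(flags)
years = scenarios / (checks_per_second * seconds_per_year)

print(f"{scenarios:,} scenarios")
print(f"about {years:.0f} years to run them all once")
```

Forty conditions already yields more than a trillion combinations, and running each of them once at that speed would take decades. Real systems also have inputs that are not simple yes/no values, so the true count is far larger.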

The TDD debate and the real question we should be answering

Is test-driven development (er, check-driven development) a good approach to writing code? If you read the tech pundits today, most everyone has a very strong opinion on the matter. But, I think we’re asking the wrong question. Asking if you should be employing TDD is like asking a basketball coach whether they should play a 2-1-2 zone, 2-3 zone, or man-on-man defense. They can all work and they can all not work. I’ve seen quality software written both with and without a TDD approach.

The real question we should ask is: how do we pragmatically make the introduction of bugs as improbable as possible (while weighing all the other important matters of software, like new features, deadlines, etc.)? To me, it starts with the right programmer mentality.

We should be talking just as much about these other things, because they are as important as automated testing, if not more so:

  • Writing expressive code. Many bugs and points of confusion can be mitigated by focusing on code clarity. As Phil Karlton once said, naming things is one of the two hardest skills in programming. Devoting more time to precise and meaningful naming can be a big benefit.
  • Peer code review. Have a colleague play devil’s advocate by reviewing the code you’re about to check in. (The Git pull request concept is a great facilitator for this.) They will almost certainly see your code from a different angle, and they’ll spot things you might not have been looking for.
  • Writing in small iterations. Get in the habit of committing incremental changes to your source control repository. This helps someone else visualize how you got from point A to point B, and makes it easier to see where things might have gone wrong.
  • Use the software you write. This sounds so simple, but is actually a thing most of us don’t really do fully. So many obvious bugs would be caught if we took off our development cap periodically, closed the debugger, and used the software as a consumer prior to committing any code. Even with the luxury of a dedicated QA team, this is something every developer should put into their own arsenal.
  • Think through all the possibilities. Automated checks are only as good as what they test for. So, we should spend more time discussing what to test. I’ve written about knowing what to test in a couple of prior posts.

To be clear, automated checks have a very important place in building sound software. But we need to be careful with our expectations. Placing too much value in them might give us a false sense of security, making us even more prone to introducing bugs. We need to make sure that our long-tested human processes still play a major role in our overall testing strategy.

Ka Wai Cheung is the original creator of DoneDone and author of The Developer’s Code. Follow him personally on Twitter via @developerscode and read more at Life Imitates Code.