The truth about TDD
7 points by _sdegutis 6 years ago | 1 comment

mikekchar 6 years ago

This is pretty far down the page. Possibly only a discussion between me and you ;-) I enjoyed your article. Thank you for writing it. However, I have a different point of view which you might find interesting.
There is a difference between TDD and testing. Testing is something you do to see if what you've done is correct. Quite frequently I work the way you described and it's very nice to feel confident that I have implemented what I set out to implement.
At that point, I often just throw the test away. Sometimes, if I have a pairing partner with me, they'll be shocked. "Why did you throw away that perfectly good test?", they will ask. "Because I wrote it to find out whether the code works, and now I know that it works."
TDD is about something else. With TDD you aren't so much worried about the correctness of the behaviour; you're worried about whether the behaviour has changed since the last time you touched the code. Ideally, if it has changed, it would be nice to know what changed. Even better, it would be nice to know what assumption was violated that caused the change in behaviour.
Imagine that you have an application that you have tested inside and out. It works perfectly. Next imagine that you have a special magical device that tells you if the behaviour changes every time you modify the code. It doesn't tell you if the behaviour is correct, it simply tells you that it changed, what the change was, and what caused it to change.
Now every time you add code, if the behaviour doesn't change, you know that it is still operating correctly -- because it was before and it hasn't changed. If the behaviour does change, you can observe the behaviour and decide if the change is good or bad. If the change is good, then the code is still operating correctly. If the change is bad, then the code is operating correctly except for the change. If your magic device also tells you where your assumptions are violated, then it becomes easy to decide how to make the behaviour good again.
There are a couple of really cool things about this magic device. It doesn't need to know if the system is operating correctly or not (which is difficult in most cases and impossible in the general case). It just needs to know if the behaviour is the same as before (which is a much simpler problem). The other really cool thing is that if you make a change to the behaviour of the system and the magic device doesn't detect it, then you know the magic device is broken. Since the magic device is very useful, it is probably a good idea to fix it right away.
Of course the magic device is a test suite. But it's important to understand that it's a very special kind of test suite. It measures behaviour and detects if the behaviour changes -- not if the behaviour is correct (you can test that separately, either in a manual or automated fashion). Often in legacy code I'll write a whole bunch of "tests" by running functions with various inputs and recording the results. That's all I need. I don't need to know if the results are correct or not. As I modify the code and watch the differences, I can determine if the differences are good or not, and I may find bugs. But bugs are not my main worry with this style of test suite. I'm using the test suite to inform (and later confirm) my assumptions about the behaviour.
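To make that concrete, here is a rough sketch of that recording style in Python (the legacy_tax function and its recorded values are invented for illustration; the "expected" numbers are simply whatever the code produced when they were captured, not values anyone has verified as correct):

    # Characterization-style tests: the expected values were captured from the
    # existing code, so a failure means "the behaviour changed", not "the code
    # is wrong". legacy_tax is a stand-in for whatever legacy code is being
    # pinned down.
    import unittest

    def legacy_tax(amount, region):
        # Placeholder for the real legacy function being characterized.
        rate = {"EU": 0.20, "US": 0.07}.get(region, 0.0)
        return round(amount * rate, 2)

    class TestLegacyTaxBehaviour(unittest.TestCase):
        def test_recorded_outputs(self):
            # (input, recorded output) pairs captured from the current code.
            recorded = [
                ((100.0, "EU"), 20.0),
                ((100.0, "US"), 7.0),
                ((59.99, "XX"), 0.0),
            ]
            for args, expected in recorded:
                self.assertEqual(legacy_tax(*args), expected)

    if __name__ == "__main__":
        unittest.main()

As the code under it changes, a failing case points at exactly which recorded behaviour moved.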
Second, this kind of test suite needs to tell you what is actually wrong. For example, you might have a test that determines if a particular result was produced. If the test "fails", you might say "the test fails". This is pretty useless, though. Now I have to go and debug the code. Instead, I want to be told exactly what is different between what I expected and what I received. I also want to be told what context the program was in when I got the result. So I need to be able to see at a glance the input, output and processing (big hint: fixtures, as convenient as they are, are usually bad because you can't see the input).
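As an illustration (a sketch of mine, with made-up names, not something from the article), here is the difference between a bare pass/fail and a failure message that carries the input, the expected value, and the actual value:

    # A failure here reports the input that produced the bad result along with
    # the expected and actual values, so there is no need to re-run the code
    # under a debugger just to see the context. parse_price and the sample
    # inputs are invented for the example.
    def parse_price(text):
        return float(text.replace("$", "").replace(",", ""))

    def test_parse_price():
        cases = {"$1,200.50": 1200.50, "$0.99": 0.99}
        for raw, expected in cases.items():
            actual = parse_price(raw)
            # The message spells out input, expected and actual in one place.
            assert actual == expected, (
                f"parse_price({raw!r}) returned {actual!r}, expected {expected!r}"
            )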
Third, this kind of test suite needs to tell you where in the code the problem is likely to be. If you change the behaviour in one place, then ideally a single test will fail. This highlights the difference between a test suite for testing correctness and a test suite for detecting changes in behaviour. If you were testing correctness and you had the same behaviour in several different contexts, then you would expect to have several tests to ensure that the behaviour is correct in each context. With this kind of test suite, you actually want to test the behaviour once and then simply ensure that the same code paths are followed in each context. Then, when the behaviour changes, you only get a single failure -- in the test related to that behaviour.
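One way to get that single-failure property (my own sketch of the idea, with invented names) is to pin the shared behaviour down in exactly one test, and have the tests for each context only check that the shared code path is the one being followed:

    from unittest import mock

    def normalize_name(name):
        return " ".join(name.split()).title()

    def greet(name):
        return f"Hello, {normalize_name(name)}!"

    def test_normalize_name():
        # The behaviour itself is pinned down here, and only here.
        assert normalize_name("  ada   lovelace ") == "Ada Lovelace"

    def test_greet_uses_normalize_name():
        # The context test only checks that the shared code path is followed,
        # so a change to normalization produces exactly one failure (above).
        with mock.patch(__name__ + ".normalize_name",
                        return_value="Ada Lovelace") as stub:
            assert greet("  ada   lovelace ") == "Hello, Ada Lovelace!"
            stub.assert_called_once_with("  ada   lovelace ")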
When writing these kinds of tests, you will find something a bit strange. In order to ensure that a test fails whenever the behaviour changes, you need 100% line coverage and also 100% branch coverage. In other words, you need to be able to write tests that exercise all of your code.
If you want to ensure that only one test fails whenever the behaviour changes, then you will have to break up your code so that you have access to just the functionality that you need.
What you will discover is that you are doing white-box testing, not black-box testing, because you actually want to see what the implementation is doing. The best analogy I've heard is that it's like putting watchpoints in a debugger: while the code is running, you can inspect its state to see what it's doing.
This is where the real kicker is: it forces you to expose state rather than hide it.
And I'll leave that in a paragraph by itself, because the implications are huge for design. Quite frequently we choose to abstract away state. Sometimes it is not even measurable. "As long as my function returns the correct value (no matter how complicated it is to generate), there is no need to see the inner workings. In fact I don't want to see the inner workings because it will add complexity to the system. I want a simple API and I'll push all the complexity inside it". This is the antithesis of TDD, because then we can't inspect the running state of the system and pinpoint where behaviour changes (except in really large chunks).
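Here is a small sketch of what exposing state can look like (my example, names invented): the same computation written once as an opaque function and once with its intermediate results held where a test can see them.

    # Hidden-state version: only the final answer is observable from outside.
    def score_opaque(text):
        words = [w.strip(".,").lower() for w in text.split()]
        return len(set(words)) / max(len(words), 1)

    # Exposed-state version: each intermediate step is inspectable, so a test
    # can pinpoint which step's behaviour changed.
    class Scorer:
        def __init__(self, text):
            self.words = [w.strip(".,").lower() for w in text.split()]
            self.unique = set(self.words)
            self.score = len(self.unique) / max(len(self.words), 1)

    def test_scorer_steps():
        s = Scorer("The cat sat on the mat.")
        assert s.words == ["the", "cat", "sat", "on", "the", "mat"]
        assert s.unique == {"the", "cat", "sat", "on", "mat"}
        assert abs(s.score - 5 / 6) < 1e-9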
Obviously, there are lots of really good advantages to abstraction and encapsulation. However, there are ways of getting those benefits while still exposing state at every level. Essentially, you are forced to create a many-layered system, and you are forced to think hard about those layers because...
There is another important result of TDD (or at least of the TDD I describe above). I mentioned that you have to be able to see at a glance the contexts that you are creating when you are writing the test. This means that it must be simple (and concise) to create the collaborators you need to build that context. This, in turn, means that you have to reduce the complexity of the inter-dependencies between collaborators, and you especially need to be able to construct the dependencies clearly. Things like global variables and singletons become big liabilities in this kind of environment.
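A minimal sketch of that "constructible at a glance" idea (again, my own invented names): the collaborator is handed in explicitly rather than reached for through a global, so the whole context of the test is visible in the test itself.

    # The clock is passed in, not pulled from a global or a singleton, so a
    # test can build the entire context it needs in a line or two.
    class FixedClock:
        def __init__(self, today):
            self._today = today

        def today(self):
            return self._today

    class Report:
        def __init__(self, clock):
            self.clock = clock

        def header(self):
            return f"Report for {self.clock.today()}"

    def test_report_header():
        # The input, the context and the expected output all fit on screen.
        report = Report(FixedClock("2019-05-01"))
        assert report.header() == "Report for 2019-05-01"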
The result of all of this is that you need to have collaborators with simple dependencies (low coupling), you need to have many layers, where the state of the system is always expressible, and you need to have simple interfaces on your systems.
The most important thing to understand is that TDD does not design your code for you. Of course, you need to do that yourself. Rather, it provides a series of constraints (as a result of always being informed of the behaviour of the system at all levels) that enforce certain good qualities in the resulting design.
Having said all that, I know many people who do not like designs that follow the constraints TDD enforces. They prefer larger, more complicated interfaces. They prefer APIs that hide the inner behaviour and make it impossible to examine the state of the code (without actually running it in a debugger). That is a legitimate choice. However, I'm coming up on about 20 years of doing it the TDD way, and my experience has been that a system designed in that fashion is more flexible, easier to read and reason about, and considerably easier to work with. As always, YMMV.
I hope that provided some insight into a different way to look at TDD.