Before we can make the case that microtest TDD is an effective change strategy, there are a few high-level aspects of it that we need to highlight. We tend to take these for granted in our case, but newcomers won’t already know them.
More even than before, geekery seems irrelevant. We’re living the natural outcome of a politics of hatred & deceit. This content is one way I find respite, and maybe it will help you a little, but…
Stay safe. Stay strong. Stay angry. Stay kind. Black Lives Matter.
Today, I’m writing for the newcomer. It’s going to be a little rambly, because the ideas aren’t partitioning a single conceptual level into its component parts. Think of me as going over a text with a yellow highlighter, bringing out some points newcomers might not readily see.
Twinning
When we do TDD, we are working in a single textual codebase that we use to produce two entirely separate applications, each serving a different purpose. The one app will be the one we ship. The other app will be a primary tool for producing the first app more quickly.
The two apps, let’s call them the shipping app and the making app, use many of the same parts of that textual base. Normally, the making app uses a superset of the parts of the shipping app. But these parts are combined and arranged and used in significantly different ways.
The shipping app uses its parts to make a program that users can do domain stuff with: accounting, content-management, machine control, graphics, and so on. Back in the ’80s, cavemen like myself worked only & entirely with the shipping app, and its parts were all the parts.
The making app takes those same parts, plus others, to make a program that developers can do developer stuff with: primarily, taking the same parts the shipping app uses, exercising them in isolation, and telling us whether or not they do what we wanted them to do.
Tho the shipping app could be doing practically anything, the making app really just does one thing: put a shipping part under the microscope, exercise it, and tell us about it.
That’s handy, because it means we can use the same "other parts" we mentioned, over & over, w/o a lot of custom code. We don’t write a brand new making app for every shipping app, but primarily use the same pre-fab making-app-parts from one to the other.
The most notable differences between the making app and the shipping app are these:
- the making app is fast.
- the making app’s interface is tightly tuned to its purpose.
- the making app is based on a small well-defined library or framework that we don’t re-roll each time.
The making app, by the way, runs right on the developer’s box, regardless of where the shipping app runs. In fact, most grown-up development environments know how to run that app and show its results right out of the box.
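Here is a minimal sketch of twinning, assuming Python and its standard `unittest` library as the pre-fab making-app framework. The names `total_with_tax`, `shipping_main`, and `TotalWithTaxTest` are hypothetical, invented only for illustration. One textual base holds a shipping part, an entry point for the shipping app, and a test that belongs only to the making app.

```python
import unittest


def total_with_tax(prices_in_cents, tax_percent):
    """A shipping part: sum the prices and add whole-cent tax."""
    subtotal = sum(prices_in_cents)
    return subtotal + subtotal * tax_percent // 100


def shipping_main(argv):
    """Entry point of the shipping app: does domain work for a user."""
    prices = [int(arg) for arg in argv]
    return f"Total due: {total_with_tax(prices, 10)} cents"


class TotalWithTaxTest(unittest.TestCase):
    """A making-app part: puts total_with_tax under the microscope."""

    def test_adds_tax_to_subtotal(self):
        self.assertEqual(total_with_tax([1000, 500], 10), 1650)
```

If this lived in a file, the making app could be run with `python -m unittest` on that file, while the shipping app would call `shipping_main` and never touch the test class.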
Alternation
For each (interesting) part of the shipping app, the developer defines a corresponding part of the making app. During development, the developer is continuously bouncing her focus back and forth between these two parts, one from the shipping app, one from the making.
The developer-added parts of the making app are sets of tests. They are short triplets of code of the form [arrange,act,assert]. Arranging puts the shipping part on a slide in a known state. Acting tells that shipping part to do something. Asserting sees what it does in response.
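A rough sketch of what one of those triplets can look like, again assuming Python’s `unittest`. The `Account` class is a hypothetical shipping part made up for this example, not anything from a real codebase.

```python
import unittest


class Account:
    """Hypothetical shipping part: a balance that accepts deposits."""

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


class AccountTest(unittest.TestCase):
    def test_deposit_increases_balance(self):
        account = Account(balance=100)           # arrange: known state
        account.deposit(50)                      # act: tell it to do something
        self.assertEqual(account.balance, 150)   # assert: see what it did

    def test_rejects_non_positive_deposit(self):
        account = Account(balance=100)           # arrange
        with self.assertRaises(ValueError):      # assert wraps the act here
            account.deposit(0)                   # act
        self.assertEqual(account.balance, 100)   # state is left untouched
```

Each test is only a few lines long, and each one says exactly which behavior of the part it is checking.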
Alternation between the making part and the shipping part is so frequent and well-defined that, again, most grown-up development environments use a single keystroke to flip back and forth.
Isolation
The shipping app arranges parts in what we call a dependency graph, with one part depending on another, and it depending on others, and so on. It never violates those dependencies. The making app actively seeks to isolate each shipping part from its dependency graph.
Isolation of parts is critical to the usefulness of the making app, and it’s one of the hardest ideas for newcomers to wrap their head around. Failures to grasp this idea are a primary cause of TDD adoption failures.
This isolation is why the making app runs so fast, which adds to its usefulness, but it’s also why the making app is precisely informative, which is the essence of its usefulness.
There are myriad techniques for achieving the isolation, and isolation is normally used only selectively. There is no formula for the right amount of isolation, which means that humans have to use their judgment to decide. This is why mastering TDD takes time.
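One of those many techniques, sketched under assumptions: pass the dependency in, so the making app can substitute a known one. The `greeting` function and its `now` parameter are hypothetical. In the shipping app the part reads the real clock; in the making app we hand it a fixed clock, which keeps the test both fast and precisely informative.

```python
import unittest
from datetime import datetime


def greeting(now=datetime.now):
    """Hypothetical shipping part that depends on the system clock."""
    hour = now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


class GreetingTest(unittest.TestCase):
    def test_before_noon_is_morning(self):
        nine_am = lambda: datetime(2020, 6, 1, 9, 0)   # isolated from the real clock
        self.assertEqual(greeting(now=nine_am), "Good morning")

    def test_noon_onward_is_afternoon(self):
        noon = lambda: datetime(2020, 6, 1, 12, 0)
        self.assertEqual(greeting(now=noon), "Good afternoon")
```

Whether a given dependency deserves this treatment is exactly the judgment call described above. A clock usually does; a simple value object usually doesn’t.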
Iteration
The microtests in the making app and their subjects in the shipping app are developed interactively over time. We don’t sit down and write the tests then sit down and write their subjects. We don’t sit down and write their subjects, then sit down and write their tests.
TDD is "test a little, code a little, test a little, code a little", iterated over and over again. TDD style is inherently evolutionary, incremental, and iterative.
When it’s going well, these TDD iterations are quite short in duration: just minutes for each cycle.
It is possible to test too much without coding enough, or to code too much without testing enough. TDD’ers have specific advice & suggestions for keeping those parts in balance.
One aspect of all this that troubles newcomers: the tests themselves change during the iteration, and not just by addition. It’s not uncommon to have to make a change to a test you already wrote and passed, because the subject has changed its behavior or its interface.
We’ve no room to go into how & why this local inefficiency translates directly into global efficiency. But it does, and we’ll see a few hints of how it works when we get to the real case.
Indirection
Microtest TDD does not directly prove that the shipping app is what the customer wanted, or even that it works. It indirectly establishes the base from which we are more likely to achieve those goals: it proves the shipping parts do what the developer thinks they do.
This is the Pieces Premise said another way, really. What we’re saying is that, to satisfy the customer we have to put shipping parts together in the right way to do the right thing. If the shipping parts don’t do what we think they do, we’ll never get to satisfying the customer.
I’ve learned many things doing TDD these twenty years, but one of the most startling things I learned is this: the overwhelming majority of software defects come from simple one-liner brain-o events.
Programmers translate ideas from their minds into highly structured, highly detailed imperative text. This kind of work is fraught with opportunities for the most trivial sorts of mistake.
These are exactly the kind of mistake that we, when we’re speaking & hearing, work around rapidly at seemingly no cost. That’s our humanness at work. But they are also exactly the kind of mistake that computers aren’t capable of working around.
And that is why we need to prove first that our code does what we want it to, before we determine whether it does what they want it to.
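To make the one-liner brain-o concrete, here is a small sketch with a hypothetical rule, `is_adult`, and an invented constant, `ADULT_AGE`. The mistaken version differs from the intended one by a single character, and a pair of microtests at the boundary is enough to catch it.

```python
import unittest

ADULT_AGE = 18  # hypothetical domain rule, chosen only for illustration


def is_adult(age):
    # The brain-o version would read `age > ADULT_AGE`: one character off,
    # invisible to the compiler, and wrong for exactly one input value.
    return age >= ADULT_AGE


class IsAdultTest(unittest.TestCase):
    def test_exactly_adult_age_counts(self):
        self.assertTrue(is_adult(18))

    def test_one_year_short_does_not(self):
        self.assertFalse(is_adult(17))
```

Neither test says anything about what the customer wanted. They only establish that the part does what the developer meant it to do.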
Okay, in conclusion, that was, as predicted, a bit of a ramble.
But I felt we needed just a little more thickness to our description of TDD before we lay out how it represents an effective strategy for change.
The ideas of Twinning, Alternation, Isolation, Iteration, & Indirection are laced through the actual practice of TDD, as opposed to any simple pseudo-algorithm for doing it. I wanted to convey a sense of the feel of it.
Next stop: How microtest TDD works as a change-strategy.
I understand the Pieces Premise very well, and I find myself doing TDD that way most of the time. But I find it a partial solution, because the pieces change as a result of requirement changes.
TLDR: I think integration and system level tests have an important role in making sure we hook everything up and that changes don’t break things.
Now in a more detailed way:
If we test component A assuming component B does something, and then component B changes its behavior, unit tests might pass while the system doesn’t work. This becomes even more prominent when using mocks. The B mock in the A test mocks the expected behavior of B. If we change the real behavior of B and miss changing the mocked behavior, the tests for A will be wrong. In a large system this can happen quite easily, because we don’t always remember all the places B was used, and we don’t always look for them. Even if we look, we can make wrong assumptions and miss things.
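A small sketch of that scenario, with every name (`RealB`, `StaleBStub`, `a_format_price`) hypothetical. Here B has been changed to return cents, while the hand-written stand-in used by A’s test still returns dollars.

```python
import unittest


class RealB:
    """Component B after a requirement change: prices are now in cents."""

    def price_of(self, item):
        return {"apple": 150}[item]


class StaleBStub:
    """Stand-in for B written before the change: still answers in dollars."""

    def price_of(self, item):
        return 1.50


def a_format_price(b, item):
    """Component A: still assumes B returns dollars."""
    return f"${b.price_of(item):.2f}"


class AFormatPriceTest(unittest.TestCase):
    def test_formats_price_from_b(self):
        # Passes, because the stub encodes B's old behavior.
        self.assertEqual(a_format_price(StaleBStub(), "apple"), "$1.50")


# Wired to the real B, the same call produces a different answer.
wired_result = a_format_price(RealB(), "apple")
```

The test for A stays green, yet `wired_result` is `"$150.00"` rather than `"$1.50"`. A test that exercises A together with the real B is one way to surface that mismatch.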