Geeks And Geekery: Why I Use These Words

okay, fine. geeks and geekery it is.

what do these words point at for me, and why do i use them?

you know, asking this, there’s no way in hell i’m gonna answer without wanting to share my decidedly minority views about language. you just *know* that. please be advised that i know that these are not mainstream views. please further be aware that i’m quite familiar with mainstream views, and have made a conscious decision that i don’t agree with them.

i use language to change my world. i always do. i never don’t. it’s rare for me to say always and never like this, and i’m skeptical of it, but this is as good a case for it as i’ve ever known.

i don’t believe that words are fixed points in a rigid conceptual space, and i don’t believe that reality can fit in sentences. i don’t talk to transmit information. i talk to change my world. sometimes it works, sometimes it doesn’t. i constantly adjust my language, to better or worse effectiveness.

so with “geek” and “geekery”.

“geek” is a slang word deriving from carnivals, travelling circuses, sideshows, and the dirty world of 19th century entertainment. a “geek” was a role, a job. someone would be hired to be the geek. carnies used the word to variously describe the idea, the role, the job, and the person currently filling it.

a very popular sideshow act was “the wild one”. in its most archetypal form, this would be some unattractive fellow, with wild hair, dark skin, and flashing eyes. he’d be wearing animal skins, and would be made up to appear filthy and wild. he was demonstrably visibly human, and he was just as demonstrably an entirely natural other. he was wild. demonstrations included weird violence, a lack of language, and, again via archetype, the eating of the head off a live chicken.

these “wild men” — they weren’t always men, but most commonly — were called “geeks” in the trade. a carnival manager would ask a prospective employee what he could do, and the person would say, “i’ve got a great geek act.”

so a geek, percolated down through the years, carries a very strong overtone of “uncivilized”.

there are at least two other words in english that approach this same meaning: “dork” and “nerd”. i use “dork” all the time, more or less interchangeably with “geek”. i don’t use “nerd”, because i don’t like how it sounds. it’s okay by itself, i spoze, but it doesn’t feel like a great rhythmic participant. it seems to break the rolling beat of any phrase that uses it.

wil wheaton said more or less famously: “what makes you a nerd isn’t what you love, it’s how you love it.” i believe this saying — wrong-worded as it is for me — is at the very heart of this “uncivilized” overtone. geeks, dorks, nerds, love the subject of their investment. they love it so much that people who don’t share their interests quite often find them uncivilized.

my wife, virginia, is a gifted artist. she’s got *all* the usual art areas in her domain, but her long-term specialty was working with clay. she makes clay shapes, decorates them, fires them, then gives them away or keeps them herself or sells them. i traveled around with her for a few years where she was showing her wares at various festivals. and we got to hang with quite a few of her peers.

one day, she ran into a woman whose work and work-experience was quite similar to hers, and they started chatting. i was manning the booth, and watching the two of them. they just *chattered* away for an hour or so. at one point, va is telling this long involved story to the other person, and she delivers the punchline, something like this: “oh,” he said, “i thought you said five *grams* of gerstley borate!”. and she and the other woman just absolutely *fell* *out* laughing about this.

and i’m watching all this. and i’m realizing, holy shit, i am watching two *pottery* dorks dork out together. i was tickled pink, of course. i had not the slightest clue — still don’t — about the actual subject of their session, let alone the meaning of that bizarre punchline. but i am intimately familiar with what they were doing, in every respect *except* the subject. they were being geeks, dorks, nerds, together.

a “geek” is someone who is highly creative, highly technical, and highly attached to being both.

earlier, tim offered a parenthetical comment of “(Dev Joy?)” as an annotation to my coinage “geek joy”. that’s fine. it doesn’t upset me or anything. for me, tho, geeks are geeks regardless of subject, and devs are devs because they develop software. when i refer to the geeks on my teams or in the trade, i don’t mean just the programmers. i mean all the people who are highly creative, highly technical, and highly attached to being both. when i say “geekery” i mean all the activities that these highly technical highly creative highly driven people do when they are making things.

we are — and i am clearly in this camp — “uncivilized”, in the sense that we resist any attempts to quell, sedate, mitigate, reduce, quieten, our creativity, our technicality, or our drive. “civilization” is not a dirty word, or anything like that. but so often — especially to someone who is a geek — it’s used as a way to suppress the profound love we feel for whatever the subject of our geek nature is. and make no mistake. geeks of all stripes are frequently bludgeoned with this civilization club. (heh: see what i did there?..) many of us have spent our entire lives being told not to be that way. we are sensitive to that history and hurt by it.

steve biko said, “the greatest tool in the hands of the oppressor is the mind of the oppressed.”

although the dark connotations of “geek”, “nerd”, and “dork” are fading and will continue to fade, they still shade it. it is not at all uncommon for people who feel meanly-labeled to then adopt and embrace that label. in so doing, they are saying two things at once to two different audiences.

to the non-geeks i’m saying this: “it’s your fucking word, i would have preferred ‘friend’ or even just ‘person’. and i am using your mean little word because i now know it can’t hurt me any more.”

and to the geeks — all the geeks, of whatever stripe or interest — i’m saying “hey. hi. i’m one of us, too! let’s go geek out together!”

i used to be kinda sorry i was a geek. i spoze i am still kinda sorry when i don’t fit, but a lot less so than i once was. nowadays when i encounter non-geeks, i’m less likely to be sorry for me than for them. they are missing out.

Awkward and Graceful 2 (A Recovery)

Continuing last night’s wobbly muse on graceful and awkward collaborators…

In the light of day, I think I see four possible responses to the situation when your new code depends on an awkward collaborator or collaboration:

1) Ignore it (possibly just for now). Write your test anyway and suffer the expense. This is a legitimate judgment, tho one the hardcore among us have to be dragged kicking and screaming towards.

2) Don’t automate a test for it. Again, entirely legitimate, and actually more common than just ignoring it for old hands, tho less so for noobs. Use old-fashioned GAK testing instead.

3) Re-arrange around it. Change your code so that the part you most want to test no longer needs the awkward collaborator. This is bog-standard hardcore TDD, extremely common.

4) Isolate and fake it. Use one of the several means to avoid, “for the purpose of testing”, the actual awkward collaboration, and use some cheap simulacrum of it instead. Also extremely common, tho a lot of old hands might say it’s overused and prefer the third choice.

The third and fourth choices “re-arrange” and “fake” are by far the two choices most TDD’ers prefer. But I don’t know of anyone who never ever responds with “suffer” or “GAK test”. (GAK = geek at keyboard. It’s a geepawism to describe what most non-TDD’ers do to satisfy themselves that the coder and the code say the same thing.)
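To make the third and fourth choices concrete, here is a minimal Python sketch, assuming the awkward collaborator is the system clock. Every name in it (`is_expired`, `Session`, `fake_clock`) is hypothetical, invented only for illustration.

```python
import time


# Original shape: the clock is an awkward collaborator buried inside the logic.
def is_expired_awkward(deadline: float) -> bool:
    return time.time() > deadline


# Choice 3, "re-arrange": the comparison no longer needs a clock at all.
def is_expired(deadline: float, now: float) -> bool:
    return now > deadline


# Choice 4, "isolate and fake": keep the collaboration, but let a test swap in
# a cheap stand-in for the real clock.
class Session:
    def __init__(self, deadline: float, clock=time.time):
        self.deadline = deadline
        self.clock = clock  # any zero-argument callable returning seconds

    def expired(self) -> bool:
        return self.clock() > self.deadline


def fake_clock(fixed: float):
    """Return a test-only clock that always reports the same instant."""
    return lambda: fixed
```

A test of the re-arranged version just passes two numbers. A test of the faked version builds `Session(100.0, clock=fake_clock(101.0))` and never waits on real time.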

I want to reach back to the premises to show how they relate to this.

The whole conversation is predicated on the money premise. If TDD were about intellectual purity and not shipping more value faster, we would always choose to ignore the awkwardness and suffer the owwie. Many new-to-TDD teams do just that. What then happens, sooner or later, is that the cost of doing the tests outweighs their benefit. The team simply stops doing TDD. The process of stopping is spread out over a long painful time, and it tends to happen more or less insensibly. It begins when we start honoring team agreements around testing “chiefly in the breach”.

The chaining premise is invoked by the “isolate and fake” choice and maybe partly by the “re-arrange” choice. We are testing part of our system and agreeing to be satisfied that we’ve done enough to be “sure”.

Of course, the choice itself has the judgment premise written all over it. It takes a human using intuition, experience, technique, theory, the whole vague schmoosh of non-computable consciousness to see and decide on any of these choices.

And finally, re-arrangement suggests the steering premise, and if you’ve done it enough times it leads to pre-arrangement, which is the center of that premise. The reason the steering premise is so important is because it encapsulates and crystallizes so much TDD experience. We can eliminate or mitigate an enormous amount of awkwardness in advance, if only we’re alert to it and willing to address it.

And the morals of the story are . . . 1) Don’t start a muse too late at night. 2) Awkward collaboration is normal in real-world TDD. 3) Notice awkward collaboration and choose to do something about it.

Thanks, and have a pleasant day!

Awkward and Graceful Collaborators

Software Programs can be understood as (potentially huge) orchestras playing in concert.
Depending on your level of abstraction, you might imagine systems, subsystems, layers, packages, objects, or even functions as the individual players.

(Aside: Folks often make major distinctions in these abstractions, but to my touch, they all feel like the same thing, just with ever larger labels on ever larger subsets. And at bottom, 100% of it is Von Neumann architecture code.)

For the purposes of this conversation, I’ll be thinking and using classes and objects, as that is the main level at which daily geekery is conducted in the modern synthesis. But remember, the players could be any of these.

The players collaborate to perform their function. They are collaborators. We often speak of roles or responsibilities – sometimes very formally – and that language is pointing at the same thing. Parts of programs work together: They collaborate. I pretty much always use TDD for anything bigger than the very rare “three hours, ship it, never look at it again” thing. I won’t rehearse the reasons for this, but just offer my official view: it lets me ship more value faster when I work that way.

Anyway, when TDD’ing, all collaborators are decidedly not created equal, and they’re unequal in a peculiar way. Let me throw a strange pair of terms at you: “graceful” and “awkward”. It’s not an all-or-none thing but a continuum, with some collaborators being very graceful and some being very awkward, and plenty in between, and some awkward sometimes and not others.

TDD works for me because the tests are cheap. We’ve used this word before, and I used the phrase “easy to”. Easy to write, to read, to scan, to run, to change, to collect, to debug.

The idea is really important, “cheapness”, for a very straightforward and human reason: when things aren’t cheap, we don’t do as much of them. If the things have substantial extrinsic value, not doing as much of them is A Bad Thing[tm]. When I’m adding new functionality, my new stuff is usually merrily collaborating away, with my code, her code, library code, framework code, etc. etc. A graceful collaborator is one that makes or keeps my testing of the new functionality cheap. An awkward collaborator is one that prevents my testing from being cheap. (Remember as we go, continuum not binary value.)

The archetypal graceful collaborator in most modern OO languages is the string class. Many TDD’ers write their very first tests on code that manipulates strings. And the tests are chock-a-block with assertions that two strings do or don’t match.

Awkward collaborators aren’t always the same from one programming environment to another. In Java, the simplest awkward is usually the File class. In other worlds it might be different. A sampling of common awkwards: A ticks-since-1970 clock. A physical file, a relational database. Fronts for external devices, like screens or printers. Fronts for other complex programs, servers or report-writers.

What makes graceful “graceful” and awkward “awkward”? Well. Anything that makes me not want to use a test to validate that my code does what I meant it to contributes to awkwardness. And graceful is the absence of any of those things. Awkwardness can be in runtime, it can be in setup cost, it can be in difficulty obtaining the output data, it can be almost anywhere. In our material at Industrial Logic (cites at end), we just say “if it makes you not want to test it, call it awkward and do something about it.”

Strings are relatively graceful, because they require one-liner setup, one-liner probe, they have no externalities, they run fast, and for all but the noobs, their API is incredibly well-understood.
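A tiny illustration of that one-liner setup and one-liner probe, sketched in Python. The function `initials` is hypothetical, made up only to have something string-shaped to test.

```python
def initials(full_name: str) -> str:
    """Hypothetical function under test: first letter of each word, uppercased."""
    return "".join(word[0].upper() for word in full_name.split())


def test_initials():
    name = "grace brewster hopper"   # one-liner setup
    assert initials(name) == "GBH"   # one-liner probe
```

Nothing has to be created, opened, connected, or cleaned up, which is most of what keeps this kind of test cheap.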

(Note: Don’t make the mistake of thinking that this is because strings are simple or have no dependencies. Try learning every single corner of, say, C++ std::string. Pro-tip: bring food, as you won’t be done before lunch (next week).)

A substantial part of doing TDD in the real world hinges on this awkward/graceful distinction. In fact, the reason so many toy TDD exercises are toys is that they include few or no awkwards, where real-world TDD is replete with them. How one handles awkward collaboration, both potential and pre-made, is at the heart of successfully using TDD to ship more value faster.

There are several different ways to address awkward collaboration, and no single one ring to rule all awkwards. About all I can do is offer some generic advice, for the moment:

1) We struggle mightily to keep the current class — the new functionality — from becoming the newest awkward collaborator in our collection.

2) We can very often divide the work our code does into an awkward part and a graceful part. Sometimes that’s as easy as splitting one method into two. The first method turns awkward into graceful, the second method takes graceful.

3) We nearly always use interfaces when an implementation is inherently awkward. Big girls and boys supply a graceful (if usually incomplete or simplified) variant of an awkward FOR NO OTHER REASON than to keep from passing on the pain.

4) Alternate constructors — one taking an awkward, one taking a graceful, are extremely common.

5) Indirection is your friend in that 95% of your job that has almost no impact on program performance.

6) The most important part of your code is usually not the bringing together of the data, but the processing of it. If we intermingle those modes, we often create permanent awkwardness.

7) Often, the many paths through my code all have a prolog or epilog that’s awkward, with pure grace in the middle. If I tie those into the code in a non-separable way, I’m locking everyone into my awkwardness for all time.

8) The smaller an awkward’s interface, the cheaper it is to fake. (Faking has lots of names, most commonly mocking, a misuse of that term’s origin and intent. It is making an artificial “test only” graceful from some inherently awkward thing.)
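A few of these points (splitting awkward from graceful, alternate constructors, keeping the processing away from the gathering) can be sketched together. This is a minimal Python illustration under assumptions, not a recipe: the `Roster` class and its methods are hypothetical, and since Python has no overloaded constructors, a classmethod stands in for the second one.

```python
from pathlib import Path
from typing import Iterable, List


class Roster:
    """Hypothetical class: a list of names, one per line of input."""

    def __init__(self, lines: Iterable[str]):
        # Graceful constructor: takes plain strings, so a test needs no file.
        self.names: List[str] = [ln.strip() for ln in lines if ln.strip()]

    @classmethod
    def from_path(cls, path: Path) -> "Roster":
        # Awkward constructor: touches the file system, then hands off
        # immediately to the graceful one.
        return cls(path.read_text(encoding="utf-8").splitlines())

    def longest(self) -> str:
        # The processing lives entirely on the graceful side.
        return max(self.names, key=len) if self.names else ""
```

Tests for `longest` can be written as `Roster(["ann", "barbara"]).longest()`, with no file anywhere. Only `from_path` stays awkward, and it has almost nothing in it left to get wrong.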

Anyway, it’s getting on toward bedtime, and that list is weird and disjoint, so I’m going to leave it for now. Maybe just start here: Learn to recognize the feeling that you don’t want to roll a test because it’s not cheap enough to do so. Think about what could be changed — your code or others — that might ease that pain.

Five Underplayed Premises Of TDD

Five Underplayed Premises Of Test-Driven Development (Transcript)

Hey, it’s GeePaw! I’m here to tell you today about five underplayed premises of Test-Driven Development. These premises form the kind of fundament under which almost all TDD proceeds. And when I say that I’m a TDDer, I almost always mean I am operating inside the little ring formed by these five test-driven development premises. Let’s check them out.

We’re In This For The Money

The first premise of TDD is what we call the money premise, and it goes like this. We’re in this for the money. I know, that seems like a strange thing to say, right? Well, what does it mean? We make money in software development through one means exactly, and just the one. And that is shipping more value faster. That’s how we do it. That’s where the money is in software.

We TDD because test-driven development is the best way we’ve devised so far to actually do that. TDD is about more value faster, and it’s not about anything else. Well, as soon as I say that, of course, I have to talk about some of the things it’s not about because they can be really confusing. And there’s a lot of misinformation out there on the internet.

The first thing TDD is not about is this. TDD is not about good citizenship. You are not immoral if you don’t TDD. You’re not a bad looker-forwarder or a bad protector of the future. It’s not about citizenship. TDD isn’t even about raising the quality of your code. Now TDD very often does increase the quality of teams that aren’t doing it to begin with, but that’s actually neither here nor there, because TDD isn’t about that. TDD is about more value faster.

The other thing it’s not about? It’s not about art and it’s not about craftsmanship. It’s not about the excellence of our high standards. The reason we test drive is by test driving, we go faster. We put more value into the field faster than if we didn’t do TDD. And that is the money premise. We’re in this for the money.

We Happily Rely On Individual Judgment

The second premise of TDD is the judgment premise, and it says we rely every day, all the time, on individual human judgment. You know, there’s a lot of folks out there who might try to convince you that test-driven development is a kind of an algorithm for coding. You just follow these steps in this order and you will get TDD.

It’s not an algorithm for coding because there isn’t an algorithm for coding. All of those steps and pieces and parts of TDD, in order to choose which one to apply when, I am constantly making individual human judgments. And just for the record, we’re pretty happy that there isn’t an algorithm for turning human words into code. That’s what we do for a living, right? We turn human words into actual running programs.

The day after there becomes an algorithm for doing that, me and you and everybody in this trade is going to be looking for jobs on the night shift at 7-Eleven, and a week after that we’re all going to be hiding in the rubble from Skynet.

So it’s actually a good thing that there’s no algorithm for code. But this premise tries to highlight the extent to which you are going to be required– if you’re doing TDD– you’re going to be required to make active, individualized judgments. The judgment premise says we are absolutely, routinely, every day, all the time happily reliant on individual humans using their individual best judgment to guide us through the process.

Internal Quality And Productivity Correlate Directly

Next up we have the correlation premise. The correlation premise says that internal quality and productivity are correlated. They go up together and they go down together. They have a direct relationship. Now to understand that, you need to know a little bit about what I mean when I say internal quality. Internal versus external.

External quality is anything a user of your system could tell you about that system. Is it slow or is it fast? Does it handle all the cases correctly or only some of them? Does it stay up and running the whole time no matter what? These are all characteristics of external quality.

Internal quality, on the other hand, is things that you could only tell by studying the code. Internal quality is stuff like is the code scannable, readable, the factors that make it easy to change. Is it well-factored, broken into chunks that we can manage and change independently of each other? Is it well-tested? These are the sorts of things that go into making internal quality.

To circle back around again, you can trade external quality for productivity. In other words, if you don’t care that the screen runs kind of slow in this particular part of the program, then I can get done faster, which means I can produce more because I can spend more time on other things. And so on and so forth.

Internal quality doesn’t work that way. Why? Because the third most important factor in my daily output (given that the first two factors are, one, how skilled am I, and two, how hard is the domain) is this: where do I start? And “where do I start” incorporates all those things we call internal quality.

So the correlation premise is very clear. It says that internal quality and productivity are correlated. They go up together and, sadly, they go down together. And that is the premise.

Test A Chain By Testing Its Links

The fourth premise is the premise we call the chaining premise, and it goes like this. The way to test a chain is to test each individual link in that chain. How does that work? Well, the idea is this. Our programs are always built out of smaller pieces and mid-sized pieces and then larger pieces. And there’s a chain. We call it a dependency chain.

Well, the chain premise is telling us that we write tests such that mostly what they concentrate on is individual connections in that dependency chain. So if I have a bunch of things, A, B, C, D, and E, where A calls B, B calls C, and so on, what I focus on first is how to write tests that can prove that A works assuming B works, and B works assuming C works, and so on and so forth until I’ve covered the entire chain. When I do that, I get the cheapest test that will cover the most ground for me. And that is the chain premise.
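As a rough sketch of what that looks like in code (Python, with hypothetical classes `A`, `B`, and `C` and made-up pricing logic), each test below checks one link by handing it a stand-in for the next link that works by definition:

```python
class C:
    def base_price(self, sku: str) -> int:
        # Stand-in for something awkward, e.g. a database lookup (hypothetical).
        return {"apple": 100, "pear": 250}[sku]


class B:
    def __init__(self, c):
        self.c = c

    def price_with_tax(self, sku: str) -> int:
        # B's own job: add 10% tax (integer cents) to whatever C reports.
        return self.c.base_price(sku) * 110 // 100


class A:
    def __init__(self, b):
        self.b = b

    def total(self, skus) -> int:
        # A's own job: sum whatever B reports.
        return sum(self.b.price_with_tax(s) for s in skus)


class FixedB:
    """Test-only B that 'works' by definition: every item costs 7."""

    def price_with_tax(self, sku: str) -> int:
        return 7


class FixedC:
    """Test-only C: every base price is 200."""

    def base_price(self, sku: str) -> int:
        return 200


def test_a_assuming_b_works():
    assert A(FixedB()).total(["x", "y", "z"]) == 21


def test_b_assuming_c_works():
    assert B(FixedC()).price_with_tax("anything") == 220
```

Neither test touches the real `C`, so both stay small and fast. Whether the stand-ins are hand-rolled like this or produced some other way is a judgment call the sketch doesn't try to settle.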

Tests & Testability Help Steer Design

Last but by no means least, we have the steering premise. The steering premise says that when we steer the development of our project all the way through, tests and testability are first-class participants in that process. So when we build new software, there are lots of different factors that we’re constantly balancing as we build it. On the one hand, you have things like the market and our expectations about where the market is going to go. And on the other hand, you have things about the actual technical capabilities of the particular platform that we’re running on.

In the middle we have things like the capabilities of our geeks, things like the capability of our interpreters of the market to actually express those needs to us effectively, and so on and so forth. A lot of factors are in play when we build new software.

The steering premise says we use tests and testability as one of those factors all the way through development, just as we do the other factors. In other words, we are constantly considering questions of how am I going to test this, and how have I tested it so far? All the way through, from the very first line of code we write to the very last line of code that we write. And that is the steering premise.

Why Underplayed?

So we have these five premises, right? The money premise, the judgment premise, the correlation premise, the chaining premise, and the steering premise. Why did I call them underplayed premises at the beginning? Well, there’s a reason for that. It’s because when you’re outside TDD, those premises are arguable, debatable, wranglable at length. And in fact, we all can do it.

But inside TDD, they’re almost invisible to us. They’re the air we breathe. So when you go out there on the internet and you start studying TDD, you know, you’re studying people who have already stood inside those five premises. As a result, they hardly see them anymore. That means they don’t pay a lot of attention to explaining them.

So when you get out there and you start looking into TDD, yes, by all means, pay attention to the techniques used by the people who are inside the ring of those premises. But remember, the premises are really what are binding us into this test-driven approach in the first place.

So now you’ve heard the five underplayed premises of test driven development. And I hope as you go out there on the net, you’ll bear them in mind.

I’m GeePaw, and I’m done. Thanks.

Underplayed: The Steering Premise In Depth

time, finally, for the steering premise, from the five underplayed TDD premises.

the steering premise says “tests & testability help steer design & development”. what we’re saying here is that tests are first-class citizens in the mob of factors that shape our system, with a voice that counts, all the way through development.

think of the factors we take into account when we make software. they range all over, from market considerations, to platform, from our geek skillset to our tech stack. we operate within this mob of factors. we decide what to write, when to write, how to write. and the factors contribute their voices all along. the steering premise says that tests & testability are as central and important as each of the others.

it might be rhetorical to class this with the underplayed. after all, the movement is test-DRIVEN-development. a thousand heartaches ago, when TDD first hit these mean streets, the most shocking thing about it was that we wrote a test *before* we wrote the code to pass it. it was often called tFd. (no, not *that* F: “first”. “test-FIRST-development”. get yer head outta the gutter.)

writing a test before you change the code to pass that test, this is the steering premise at its most micro most rigorous level. and when geeks like me tried to adhere to this most rigorous advice, we discovered several remarkable things.

you can’t write tests before code very many times without discovering that some tests are dead easy to write, and some tests are hard as hell. and you think it’s cuz you’re a noob and you’re full of vigor, so you keep at it. and you eventually make two discoveries or give up TDD.

  • discovery #1: hard testing problems are patterned. that is, they resemble each other in detectable and describable ways. “the database needs loading.” “the threads have to context-switch.” “the pixels can’t be interrogated.” “the answer is stochastic.”
  • discovery #2: hard testing problems can very often be turned into easy testing problems by rearranging where code lives and how it’s called.

and once we’ve made these discoveries, the steering premise goes from its micro scope to the macro. why? well. we’re not *stupid*. if we can rearrange things to make hard tests easy tests, why not arrange things more nearly towards easy in the first place?

a real, if stupid, example. spoze we get a filename, and we have to get a hundred comma-delimited values out of that file. by way of pseudo-code, i’ll show you the paragraphing comments the noob types in.
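a sketch of what those comments might look like, with python filled in underneath them. this is an assumption on every front: the function name `values_from_file` is hypothetical, and the wording of the three comments is a guess chosen only to match the shape described here, two lines that go from filename to stream and a third that parses.

```python
from typing import List


def values_from_file(filename: str) -> List[str]:
    # open the file
    handle = open(filename, encoding="utf-8")
    # treat it as a stream of text
    with handle as stream:
        # parse the comma-delimited values out of the stream
        return [field.strip() for field in stream.read().split(",")]
```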

now if we put all this in one method, then to test that method we’re going to have to litter the arena with sample files. (normally, we stash all these in some /test/ hierarchy.) the parse might need a dozen examples to satisfy us that it works, so that’s a dozen files. so we make all the files, and we write one, then we copy/paste/edit the others. and as the code develops we do more and more of this, and we cook up an ingenious naming scheme for them.

and because you can’t “de-duplicate” files, if we ever change the comma delimiting to tab delimiting — see “why developers hate everyone” — we go back and manually edit it all. the point is, this is *hard*. it’s not a cheap set of tests to scan/read/write/run/change/diagnose.

a slight rearrangement, the slightest, will greatly ease our pain. put the first two lines, whose effect is to go from filename to stream, in one method. put the third line in another method.

testing the first method is still tedious, cuz we have to have some files to test it on. but a) there are fewer cases, having to do with file-nature, not file-content. b) in real-life we’d prolly elide the test altogether, as it’s a sequential series of pure library calls.

testing the second method can now be done by passing in streams. streams are far easier to make and organize and edit and de-duplicate than files, because they can be made with one line of code and a string constant. (as i say, this case is dumb, the answer is easy, the rearrangement trivial. but this is real. i’ve been in shops all around the world that use files in tests when they need not do so.)
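here’s a minimal sketch of that rearrangement in python. the names (`open_stream`, `parse_values`, `values_from_file`) are hypothetical, and the parse is deliberately naive.

```python
import io
from typing import List, TextIO


def open_stream(filename: str) -> TextIO:
    # the awkward part: filename -> stream. pure library calls.
    return open(filename, encoding="utf-8")


def parse_values(stream: TextIO) -> List[str]:
    # the graceful part: stream -> values. no file system in sight.
    return [field.strip() for field in stream.read().split(",")]


def values_from_file(filename: str) -> List[str]:
    # the original entry point, now just the two parts glued together
    with open_stream(filename) as stream:
        return parse_values(stream)


def test_parse_values():
    # each case is one line of code and a string constant, no files
    assert parse_values(io.StringIO("1, 2,3")) == ["1", "2", "3"]
    assert parse_values(io.StringIO("solo")) == ["solo"]
```

if the delimiter ever changes, the edits happen in string constants inside the test file, where ordinary de-duplication works, instead of in a directory full of sample files.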

do you see what we did? we changed our *design* to make it *easier* to *test*. we steered.

“i’ll be damned,” you say, “so we did. we changed our design to make it easier to test. damnedest thing i ever saw.” well. you’re kinda impressionable. but the truth is, it *is* the damnedest thing, for a slew of reasons.

there are lots of these cases, where we can rearrange things to make testing cheap. they form the set of patterns that experienced TDD’ers carry around with them. some of the rearrangements are easy to learn, others harder. the answers don’t usually just drop in effortlessly, but require custom fit, so they’re true patterns.

they do nearly always fit a meta-pattern: “eliminate or mitigate awkward collaborations”. that’s a muse for another day.

our case here is low-level code. to take the steering premise to its full extent, tho, we need to grow both “low-level” and “code” in our vision. “low-level” suggests that this kind of thing only applies to, idunno, casting about for a phrase, “leaf nodes”: close to the bottom code chunks, as opposed to “big picture abstractions”. this would be mistaking one case for the range of cases.

a lot of folks layer their thinking — i don’t, and am generally opposed to too much of this, but i know what they mean — they think there’s “code”, “design”, “architecture”, in ascending levels of abstraction.

the more complete version of the steering premise aims at the *entire* pyramid of abstractions. it’s not just about the breakout of functions, it’s about functions, objects, layers, subsystems, programs, systems, and apps. that is, refactoring — that’s what we did in that simple code case — can be done at any level of the abstraction hierarchy. in fact, it’s often easier to make tests cheap well above the level of leaf nodes.

we grow our vision of steering vertically by saying that we steer *all* the levels, from highest to lowest, by taking tests & testability seriously as factors.

and what about “code”? does the steering premise only apply when we’re looking at structured UTF-8 text that’s to be executed by a computer? no. we can incorporate tests & testability as factors not just in the code, but all through the process of shipping more value faster. making tests & testability first-class factors lets us move out from code-per-se to coding-per-se, and from there to shipping-per-se.

consider that TDD *enables* continuous integration (CI) and that in turn *enables* continuous deployment (CD), and that has huge impact on our customers and our market. i think of this as expanding steering horizontally. the steering premise reaches far beyond the scope of “arranging structured text to be executed by von neumann architecture devices”. it doesn’t just change how we see code, it changes how we see the whole activity-set.

the steering premise says we treat tests & testability as first-class citizens throughout the entire range of the software development game. it is at the very heart of TDD, and everything TDD’ers do depends on and draws from it.

Underplayed: The Chain Premise In Depth

today, let’s talk a little about the chaining premise, from five underplayed tdd premises.

the chaining premise says “test a chain by testing its links”. like the other premises, it’s easy to make it pithy, but it has vast ramifications when we’re doing TDD.

when we talked about the money premise, i gave a long, likely partial, list of ways TDD supports that premise. did you notice i never mentioned the customer? TDD is for developers. the people it helps the most are the geeks who do it. (oh, don’t worry. TDD works for developers, but remember, developers work for managers, and managers work for customers. it all works out in the end, i promise.)

for the TDD tests to make us ship more value faster, they have to be “cheap”. i’ll define that vague word using some other vague words. “cheap” means “easy-to”. TDD tests want to be easy to … scan, read, write, change, run, and diagnose. that means — remember the thinking knee — they have to be small.

consider an app. it consists of, in decreasing order of size: systems, subsystems, programs, layers, packages, objects, and functions. all arranged in an intricate directed dependency graph. A calls B calls C calls D calls, well, it’s dependencies all the way down. some apps are *huge*. most apps are just “bigger than a breadbox”. a small number of apps are small.

how are we going to write small tests against large apps? this is where the chaining premise steps in.

the chaining premise says 1) we can test by testing only *parts* of the app at one time, and 2) the natural parts are the arrows in that A->B->C->D dependency chain, and 3) the cheapest tests work pair-by-pair along the chain. a possible confusion: it’s not *really* a chain. dependency graphs are directed graphs, and any given part might depend on more than one other part. normally, A will depend on B *and* C, each of which has its own dependencies.

call a “unit under test” one of these letters, A. say that A depends on, uses, imports, B, C, and D. we’ll say that A “collaborates” with B, C, and D, and that they are A’s “collaborators”. to test that A does what the geek thinks it does — the heart of TDD — we hook A up to its collaborators, give it some commands, and poke around a little to see if A did the thing we thought it did.

okay, but wait. isn’t it true that if i hook A up to a collaborator, i’m hooking it up to all of that collaborator’s dependencies, and so on and so on all the way down?

fine. we’ll trick the A. we’ll give it a thing that it thinks is its collaborator, but is really just a *simulator*!! MY GOD, WE’RE GONNA BE RICH!!! except. wait. if you can write a simulator for a collaborator that does everything that collaborator does in every possible circumstance, aren’t you going to wind up with just as many dependencies?

are we stumped? is the chain premise now dead? sure feels like a lot of damned thinking for nothing. ya know, mama dint raise no thinkers.

so ya got these tests and you need — oh. that’s it, that’s the problem, we’re thinking of all these tests. but we only actually write, or for that matter scan/read/run/change/diagnose *one* test at a time. we don’t need a big heavy simulator, we need a bunch of really stupid fakes. really stupid fakes are much easier than simulators. if we can find ways to do this quickly and easily, we’re back on track.

so this chain premise has us testing A by testing one path through one function at a time, supplying really stupid fakes instead of real collaborators.

a contrived but demonstrative example. in order to decide what to do with a notification, say that A needs to know whether it’s business hours or not. it uses an Hours for this. it says “if(hours.isCurrentlyOpen()) … else …” we write one test for the if, and another one for the else. we don’t use a real Hours for either test. for the first one, we have an Hours that always says it’s currently open. for the other, the opposite.

notice that this trick *breaks* the dependency tree. A still depends on Hours, but the Hours it depends on doesn’t depend on anything else. this means i am testing just one piece of the chain, the A. i’m not testing its collaborators, i’m not testing the app, i’m testing A.
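here’s a minimal sketch of that trick, in python. only `Hours` and its is-it-currently-open question come from the example above. everything else (`Notifier`, `handle`, `AlwaysOpen`, `AlwaysClosed`, and the "deliver"/"hold" results) is made up for illustration.

```python
class AlwaysOpen:
    """really stupid fake Hours: it is always business hours."""

    def is_currently_open(self):
        return True


class AlwaysClosed:
    """really stupid fake Hours: it is never business hours."""

    def is_currently_open(self):
        return False


class Notifier:
    """hypothetical stand-in for A: decides what to do with a notification."""

    def __init__(self, hours):
        self.hours = hours  # the collaborator, real or fake

    def handle(self, notification):
        if self.hours.is_currently_open():
            return ("deliver", notification)
        return ("hold", notification)


def test_delivers_during_business_hours():
    # one test for the "if" branch
    assert Notifier(AlwaysOpen()).handle("ping") == ("deliver", "ping")


def test_holds_outside_business_hours():
    # and another one for the "else" branch
    assert Notifier(AlwaysClosed()).handle("ping") == ("hold", "ping")
```

neither fake knows anything about clocks, calendars, or time zones, so neither test drags in whatever a real Hours would depend on.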

and what, after all, are we testing here? we are testing that A works the way we thought it did ASSUMING that its collaborators work the way we thought they did. a person might wonder how much that’s worth. how much is it worth to know that a given piece works assuming the pieces it talks to work? the answer is that it’s worth *way* more than the cost of doing it.

or, anyway, that’s what the chain premise says. “test a chain by testing each link”. if we can cheaply know for every A, B, C, … that it works if what it depends on works, i have gone a *fabulous* distance towards knowing that the whole thing works.

time for some provisos and caveats.

  1. we get to choose when and where we break the dependency chain for testing. there are lots of places where it isn’t worth doing. one never writes a fake String or Integer, for instance. using real ones works just fine.
  2. there are lots of ways to do faking, of various weights and sophistications. you will find mucho de argumento about which one represents the One True Way[tm]. know this, tho, any turing-complete environment can do this.
  3. practice practice practice. doing this, seeing this, arranging things so it works, this is not an instantly learned skill. i’d hazard most of your TDD studs spent 2-5 *years* getting good at it. we can speed you along a little, but it’s never going to be instant.
  4. MOST IMPORTANTLY: this really only works *cheaply* if we build it that way. there are lots of dependency trees that would make it very costly. this is the steering premise, which we’ll do in depth real soon now.

the chain premise says “test a chain by testing each link”. we test a link by pulling it *out* of the chain, testing it in various degrees of isolation. the knack and difficulty of TDD is in how and when we do this.

Underplayed: The Judgment Premise In Depth

the judgment premise is one of five underplayed tdd premises.

the judgment premise is simple to word and vast in its extent. it says, “tdd relies absolutely on individual humans using their human judgment.”

you might ask yourself, “what *doesn’t* rely on human judgment?” but there are lots and lots of activities that are entirely mechanical, judgment-less, and geekery is full of them. we *work* with judgment-less systems every day. we call them “computers”. there’s merit to “algorithmizing”, that is, making computable sequences that can replace ones that require human intervention, *much* merit. but sometimes our zeal for this activity gets the better of us.

the modern synthesis in coding, what i normally call TDD, but meaning TDD in the broadest sense, has at its core four activities: writing tests, making them pass, change-optimizing the code, and pushing the result to head. (i shorthand them to red, green, gold, push, and i trust you’ll know what i mean going forward.) in its simplest expression, we can see these as a simple cyclical sequence: red, then green, then gold, then push, then start over. we can elaborate on this — i’ve done so and many others have, too — to create a fancy flow-chart thing from it.

the judgment premise says that every one of the bubbles on this pseudo-algorithm is fraught with the requirement that humans use non-computable means to control it: to refine it, to transition around it, to actually *do* TDD.

the originators knew this, of course, and the serious practitioners know it, too. that’s why i call this a premise: it is baked-in all the way through. it’s an *underplayed* premise, in my view. we don’t make a big enough deal about it. that’s because when you’re inside TDD, your confidence in the premises is complete. they’re the air we breathe, and like that air, largely invisible but essential to our success.

the judgment premise, like the rest, is an effort to make this air visible to us. let’s look at those four core activities, with an eye towards spotting the non-computable human judgments they involve.

getting to red involves numerous judgments. first, of course, there’s the simple question of whether or not the code i’m about to put in calls for a test at all. i don’t test a function called ‘getX()’ that returns the value of a field called ‘x’. i don’t know of anyone who does. my judgment is that such a test wouldn’t pay for itself. note: that’s not to say i never fuck up the implementation of ‘getX()’. it’s to say that when i do that, occasionally, i discover it so easily, usually the very first run, that writing a test for it would be a waste of time. (the money premise rears its mammonist head.)

the “no test for getX()” judgment isn’t a no-brainer, cuz no-brainers aren’t judgments, but it’s a tiny-brainer. as we proceed at getting to red, tho, some of these similar “no test for” decisions become quite a bit more brainy. and it isn’t just whether or not to test, it’s very often *what* to test. recently, i was laying out some UI in javafx, whose layout strategy is fuzzy AND buggy. i did most of my basic testing using a skeletal disconnected “ui-only” app with a GAK (geek at keyboard) approach.

i came to understand that if i wanted perfection, i needed to do the layout math “manually”. that is, give me the width & height of client area, and let me compute and assert the layout from that. and *that* computation, i tested with a bunch of automated cases. there are still no tests for the actual rendering: my judgment has been validated visually.
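a sketch of what that split can look like, in python rather than javafx. the layout rule here (a fixed-width sidebar, a gap, and content filling the rest) and every name in it are hypothetical, chosen only to show the shape: a pure function from client width & height to rectangles, with cheap automated cases against it.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float
    y: float
    width: float
    height: float


def layout(client_width, client_height, sidebar_width=200.0, gap=10.0):
    """hypothetical rule: fixed-width sidebar on the left, content fills the rest."""
    sidebar = Rect(0.0, 0.0, min(sidebar_width, client_width), client_height)
    content_x = sidebar.width + gap
    content_width = max(client_width - content_x, 0.0)  # never negative
    content = Rect(content_x, 0.0, content_width, client_height)
    return sidebar, content


def test_roomy_window():
    sidebar, content = layout(800, 600)
    assert sidebar == Rect(0, 0, 200, 600)
    assert content == Rect(210, 0, 590, 600)


def test_window_narrower_than_sidebar():
    sidebar, content = layout(150, 600)
    assert sidebar.width == 150
    assert content.width == 0
```

the math gets the automated cases. whether the pixels actually land where the math says is still a job for eyeballs.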

a third judgment in getting to red: is this the right time to write that test? an extremely common move in the TDD game: sketch out a test and defer it. sometimes i write the test and mark it ignored, sometimes i jot a note, sometimes i count on my leaky head to remember.

there are others i’m not thinking of just now, i feel sure. but let’s move on to green. once i have a failed test i like, i get it to pass. here, the use of judgment seems so gigantic and obvious it hardly needs enumeration.

which methods change? which objects get made? how do they interact? do i need more data, less? do i replace/extend this existing method, or do i need a whole new thing? all of these are judgment calls. they’re based on heuristics, on intuition, on sketches, on conversations, on habits and even mood. they are not computable.

when that test and all the other tests are green, we then home in on the design of the code. refactoring, getting to gold, is the business of optimizing our code for changeability. as with getting to green, i feel sure i don’t need to point out the level of delicate decision-making that goes on during the gold step. the fact is, change-optimizing code engages my judgment more fully than any other work i do.

that maximized-judgment thing, btw, is why your old master geek coaches seem to enjoy refactoring more than any other part of the work. it feels so *good* to use all of oneself in a single activity, logic, experience, intuition, anticipation. if companies were bright enough to pay us just to refactor code, i and many of my geek-coaching confreres would do nothing but that. it feels so much like the maximum expression of our long lives in geekery.

pushing code, the fourth big blob on our pseudo-flowchart, is sometimes elided from the TDD skeleton, but it’s tremendously important. much of the benefit of TDD in teams is captured because of continuous integration.

sometimes after green i push, sometimes i don’t. i decide on the spot, a judgment. further, sometimes i’m midstream in getting to red or green or gold, and i “anti-push”. i revert. that’s a decision i usually make through some abstruse mixture of elapsed time & anxiety. (aside: we need to teach more folks about the huge value of restarting. i revert work-in-progress all the time, instead of always and only laboring on. i learned this only because i was practicing CI in very small cycles.)

the judgment premise is about the humanness of the software development enterprise, even that part of it, TDD, that is closest to the metal.

i’m not hating here on the many pseudo-algorithms we make. as i say, i’ve done it myself. i understand the impulse, and they can be very helpful (briefly) to folks at the beginning of the climb. but pseudo-flowcharts are training wheels. riding a bike with training wheels, however important as a short phase, isn’t being a cyclist. if you’re going to engage with the modern coding synthesis, with TDD, refactoring, CI, CD, and so on, you’re going to be constantly using the non-mechanical parts of you.

the judgment premise: in TDD we are absolutely, continuously, ineluctably, and *happily* entirely reliant on individual humans using their best individual judgment.

Me And Programming Go Way Back

i became a professional programmer when i was 20, not-quite 38 years ago.

bob martin’s back-of-the-envelope estimate of the doubling rate for programmers is that it’s been about 5 years for at least 3 decades. that means i have more time in this trade than more than 99% of the other programmers in the world today.
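the arithmetic behind that 99%, as a rough sketch. it leans on a deliberately crude model that isn’t part of the estimate itself: the headcount doubles every 5 years, and nobody ever leaves the trade. the function name is made up.

```python
def fraction_with_at_least(years_in_trade, doubling_years=5.0):
    """share of today's programmers who started at least `years_in_trade` years ago,
    assuming headcount doubles every `doubling_years` and nobody ever leaves."""
    return 0.5 ** (years_in_trade / doubling_years)


share = fraction_with_at_least(38)  # 38 years is about 7.6 doublings
print(f"{share:.2%} have 38+ years in; {1 - share:.2%} have fewer")
```

under those assumptions the share comes out to roughly half a percent. real attrition would make it smaller still.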

what does that mean about me? idunno, really. a bunch of things.

  • it means i’m a bitter old man, of course. even if we put a pleasant face on it, surely the least one could say is that i am skeptical by default stance.
  • it means i’ve failed more often to ship on time and under budget than any 20 random geeks you know. also succeeded more than them.
  • it means i’ve climbed one helluva lotta mount stupids. and by induction, that i’m fighting my way up one now.
  • it means i’ve written just about every kind of software there is to write, tho of course the spectral analysis would show lots of imbalances.
  • it means i read almost no books on “mere coding” these days. not that i *didn’t*, of course. i’ve read i’m sure hundreds of them, and kept some of them in the bathroom for years at a time.

but mostly, i think, what it means is that i have no patience for over-simplification of what programmers do or should do.

programmers are translators. we translate from the sense made by human language into the sense made by computer language.

our work is fundamentally sociotechnical, inconceivable without the strange fractal border between wildly complex human interaction and rigorously simple mathematical formalism. it requires at different times tremendous sensitivity and crude indifference, patient persistence and an openness to lightning, a taste for the solo and the collaborative, tremendous balanced love for lofty abstraction and gritty detail.

programming for a living is infinitely delightful and exasperating, and i have spent nigh on forty years living the life of the mind in the lap of luxury.

on my good days, i recommend it.

Underplayed: The Correlation Premise In Depth

five underplayed premises of TDD includes the correlation premise.

the correlation premise says “internal quality and productivity are directly correlated”. confusions and misunderstandings around this premise abound furiously, so it’s worth taking some time and working it out in detail.

when we say internal quality (IQ) and productivity are directly correlated, we mean that they go up together and they, sadly, go down together. their trend lines are inextricably linked. the first thing we have to do to parse this is make sense of internal (IQ) vs external (EQ) qualities, because a lot of the confusion starts right there.

external quality (EQ) includes any attribute of a program that a user can experience. is it correct? EQ. is it useful? EQ. is it fast, pretty, graceful, stable, complete? all of this is EQ.

on the other hand, internal quality (IQ) includes any attribute of a program that only a coder can experience. IQ is visible only in the source, and implicitly only by someone who can grasp and manipulate that source. is it minimal? IQ. is it tested? IQ. is it well-named, well-factored or jointed, easy to read? these are all IQ.

the correlation premise says that you can’t trade away IQ to get higher productivity. and there’s the first source of confusion: because you *can* trade EQ for higher productivity.

this is obviously the case: under nearly all circumstances, it takes more time to make a program faster, or prettier, or to include unusual corner cases, or to be exact instead of approximate. if you don’t care about rare corner cases, that’s less thinking i have to do and less code i have to write. that’s less *time*, and i can spend that time usefully on other, ideally more important value. so EQ trades pretty easily for productivity.

IQ doesn’t work that way. the reason is that software development productivity is at its base exactly about *humans* *changing* *code* *correctly*.

if we sat down and wrote what economists call the production function for software development, we’d get some massive complex polynomial. plug in the variables, give that crank a turn, and get out of it how much production you get in a day or week or year. the three largest terms of that polynomial are these: the skill of the code-changer, the fundamental complexity of the domain, and the “changeability” of the code that has to be changed.

the code-changer’s skill can be improved, but it requires experience, and the massive demand for software means that every year there are fewer and fewer experienced code-changers available proportionally to serve that demand.

the fundamental domain complexity can also change, usually in the form of a dramatic paradigm shift. sadly, this is both wildly unpredictable and almost entirely outside of our ready control.

what about the changeability of the starting code? now, my friends, we are cooking with gas. because there’s another name for “changeability of the starting code”. it’s called “internal quality”. all of the IQ things are all of the design principles from the last five decades of geekery are all of the TDD/refactoring things, and every single one of them is about making life easier for *humans* *changing* *code* *correctly*.

internal quality can’t be traded for productivity because it’s the most malleable of the top three terms in the software development production function.

there are two more confusions that hamper the newcomer’s thinking about all this, and our trade is swamped with newcomers, so we need to call them out.

the first is the conflation of internal quality with the word “clean”, its variants and cognates, and the fifth column of overtones and metaphors that come with it. i strongly oppose this usage, so much so that when someone speaks it i am often almost completely blocked from carrying the discourse further until it’s resolved. when i make function-preserving alterations to code, when i refactor, in other words, i am directly and simply maintaining and increasing the internal quality term of my development production function.

when i do it pre-emptively and habitually, i’m doing it from the mature recognition that there are patterns of high internal quality, and that following those patterns makes me more productive.

when i do it immediately prior to touching some code to implement a new story, i am doing it because implementing that story will be easier — faster — in a high-IQ codebase, and refactoring is easier — faster — than interpolating new function in low-IQ code.

i am not “cleaning”, because that word is deep-laden with overtones and metaphors that don’t reflect what, how, or why i am refactoring. most notably, it comes with ideas of optionality and morality, neither of which are present in my idea of refactoring. i won’t belabor the moral thing now, except to point out to non-native speakers of english that whole generations of us were raised on the adage “cleanliness is next to godliness”, and for those outside of the judeo-christian tradition, sloth is one of the seven deadly sins.

as for optionality, it’s really a form of delayability (“delay forever”), and that’s the third big confusion around the correlation premise. here we go. the argument is made, from the “clean” stance but also from entirely separate impulses, that we do have to refactor for high internal quality, but we don’t have to do it *now*. assuming a perfect golden codebase, almost any new value i add will make it less golden. (this is a complicated thing, and i don’t want to explain it now, so i’m asking you to trust me on this.)

the “eventually” argument admits that sooner or later we must “re-golden” the base, but that we can delay doing so, because the cost of re-goldening, refactoring, need not be borne until we have to add more value to the code we just added. this is an argument about how soon low-IQ kicks in during the production function. if it isn’t immediate, we can stall, right? and there’s the confusion. you see, it *is* immediate. because the lowering of IQ affects the production function so quickly, stalling just isn’t a viable strategy.

you doubt me. that’s okay, join the club. (no, *back* of the line, please. we’ll announce your number when it’s your turn.) let’s take an easy example.

how soon does a weak name for a variable, method, or class start affecting your production function? refactorers are obsessed with names and naming.

well here’s the thing, in programming, the only reason to introduce a variable, method, or class, and to name it, is so you can begin using it. whatever that cost is, you start paying it the very second time you enter it. and remember the thinking knee, and the clever workaround called “chunking”? names and their attendant metaphors *dramatically* affect our ability to chunk.

if you’re old-school enough to have ever had to change computer-generated code, like yacc output for instance, you’ll know something else: it doesn’t take very many weak names to render code virtually un-think-aboutable. remember that internal quality is all about supporting humans changing code correctly? anything un-think-aboutable doesn’t just slow down the changing of code. it stops it dead.
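a tiny illustration, with hypothetical names throughout. the two python functions below compute exactly the same thing. the only difference is what they let you chunk.

```python
def calc(a, b, c):
    # weak names: the reader has to rebuild the meaning on every visit
    d = a * b
    return d - d * c


def discounted_total(unit_price, quantity, discount_rate):
    # same arithmetic, but each name carries its own chunk of meaning
    subtotal = unit_price * quantity
    return subtotal - subtotal * discount_rate
```

the computer can’t tell these apart. the human who has to change one of them next month certainly can.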

so there ya go. the correlation premise says “internal quality and productivity are directly correlated.” you can’t trade one to get more of the other. they go up together, and, sadly, they go down together.

How TDD Ships More Value Faster: The Money Premise In Depth

we talked about five underplayed tdd premises before; here’s a video & transcript. over the next couple of weeks, i’d like to take a little time and go over each of them in more depth. today, let’s start with the money premise.

the money premise says: “we’re in this for the money.” TDD is fundamentally about making money. in software, we make money by Shipping More Value Faster (SMVF).

i’ve been doing, teaching, writing, arguing, and coaching TDD for almost 20 years. in that time i’ve heard one helluva lot of reasons why TDD could never get us to SMVF. i won’t enumerate them all, but instead want to pick on just one, cuz it’s a great intro to the money premise.

“on a great day, i might write 1000 lines of production code. TDD suggests that, instead, i write ~500 lines of production code, and ~500 lines of test code. TDD won’t make SMVF cuz i won’t write as much V in a day. QED.” (yes, yes, never mind that LOC is not a great measure of value. it’ll do for the moment as a temporary stand-in. let’s just take this argument as legitimate in spirit.)

this argument is based on the notion that the thing that takes up one’s great day is *writing* code: that the limit of 1000 lines, in other words, is a limit because that’s all the code-entering one could do in a day. i would contend that this is not so. we say, “typing is not the bottleneck.” what that means is that the hard part isn’t writing code, it’s knowing what code to write. i am not stuck at 1000 lines in a (great) day because i can only *enter* that many lines, but because i can only *think* of that many lines to enter.

the money premise says we’re in this for the money, for shipping more value faster, and it’s explicitly claiming that TDD helps me *think* of lines to enter, and it helps me do that dramatically, more than enough to make up for its cost.

so how does TDD fulfill this promise? let me count the ways…

1) TDD narrows mental scope, focusing our thoughts laser-like on very small problems in very limited contexts. each test we roll is quite small, micro, even. its solution is just one enhancement — one little value-add — and we’d expect it to be solved in a matter of minutes. it normally involves far fewer than half-a-dozen mental entities, well within the thinking knee. the tests are written interactively, a little test, a little code, a little test, a little code. tho the final scope may be quite substantial, the local context is BY INTENT AND DESIGN a very small context.

2) TDD tests explicitly capture the intent of the code they exercise and express it in an alternative form, cushioning and supporting exactly the hardest part of coding: understanding the why of a chunk of code. in multi-coder environments, this is an obvious benefit, but anyone who’s been coding for longer than a year will certainly recognize the experience of not knowing why even your *own* code was written the way it is. TDD tests are a kind of analogue to comments, except that unlike comments, they are far less likely to mislead (or lie), because they are *executable* comments.

3) TDD tests provide highly focused debug sessions, greatly reducing one of the most time-consuming tasks coders undertake in their daily lives. when we see a bug in TDD, we write tests, often large-scale, evolving steadily smaller as we investigate, focusing and refining our attention and our tests as we go, until our list of possible culprits is narrow enough to truly find the flaw. we are typically working with *twin* tests, one at larger scale, one at smaller. we’re looking for both tests to be consistently red, proving they’re testing the same thing. the tests provide us with ready automated repeatability, and that makes our debugs far less painful.

4) the interactivity of TDD testing builds profluence, steadily building confidence & authority as we proceed. when HBR did its massive diary-study of worker motivation, a key (new to many) insight was that workers like very much to feel *progress* as they work. the steady accretion of microtests during the work day provides a concrete (executable) progress meter. profluence, “forward-flowing”, is a major force in our ability to function at our highest levels of performance, and TDD provides a form of it.

5) that same accretion of tests adds rhythm to the coding process, and the steady pulse of tension & release steadies and soothes the key actor in the act of coding: the human mind. writing a test, what TDD’ers call “getting to red”, creates a tension in the writer. passing that test (“getting to green”), releases that tension in a powerful rewarding way. refactoring the result, or “getting to gold”, extends the release & builds juice to start the next beat. the steady beat of TDD greatly enhances our ability to code all day.

6) TDD flags value-toggling, where adding value B breaks value A, and restoring value A breaks value B, a common situation once programs get large enough to not fit in the head at one time. the TDD tests persist. what we tested yesterday is still being tested today. it can’t *quietly* fail. TDD doesn’t *solve* such problems, but often enough the problem isn’t solving them, it’s noticing them in the first place. TDD does the noticing for you.

7) TDD quickly reveals the simple human-computer language mismatches, the cases where you meant X and the computer heard Y, the source of the overwhelming majority of shipping defects. you know the kind of mismatch i’m talking about: you meant to say “if(a!=b)” but you mumbled “if(a==b)”. computers have no damned sense, and they can only do exactly what you *said*, not what you meant. off-by-one, inverted logic, parameter misordering, unreached code, these form surely >80% of bugs. all of these are revealed much sooner when we have microtests exercising our code. that translates directly into far greater value being shipped.

8) TDD is a remarkable proxy for the abstract principles we call “good design”, because the attributes of good design are the very attributes required by the TDD approach. loose coupling, dependency inversion, smaller units, interface segregation, immutability, all of these ideas don’t just make designs better, they make testing *easier*. the natural outcome of efficient TDD is superior design.

return to the beginning. the money premise says that TDD is fundamentally *about* Shipping More Value Faster, which is how we make money out of software.
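to make item 7 concrete, here’s a sketch in python of the kind of microtest pair that catches an “if(a==b)” that should have been “if(a!=b)”. the function and its names are hypothetical.

```python
def needs_sync(local_version, remote_version):
    """hypothetical example: we only need to sync when the versions differ."""
    return local_version != remote_version  # mumble `==` here and both tests go red


def test_sync_needed_when_versions_differ():
    assert needs_sync(3, 4)


def test_no_sync_when_versions_match():
    assert not needs_sync(4, 4)
```

two tests, a few seconds to run, and the inverted condition can’t survive its first run, let alone ship.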

the hard part of coding isn’t entering the code, it’s knowing what to enter, knowing it works, knowing what impact it has, knowing why to enter it. TDD works money magic by directly easing these varied tasks of knowing. The Money Premise: We’re in TDD for the money.