TDD Pro-Tip: Wrap framework collection classes early and often. The Primitive Obsession smell is ubiquitous around collections, and not seeing and fixing it can greatly hamper productivity.
In HollywoodSimulator, a common act of every Person is to spend time manipulating their little black book — the collection of the contacts they have. The first-pass way to represent this would be to use a framework collection class. In Java/Kotlin, that’s List<Person>
.
List<Person>
is not a primitive, far from it, in Java it’s actually quite a baggy monster. (It gets worse, in JavaFx it’s ObservableList<Person>
, which is a bizarre half-list half-not thing.) But it’s common, like String, & we tend to see it as primitive, & reach for it quickly.
So Agent, a kind of Person, has a member field: private val littleBlackBook of type List<Person>
. This is the kind of thing I mean when I say "framework collection class".
A complicated algorithm, called sixDegreesOfKevinBacon(person:Person,...)
, is used to search not just my list, but the lists of various other Persons in my list. It’s parameterized w/the Person I’m tryna make contact with, and some other values that don’t bear thinking about.
What I am saying in this pro-tip, is that using List<Person>
directly, passing it around, manipulating it using Lists’s gigantic interface, is something we usually don’t want to do.
Instead, we want that field to be not a psuedo-primitive like List<>
but another homegrown class. Agent
would have a field: private val littleBlackBook: LittleBlackBook
.
Why? Why go to the trouble of defining LittleBlackBook, having it own its List internally, and delegating a variety of API’s directly to the List. Why not either use the List directly, or derive LittleBlackBook off of List?
First, because the interface to List<>
is insanely fat. It has 32 methods in java. (ObservableList<>
has more like 80.) This fatness has two consequences: it swamps signal with noise, and it reduces fake-ability. The signal is swamped not at the immediate level, but once we step back a little, to sixDegreesOfKevinBacon
and other complex algorithms. From a step or two away from the List, we quickly lose track of usages.
One gal writes code using list’s stream manipulators, map, flatmap, reduce, etc. Another gal uses old-school iteration. A third spices that up with some spliterators. We as refactorers will never see the duplication. Our code bloats and we lose control over it.
And the fake-ability issue? It might also be thought of as indirection-ability. All faking is indirection: instead of getting the X to do a thing, we get a FakeX to do it.
Every API you expose will eventually get used. The more of them there are, the more tightly bound you are. Imagine what happens when black books get too big for RAM and must be persisted? Blammo. You must implement that gigantic interface using a database and rocket science.
(This speaks to the hard-TDD position, btw: Any object you can’t readily fake you also can’t readily change implementation on. It’s SOLID’s ISP and DIP in action.)
Second, suppose you’re brilliant and you can see the duplication we mentioned just above. Where are you going to put it? You have three choices: in a base class of Agent (Person?), as an extra method on LittleBlackBook derived from List<E>
, or as a delegate BlackBookHelper.
The first two involve implementation inheritance (II). II has two problems: 1) because of the possibility of overriding, it means you always have to know a classes’s entire tree to know what it does. 2) you can’t test it by itself.
And the third? Well. If you do the third, you’re going to have to pass that helper the List anyway, either at constructor time or in every single call. Why not just put the List inside the helper in the first place, which is exactly what I’m proposing.
Recap: The scent is returning, exposing, or passing a framework collection class. The refactoring is to wrap that class and take perfect control over its API, its testability, its fake-ability, and its place in the class hierarchy.
This is not a "rule". There are in fact situations where the framework collection class works best just the way it is. TDD relies on the Judgment Premise, after all. This is a "leaning". Lean towards wrapping those classes early and often.
Have an oddly pleasant Wednesday!