Software Forestry 0x03: Overgrowth, Catalogued
Previously, we talked about What We Talk About When We Talk About Tech Debt, and that one of the things that makes that debt metaphor challenging is that it has expanded to encompass all manner of Overgrowth, not all of which fits that financial mental model.
From a Forestry perspective, not all the things that have been absorbed by “debt” are necessarily bad, and aren’t always avoidable. Taking a long-term, stewardship-focused view, there’s a bunch of stuff that’s more like emergent properties of a long-running project, as opposed to getting spendy with the credit card.
So, if not debt, what are we talking about when we talk about tech debt?
It’s easy to get over-excited about Lists of Things, but I got into computer science from the applied philosophy side, rather than the applied math side. I think there are maybe seven categories of “Overgrowth” that are different enough to make it useful to talk about them separately:
1. Actual Tech Debt.
Situations where an explicit decision to do something “not as good” to ship faster. There’s two broad subcategories here: using a hacky or unsustainable design to move faster, and cutting scope to hit a date.
In fairness, the original Martin Fowler post just talks about “cruft” in a broad sense, but generally speaking “Formal” (orthodox?) tech debt assumes a conscious choice to accept that debt.
This is the category where the debt analogy works the best. “I can’t buy this now with cash on hand, but I can take on more credit.” (Of course, this also includes “wait, what’s a variable rate?”)
In my experience, this is the least common species of Overgrowth, and the most straightforwardly self correcting. All development processes have some kind of “things to do next” list or backlog, regardless of the formal name. When making that decision to take on the debt, you put an item on the todo list to pay it off.
That list of cut features becomes the nucleus of the plan for the next major version, or update, or DLC. Sometimes, the schedule did you a favor, you realize it was a bad idea, and that cut feature debt gets written off instead of paid off.
The more internal or infrastructure–type items become those items you talk about with the phrase “we gotta do something about…”; the logging system, the metrics observability, that validation system, adding internationalization. Sometimes this isn’t a big formal effort, just a recognition that the next piece of work in that area is going to take a couple extra days to tidy up the mess we left last time.
Fundamentally, paying this off is a scheduling and planning problem, not a technical one. You had to have some kind of an idea about what the work was to make the decision to defer the work, so you can use that same understanding to find it a spot on the schedule.
That makes this the only category where you can actually pay it off. There’s a bounded amount of work you can plan around. If the work keeps getting deferred, or rescheduled, or kicked down the road, you need to stop and ask yourself if this is actually debt or a something asperational that went septic on you.
2. We made the right decision, but then things happened.
Sometimes you make the right decisions, don’t choose to take on any debt, and then things happen and the world imposes work on you anyway.
The classic example: Third party libraries move forward, the new version isn’t cleanly backwards compatible, and the version you’re using suddenly has a critical security flaw. This isn’t tech debt, you didn’t take out a loan! This is more like tech property taxes.
This is also a planning problem, but tricker, because it’s on someone else’s schedule. Unlike the tech debt above, this isn’t something you can pay down once. Those libraries or frameworks are going to keep getting updated, and you need to find a way to stay on top of them without making it a huge effort every time.
Of course, if they stop getting updated you don’t have an ongoing scheduling problem anymore, but you have the next category…
3. It seemed like a good idea at the time.
Sometimes you just guess wrong, and the rest of the world zigs instead of zags. You do your research, weigh the pros and cons, build what you think is the right thing, and then it’s suddenly a few years later and your CEO is asking why your best-in-class data rich web UI console won’t load on his new iPad, and you have to tell him it’s because it was written in Flash.
You can’t always guess right, and sometimes you’re left with something unsupported and with no future. This is very common; there’s a whole lot of systems out there that went all-in on XML-RPC, or RMI, or GWT, or Angular 1, or Delphi, or ColdFusion, or something else that looked like it was going to be the future right up until it wasn’t.
Personally, I find this to be the most irritating. Like Han Solo would say, it’s not your fault! This was all fine, and then someone you never met makes a strategic decision, and now you have to decide how or if you’re going to replace the discontinued tech. It’s really easy to get into a “if it ain’t broke don’t fix it” headspace, right up until you grind to a halt because you can’t hire anyone who knows how to add a new screen to the app anymore. This is when you start using phrases like “modernization effort”.
4. We did the best we could but there are better options now.
There’s a lot more stuff available than there used to be, and so sometimes you roll onto a new project and discover a home-brew ORM, or a hand-rolled messaging queue, or a strange pattern, and you stop and realize that oh wait, this was written before “the thing I would use” existed. (My favorite example of this is when you find a class full of static final constants in an old Java codebase and realize this was from before Java 5 added enums.)
A lot of the time, the custom, hand-rolled thing isn’t necessarily “worse” than some more recent library or framework, but you have to have some serious conversations about where to spend your time; if something isn’t your core business and has become a commodity, it’s probably not worth pouring more effort in to maintaining your custom version. Everyone wants to build the framework, but no one really wants to maintain the framework. Is our custom JSON serializer really still worth putting effort into?
Like the previous, it’s probably time to take a deep breath and talk about re-designing; but unlike the previous, the persion who designed the current version is probably still on the payroll. This usually isn’t a technical problem so much as it is a grief management one.
5. We solved a different problem.
Things change. You built the right thing at the time, but now we got new customers, shifted markets, increased scale, maybe the feds passed a law. The business requirements have changed. Yesterday, this was the right thing, and now it isn’t.
For example: Maybe you had a perfectly functional app to sell mp3 files to customers to download and play on their laptops, and now you have to retrofit that into a subscription-based music streaming platform for smartphones.
This is a good problem to have! But you still gotta find a way to re-landscape that forest.
6. Context Drift.
There’s a pithy line that Legacy Code is “code without tests,” but I think that’s only part of the problem. Legacy code is code without continuity of philosophy. Why was it built this way? There’s no one left who knows! A system gets built in a certain context, and as time passes that context changes, and the further away we get from the original context, the more overgrown and weedy the system appears to become. Tests—good tests—are one way to preserve context, but not the only way.
A whole lot of what’s called “cruft” is here, because It’s harder to read code than to write it.. A lot of that “cruft” is congealed knowledge. That weird custom string utility thats only used the one place? Sure, maybe someone didn’t understand the standard library, or maybe you don’t know about the weekend the client API started handing back malformed data and they wouldn’t fix it—and even worse, this still happens at unpredictable times.
This is both the easiest and least glamorous to treat, because the trick here is documentation. Don’t just document what the code does, document why the code does what it does, why it was built this way. A very small amount of effort while something is being planted goes a long way towards making sure the context is preserved. As Henry Jones Sr, says, you write it down so you don’t have to remember.
To put all that another way: Documentation debt is still tech debt.
7. Not debt, just an old mistake.
The one no one likes to talk about. For whatever reason, someone didn’t do A-quality work. This isn’t necessarily because they were incompetent or careless, sometimes shit happens, you know? This the flip side of the original Tech Debt category; it wasn’t on purpose, but sometimes people are in a hurry, or need to leave early, or just can’t think of anything better.
And so for whatever the reason, the doors aren’t straight, there’a bunch of unpainted plywood, those stairs aren’t up to code. Weeds everywhere. You gotta spend some time re-cutting your trails through the forest.
As we said at the start, each of those types of Overgrowth has their own root causes, but also needs a different kind of forest management. Next week, we start talking about techniques to keep the Overgrowth under control.