Feature Interference

Jul 19, 2016

While working on Etcher with Juanchi we came across a fantastic example of feature interference: two features that should be completely unrelated, somehow interfering with each other in a way that required extra work to make them coexist.

I find feature interference as a phenomenon particularly interesting, as it feels like a clue in solving the mystery that's called "what's wrong with software development?". It gets to the heart of a lot of misunderstandings between business and engineering, broadly speaking, as the complication completely blindsides the business side, creating aggravation, perhaps even suspicion.

Before we get deeper, let's dissect our little example to lay a better foundation of what we're talking about. Etcher has two features that in principle should not interact with each other:

Support for compressed images
Detecting if the drive is too small for the image

Each feature is fairly straightforward to build on its own, with no architectural changes needed. A project manager's dream where time goes in, feature comes out, users are happy, and everyone can go home and sleep well at night.

However, in combination, the two features clash. If we're opening a compressed file, then the size of that file does not reflect what we'll be writing to the drive. As such, we need a way to determine the uncompressed size of the image. Etcher, as it turns out, understands multiple compression formats. Some have a straightforward way of finding out the uncompressed size, while others (I'm looking at you, bz2) require dedicated work to yield their secrets. In fact, the naive approach takes something like 40 seconds and pegs a modern CPU at 100% to find that size. Unless, of course, we want to decompress the image to disk and just look at the size of the resulting file, but then we'd need to handle the case where the user doesn't have that much space available on the drive... you get the point.
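To make the contrast concrete, here is a minimal Python sketch of the two situations. It is not Etcher's actual code; the function names are made up for illustration. It relies on the fact that a gzip file ends with a 4-byte field holding the uncompressed length (modulo 2^32, so it is wrong for images of 4 GiB or more, and for multi-member files), while bz2 stores no such field, leaving us to stream the whole thing through the decompressor and count bytes.

```python
import bz2
import os
import struct


def gzip_size_from_trailer(path):
    """Cheap: read the ISIZE field from the last 4 bytes of a gzip file.

    ISIZE is the uncompressed length modulo 2**32, little-endian, so this
    is only trustworthy for single-member files smaller than 4 GiB.
    """
    with open(path, "rb") as f:
        f.seek(-4, os.SEEK_END)
        return struct.unpack("<I", f.read(4))[0]


def bz2_size_by_decompressing(path, chunk_size=1024 * 1024):
    """Expensive: bz2 has no size field, so decompress and count bytes.

    Nothing is written to disk, but the whole file goes through the
    decompressor, which is where the CPU time is spent.
    """
    total = 0
    with bz2.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

The first function does a single seek and a 4-byte read no matter how large the image is; the second does work proportional to the size of the uncompressed image.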

We've gone from having two straightforward features to implement, to making complex tradeoffs about whether to peg the user's CPU at 100% for a decent amount of time, or expect them to have several gigabytes free on their hard drive. No fun.

And this is just two features. A modern program has dozens of features, each potentially interacting with all the others, causing bugs, instability, slowness, maintenance difficulty, and so much more. Two features can form one pathological set, but 10 features can potentially form over a thousand pathological combinations, since any subset of two or more features might clash. We call that combinatorial explosion, and it's never a good thing. Feature interference is a core reason why projects slow down as they grow. It doesn't make intuitive sense, but the computer doesn't much care for our intuitive sense anyway.
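The arithmetic behind "over a thousand" is easy to check, assuming a "pathological set" means any group of two or more features that might interfere. A quick sketch (the function name is just for illustration):

```python
from math import comb


def interacting_subsets(n):
    """Count the subsets of n features that contain at least two features.

    Equivalent to 2**n - n - 1: all subsets, minus the n single-feature
    subsets, minus the empty one.
    """
    return sum(comb(n, k) for k in range(2, n + 1))


for n in (2, 10, 20):
    print(n, interacting_subsets(n))
```

For 2 features this gives 1, and for 10 features it gives 1,013. Even if we only worry about pairs, 10 features already give `comb(10, 2)`, that is 45 pairs to think about.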

How do we solve it then? I don't actually think anyone has a great answer at this point, though I'm happy to be proven wrong. Pointing out and naming the problem is certainly a good first step, and I'm almost certain I'm not the first one to do that. As a pragmatic approach to managing the problem, all I can think of is to "build a wall". Try to limit the problem to a smaller area, to prevent it from infecting the rest of the program.

In this particular case, I would create a small library whose job would be to find the uncompressed size of a compressed file. That library could support any number of compression formats (and be extended at will without impacting anyone), and return good answers to anyone who asks. While it makes little difference in our toy example whether the code to find the uncompressed size lives in the part of the code that opens a file or in the part that decides whether the file can fit on the drive, in the long run it pays to create good interfaces that can be used by any other part of the code to solve the same problem.
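As a rough idea of what such a wall could look like, here is a hedged Python sketch. Everything in it (`register_format`, `uncompressed_size`, `fits_on_drive`) is a hypothetical name invented for this post, not Etcher's API. Formats are detected by their magic bytes, and each one is handled the slow but universal way, by streaming through the standard library's decompressor; a smarter per-format shortcut could be swapped in later without callers noticing.

```python
import bz2
import gzip
import lzma
import os

# Magic-byte prefix -> function that opens the file as a decompressed stream.
_OPENERS = {}


def register_format(magic, opener):
    """Teach the library a new format without touching any caller."""
    _OPENERS[magic] = opener


register_format(b"\x1f\x8b", gzip.open)       # gzip
register_format(b"BZh", bz2.open)             # bzip2
register_format(b"\xfd7zXZ\x00", lzma.open)   # xz


def uncompressed_size(path, chunk_size=1024 * 1024):
    """Return the number of bytes the image occupies once decompressed."""
    with open(path, "rb") as f:
        header = f.read(8)
    for magic, opener in _OPENERS.items():
        if header.startswith(magic):
            total = 0
            with opener(path, "rb") as stream:
                while True:
                    chunk = stream.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
            return total
    # Unknown header: assume a raw, uncompressed image.
    return os.path.getsize(path)


def fits_on_drive(image_path, drive_capacity_bytes):
    """The drive-size check only ever talks to uncompressed_size()."""
    return uncompressed_size(image_path) <= drive_capacity_bytes
```

The point of the sketch is the shape rather than the details: the drive-size check never learns that compression exists, and supporting a new format is one `register_format` call.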

While we wait for AI to come and write our software for us, architecting things well is something we'll need to keep doing. This is the sort of topic I expect to be thinking about for a while, so I'll revisit it here if any new insights pop up.
