devaluing quality

I watched a Web software company ax its entire Quality Assurance (QA) department last week. It’s the sort of unhinged action taken by a department that is being controlled by its most problematic engineers. I still laugh in wonder every time I think about it, because it’s based on so many perfectly absurd assertions that it’s emblematic of the worst tendencies of our entire industry.

Let’s start at the beginning: When I was a senior engineer, I didn’t understand the value of QA. As an individual contributor, it’s very challenging to see the forest for the trees. Why can’t I just write a unit test to cover this condition? Why do I need to wait for this extra step, this delay, before I can ship my code and get my dopamine hit from shipping? Didn’t I already check it myself? Can’t I just A/B test it? Does that edge case even matter? And the sympathetic manager you complain to would love to reassign that “extra” payroll to an engineer who makes those Kanban cards close quicker, right?

This entire line of thinking is broken and toxic. It’s predicated on the perceived exceptionality of a few individual contributors, those mythical “10x engineers”, that must be appeased and enabled. Once you’ve bought into that idea, you’re already screwed. Software building is a team activity, and QA is an indispensable stakeholder on a well-functioning team.

Any fool can cobble together a half-baked prototype over the weekend that stumbles into production for a year and “works”. I’ve been that fool! It’s not hard, in retrospect. What requires a functioning team and engineering leadership is building a thing designed to last more than a fiscal year or three that quantifiably succeeds in meeting real user needs. Surviving engineer onboarding & offboarding, promotions, technology changes, strategy pivots and the addition of features few could predict at the outset — these require a system, not a savant. That’s hard.

When I was applying to my current role as a Web engineering leader, I was asked to do a panel presentation on how I would structure an engineering team and what my immediate priorities would be. In a nutshell, my answer was to establish processes & policies that build trust (both inside the team and cross-team), and to start that strategy by hiring a QA test automation engineer. Bugs waste time. Support requests from defects waste time. Unclear interfaces waste time. Unexpected results waste time. The later in the process you find a bug, the more time you waste. Efficient engineering teams don’t have time to waste.

In a well-functioning software team, QA forms the third leg of the stable tripod that leads to successful delivery. The first is the product manager/owner who understands the business need and how best to interact with the users. The second is the engineers who understand the technical limitations & requirements for its implementation. The third is the QA / test automation engineer who verifies the first two parties are speaking the same language and the intended value actually reaches the customer. Can you run a software team without QA (or without a product owner)? Sure you can! We call them startups, and they fail all the time because they’re running on luck and instinct. Let me know if you figure out how to scale either of those qualities.

Thinking an engineer can handle writing their own tests and checking their own work has an easy-to-spot fundamental flaw: They can only test what they understand. You have assumed a flawless transmission from original product intent into technical implementation and 100% insight into all potential problems. In other words, you have assumed you have a hero savant engineer instead of a fallible human doing their best. Once you decide your business relies on heroes, the heroes become a bottleneck and everyone else is used at a fraction of their capacity. You were so afraid of wasting effort on checking work (and understanding) that you instead waste it across your entire team at a far higher rate.

One of the great challenges of managing an engineering team is deriving metrics from their performance that “prove” anything. One of the few “un-gameable” metrics I’ve learned about is measuring the time it takes for a code change to be deployed to customers after an engineer starts work on it. One term for it is Mean Lead Time for Changes (MLTC). Since implementing a sound process that included test automation and a dedicated QA stakeholder, my team’s MLTC dropped like a rock, from more than two weeks down to two days. And they don’t even have continuous deployment hooked up yet!
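The metric itself is simple to compute: average the gap between when work starts on a change and when that change reaches customers. A minimal sketch in Python, using made-up timestamps (a real pipeline would pull work-start times from an issue tracker and deploy times from a deployment log):

```python
from datetime import datetime, timedelta

# Hypothetical change records: (work started, deployed to customers).
# These values are illustrative only.
changes = [
    (datetime(2023, 5, 1, 9, 0), datetime(2023, 5, 3, 16, 0)),
    (datetime(2023, 5, 2, 10, 0), datetime(2023, 5, 4, 11, 30)),
    (datetime(2023, 5, 5, 8, 0), datetime(2023, 5, 6, 17, 0)),
]

def mean_lead_time(changes):
    """Mean Lead Time for Changes: average of (deployed - started)."""
    total = sum((done - start for start, done in changes), timedelta())
    return total / len(changes)

print(mean_lead_time(changes))  # a timedelta just under two days
```

Tracked over time, a falling MLTC is hard to fake: every handoff, rework loop, and late-caught bug shows up in it.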

There are no 10x engineers, but there are 10x engineering teams if you have the patience and skill to build them up. Trying to do that without a quality stakeholder is a fool’s errand.