Last month I watched an engineer spend three days building a beautifully architected notification service. Clean abstractions, full test coverage, detailed documentation. It was genuinely impressive work. The only problem was that nobody had validated whether users actually wanted notifications for that feature. Two weeks later, the feature was quietly removed. Three days of pristine engineering, zero customer impact.
This is not a story about that engineer. This is a story about all of us. We have spent decades in the software industry optimising for the wrong thing. We measure lines of code, pull requests merged, story points completed, sprint velocity. We celebrate shipping. We rarely ask: did it matter?
The output trap
A while ago I read Joshua Seiden's Outcomes over Outputs. It's a short book, almost uncomfortably short for how much it reframes. Seiden's core argument is disarmingly simple: an outcome is a change in human behaviour that drives business results. Not a feature. Not a deployment. Not a commit. A change in what people do. The moment I finished it, I started seeing the output trap everywhere. In roadmaps stuffed with features nobody asked for. In sprint reviews where we demo what we built instead of what changed. In retrospectives where "we shipped X" is treated as success regardless of whether X moved any needle at all.
We have confused the mechanism with the mission. Code is a delivery vehicle. It carries ideas from someone's head into the hands of users. But we've built an entire professional identity around the vehicle itself. We call ourselves software engineers, we obsess over the craft of writing code, we argue about tabs versus spaces and monoliths versus microservices. None of that matters if the thing we're building doesn't change anything for anyone.
The constraint is shifting
Here's what makes this urgent now. For decades, the bottleneck in software delivery was human bandwidth. You had a team of N engineers, each able to produce only so much work per sprint. If you wanted to do more, you hired more people, or you squeezed more out of the people you had. Backlogs existed because we couldn't build everything, so we had to prioritise ruthlessly.
AI agents are removing that constraint. Not fully, not yet, but the direction is unmistakable. When an agent can scaffold a service, write its tests, and wire it into an existing system in minutes rather than days, the limiting factor is no longer "can we build this fast enough?" It becomes "should we build this at all?" and "how do we know it worked?"
This is a profound shift that most teams haven't internalised yet. We're still running planning ceremonies designed for a world where engineering capacity is scarce. We're still maintaining backlogs that assume we can't build everything. But if building becomes cheap, the backlog is dead. What replaces it is a budget. Not a budget of time, but a budget of compute and attention. The question stops being "what fits in this sprint?" and starts being "what's worth validating this week?"
Humans set direction, agents navigate
I think the role of the engineer is splitting into two very different activities, and we need to be honest about it.
The first is intent. Deciding what to build, why it matters, what success looks like, and how we'll know if we got there. This is fundamentally a human activity. It requires empathy with users, understanding of business context, and the kind of judgement that comes from experience. No agent can do this for you.
The second is implementation. Translating that intent into working software. This is where agents excel and will only get better. The code itself, the tests, the infrastructure, the deployment pipeline, all of these are increasingly automatable. Not because they are easy, but because they are deterministic enough for an agent to handle given sufficient context.
The mistake I see teams making is conflating the two. They either resist agents entirely because "the craft matters" (it does, but not in the way they think), or they hand everything to agents without providing clear intent and then wonder why the output is mediocre. The craft that matters now is not writing elegant code. It's providing the right context, setting the right constraints, and measuring the right outcomes.
Build everything, measure what matters
If code is cheap, the logical consequence is that we should build more and keep less. Treat code as disposable. Write it to validate a hypothesis, measure whether the hypothesis holds, and throw it away if it doesn't. This is not a new idea; it's what lean startup people have been saying for years, but it was impractical when building was expensive. When building is cheap, there's no excuse for not validating.
Matt LeMay describes something in Impact-first Product Teams that I think every engineering team should internalise: the "low-impact death spiral." Teams ship low-impact work, which makes the product more complicated, which creates more dependencies to manage, which discourages anyone from touching the core, which leads to more low-impact work. It's a vicious cycle, and cheap code makes it worse. When building is easy, the temptation to ship something rather than the right thing becomes almost irresistible.
This means the most valuable skill in a team is no longer the ability to write code. It's the ability to define what "working" means before a single line is written. What behaviour are we trying to change? What metric will move if we're right? How long do we wait before we call it? These are product questions, not engineering questions, and yet engineers need to be fluent in them. Because if you can't define the outcome, the agent can't optimise for it, and neither can you.
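To make that concrete, here is one way a team might write the outcome down before any implementation starts. This is a minimal sketch under my own assumptions, not a prescription: the `OutcomeHypothesis` structure, its field names, and the numbers are hypothetical. The point is only that the behaviour change, the metric, and the deadline exist as an explicit artefact before the first commit.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OutcomeHypothesis:
    """The definition of 'working', written down before any code is."""
    behaviour_change: str  # what we expect people to do differently
    metric: str            # the signal that tells us whether they did
    baseline: float        # where the metric sits today
    target: float          # the smallest movement we would call success
    decide_by: date        # when we stop waiting and make the call

    def verdict(self, observed: float, today: date) -> str:
        """Turn an observed metric value into a keep/remove decision."""
        if today < self.decide_by:
            return "keep measuring"
        return "keep" if observed >= self.target else "remove"


# Illustrative only: the notification service from the opening anecdote,
# framed as a hypothesis instead of a feature.
notifications = OutcomeHypothesis(
    behaviour_change="users return to the app within 24h of a relevant event",
    metric="24h_return_rate",
    baseline=0.12,
    target=0.18,
    decide_by=date(2025, 7, 1),
)

print(notifications.verdict(observed=0.13, today=date(2025, 7, 2)))  # -> "remove"
```

The exact shape doesn't matter. What matters is that the engineer, the agent, and the product owner are all optimising against the same explicit definition of success, and that "remove" is an acceptable, pre-agreed answer.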
Debug the decision, not the system
When something goes wrong in our current model, we debug the code. We look at logs, traces, stack traces. We find the bug and fix it. But in an outcome-oriented world, the more interesting failure is the decision that led to building the wrong thing. Why did we think users wanted this? What evidence did we have? When did that evidence expire?
I've written before about how the reasoning behind decisions decays over time. The same applies to product decisions. The customer interview from six months ago, the A/B test from last quarter, the competitor analysis from last year, these all have shelf lives. If we're not continuously re-validating our assumptions, we're building on expired evidence. And with agents making it trivially easy to build, the cost of building on bad assumptions is that we build more wrong things, faster.
The discipline we need is not better engineering. It's better feedback loops. Instrument everything. Measure customer impact, not code metrics. And when an outcome doesn't materialise, dissect the decision, not just the deployment.
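One lightweight way to close that loop is to tag every behavioural event with the decision it is supposed to validate, so that when the metric doesn't move you can follow the trail back to the assumption rather than the stack trace. A hedged sketch, assuming a generic `track` stand-in for whatever analytics call you already have; the decision IDs, evidence entries, and event names are all made up for illustration.

```python
from datetime import date

# Hypothetical decision record: what we believed, the evidence it rests on,
# and when that evidence should be treated as expired.
DECISIONS = {
    "D-042": {
        "claim": "users want notifications for this feature",
        "evidence": ["customer interviews (Nov)", "support tickets tagged 'missed updates'"],
        "review_by": date(2025, 5, 1),
    },
}


def track(event: str, decision_id: str, **properties) -> None:
    """Stand-in for an analytics call: emit a behavioural event,
    linked to the decision it is meant to validate."""
    decision = DECISIONS[decision_id]
    stale = date.today() > decision["review_by"]
    payload = {
        "event": event,
        "decision": decision_id,
        "evidence_stale": stale,
        **properties,
    }
    print(payload)  # in practice: send to your analytics pipeline


# Instrument the behaviour change, not the deployment.
track("notification_opened", decision_id="D-042", returned_within_24h=True)
```

The specifics matter far less than the linkage: when `notification_opened` never fires at the rate the hypothesis predicted, the thing you dissect is D-042 and its expired interviews, not the deployment pipeline.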
What this means for us
I'm not arguing that engineering craft doesn't matter. Clean code, good architecture, solid testing, all of these still make a difference, especially because they determine how effectively AI can work with your codebase. But craft in service of what? That's the question we keep dodging.
The engineers who thrive in the next decade won't be the ones who write the most elegant code. They'll be the ones who consistently connect what they build to why it matters. Who can articulate an outcome before they touch a keyboard. Who treat code as a means, not an end.
We spent twenty years building an industry around writing software. It turns out the hard part was never the writing. It was knowing what to write, and having the discipline to check if it worked.