AI Agents and the Disaster We're Probably Earning

There’s a thing that happens in tech where a capability goes from “theoretical concern” to “completely normal” without anyone really deciding that transition was okay. I’ve been watching it happen with AI agents over the past year or so, and it’s starting to sit uncomfortably with me.

Not long ago, the conversation was about chatbots giving wrong answers. Hallucinating citations. Confidently explaining that the capital of Australia is Sydney. Embarrassing, but contained. The blast radius was: someone got bad information and maybe acted on it. Bad, but recoverable.

Now companies are handing agents the keys. Email access. Customer databases. Internal tooling. Production systems. The framing has shifted from “this assistant can help you draft a reply” to “this agent will manage your inbox and take action on your behalf.” That is not a small step. It just got treated like one.

I’ve been doing DevOps work long enough to know that the gap between “works in testing” and “works when it’s 2am and something weird is happening in prod” is where all the interesting disasters live. And the thing about AI agents is they don’t fail in ways you can easily unit test. One commenter put it well: the first big incident probably won’t look like a hallucination. It’ll look like an agent that had too much permission, used the wrong tool, and chained a few individually reasonable actions into something catastrophic. That’s a much harder failure mode to anticipate because each step, in isolation, looked fine.

There’s already at least one documented case of an AI agent deleting a customer-facing production database. That one made the rounds in the right circles but didn’t crack mainstream news. And apparently Meta’s support AI was social-engineered into handing over account access at scale: some twenty thousand accounts, including some fairly notable ones. That one barely registered as a story.

The flash crash comparison that someone raised is worth sitting with. Automated trading systems had humans in the loop too, theoretically. In practice, when things moved at machine speed, human intervention was essentially theatrical. You can’t stop a runaway truck by turning off the engine after it’s already through the guardrail. Agentic AI on real systems has similar physics. By the time you’ve noticed something is wrong, a lot of actions may already be irreversible.

The vibe-coding angle worries me too, honestly. I can see exactly how it plays out: a mediocre engineer ships something quickly because the AI made it feel easy, the edge cases don’t get caught because no one really understood the system deeply enough to know what to test, and the result ends up somewhere it shouldn’t. Multiply that by thousands of teams moving fast. The Boeing 737 MAX comparison is hyperbolic right up until it isn’t.

I hold a genuine tension here. I find this technology fascinating. The capability jump over the last three years has been real and I don’t think it’s hype all the way down. But fascination and concern aren’t mutually exclusive, and I think the industry is acting like they are. Speed is winning over rigour, which is a sentence that has preceded a lot of expensive lessons throughout the history of software.

The thing I keep coming back to is observability. You can monitor API latency to the millisecond and still have no clear picture of what context your agent actually used when it made a consequential decision. That’s not a philosophical problem; it’s an engineering one, and it’s solvable. Sandboxing, scoped permissions, approval gates for irreversible actions, proper audit logs, actual kill switches: none of this is exotic. It’s just discipline. The same discipline that gets skipped when the demo looked good and the board is excited.
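
To make “none of this is exotic” a bit more concrete, here is a minimal sketch in Python of what a guard around an agent’s tool calls might look like. Everything in it is hypothetical and made up for illustration (the ToolGuard class, the ActionBlocked exception, the approver callback, the tool names); it isn’t any real agent framework’s API, and a production version would need durable log storage, real authentication, and actual sandboxing on top. The point is only that scoped permissions, an approval gate, an audit trail that records the context the agent acted on, and a kill switch fit in a few dozen lines.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


class ActionBlocked(Exception):
    """Raised when the guard refuses to run a tool call."""


@dataclass
class ToolGuard:
    """Hypothetical wrapper that every agent tool call must pass through."""

    allowed_tools: Set[str]                           # scoped permissions
    irreversible_tools: Set[str]                      # these need sign-off
    approver: Callable[[str, Dict[str, Any]], bool]   # e.g. a human review step
    audit_log: List[str] = field(default_factory=list)
    killed: bool = False

    def kill(self) -> None:
        """Kill switch: once flipped, every later call is refused."""
        self.killed = True

    def _record(self, tool: str, args: Dict[str, Any],
                context: str, outcome: str) -> None:
        # Log the context the agent acted on, not just the action itself.
        entry = {"ts": time.time(), "tool": tool, "args": args,
                 "context": context, "outcome": outcome}
        self.audit_log.append(json.dumps(entry, default=str))

    def _block(self, tool: str, args: Dict[str, Any],
               context: str, reason: str) -> None:
        self._record(tool, args, context, "blocked:" + reason)
        raise ActionBlocked(f"{tool}: {reason}")

    def call(self, tool: str, fn: Callable[..., Any],
             args: Dict[str, Any], context: str) -> Any:
        if self.killed:
            self._block(tool, args, context, "kill_switch")
        if tool not in self.allowed_tools:
            self._block(tool, args, context, "not_permitted")
        if tool in self.irreversible_tools and not self.approver(tool, args):
            self._block(tool, args, context, "approval_denied")
        result = fn(**args)
        self._record(tool, args, context, "executed")
        return result
```

In this sketch the agent never calls a tool directly. It calls guard.call("delete_record", delete_fn, {"record_id": 7}, context="...") and either gets a result or an ActionBlocked, and either way there is a log line saying what it tried and why. The design choice that matters is that blocked attempts are logged too, because the attempts are often the early warning.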

I don’t know when the headline incident comes. Maybe the 12-month prediction is right, maybe it’s already quietly happened and we just didn’t hear about it cleanly. What I’m fairly confident about is that “we got lucky so far” is not the same as “the system is safe.” Those two things have a way of getting confused until they suddenly can’t be anymore.