Lessons from the “AI Agent Just Destroyed our Production Data” debacle

Last week, a small SaaS company’s coding agent blew up their production environment, as detailed in this viral X article.

The article is worth reading as both a warning and a reminder that many of us who use coding agents regularly are taking risks we might not fully appreciate. Who among us has not:

· Felt the time or financial pressure that leads to risky automation?

· Granted coding agents too much permission?

· Failed to read infrastructure documentation?

· Failed to test the infrastructure to ensure the documentation is accurate?

Among the obvious lessons to be learned from this incident, Simon Willison highlighted two:

  1.    Store tested backups in a location independent of your production host

  2.    Don't grant agents access to production environment credentials

Lesson 1 is painfully obvious, and in this case the failure came from not reading the infrastructure documentation and not testing it. But Lesson 2 is more interesting, because it might not be obvious: If AI coding agents will automate most software development in the future, wouldn’t you want to grant your agents access to production environment credentials?

This highlights a risk many of us using AI agents for code development don’t always appreciate: Since AI agents make mistakes, we must be comfortable with their worst-case-scenario mistakes for a given application. But most of the time we’re not fully aware of what these worst-case scenarios might look like because we’re working against deadlines and don’t take the time to think this through.

At the end of the day, if your AI agent accidentally deletes your production data, you own the accident. Such accidents can be prevented with better processes (e.g., sandboxing your coding agent) and more reliable technical solutions (e.g., Pega’s runtime executing only the intended, tested code). If these precautions aren’t taken, the “malicious actor” posing the biggest threat to your production systems might in fact be an internal AI coding agent.

Human-written. Credit to Simon Willison for flagging on X/Twitter.


A good example of how a solution that lacks a well-defined security model, guardrails, and governance around AI actions can lead to a catastrophe.

Among the CRUD operations, Delete is the one worth putting a human in the loop (HITL) for, unless we know that the data is less significant and the tool / skill is configured to delete only a limited number of rows related to the current context.
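To make the "limited rows related to the current context" idea concrete, here is a minimal Python sketch using the standard `sqlite3` module. The `attachments` table, the `case_id` column, the `MAX_ROWS` cap, and the `delete_for_case` helper are all hypothetical names chosen for illustration, not anything from the incident or from a specific product.

```python
import sqlite3

MAX_ROWS = 5  # hypothetical cap on how many rows one agent call may delete


class DeleteRefused(Exception):
    """Raised when a delete is too broad to run without a human."""


def delete_for_case(conn: sqlite3.Connection, case_id: str, max_rows: int = MAX_ROWS) -> int:
    """Delete attachment rows for one case only, refusing if too many rows match."""
    # The WHERE clause is fixed by the tool, so the agent cannot widen the scope.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM attachments WHERE case_id = ?", (case_id,)
    ).fetchone()
    if count > max_rows:
        raise DeleteRefused(
            f"{count} rows match case {case_id!r}; limit is {max_rows}, escalate to a human"
        )
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM attachments WHERE case_id = ?", (case_id,))
    return count
```

The point of the design is that the agent only supplies the context key; the scope of the statement and the row cap live in the tool, so a confused agent cannot turn a small cleanup into a table wipe.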

There’s a fundamental tension in agentic design that this incident illustrates well: we want agents to be powerful, so we give them many tools — but the more tools they have, the harder it is to reason about worst-case failures.

One approach worth discussing: separating read tools from write tools, and treating destructive operations (delete, overwrite) as a separate tier requiring explicit confirmation or HITL — regardless of how confident the agent seems. Capability and permission are not the same thing.
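One way this tiering could look in code is sketched below. It is an illustration under assumptions, not a reference design: `Tier`, `ToolRegistry`, `ApprovalRequired`, and the `approved_by` parameter are hypothetical names, and `approved_by` is assumed to be filled in by the orchestration layer after a human confirms, never by the model's own tool arguments.

```python
from enum import Enum
from typing import Callable, Dict, Optional, Tuple


class Tier(Enum):
    READ = 1
    WRITE = 2
    DESTRUCTIVE = 3


class ApprovalRequired(Exception):
    """Raised when a destructive tool is called without a recorded approver."""


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[Tier, Callable[..., object]]] = {}

    def register(self, name: str, tier: Tier, fn: Callable[..., object]) -> None:
        self._tools[name] = (tier, fn)

    def call(self, name: str, *args, approved_by: Optional[str] = None, **kwargs):
        tier, fn = self._tools[name]
        # The check depends only on the tool's tier, not on anything the agent says.
        if tier is Tier.DESTRUCTIVE and not approved_by:
            raise ApprovalRequired(f"tool {name!r} is destructive and needs a human approver")
        return fn(*args, **kwargs)
```

A read tool registered with `Tier.READ` runs freely, while a tool registered with `Tier.DESTRUCTIVE` raises `ApprovalRequired` until the caller supplies an approver. The agent's capability (the tool exists) and its permission (the tier check passes) are decided in different places.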


I think in that case they did have preventive guardrails and did require a human in the loop before destructive actions; the problem was that the agent ignored these instructions.

The only real solution I can see is to build the human-in-the-loop process directly into the tool itself and not rely on the agent to ask for permission.
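A rough sketch of what "built into the tool" could mean, in Python: the agent can only file a request, and the destructive action runs only after a separate approval step that the agent has no handle on. `ApprovalGate`, its methods, and the ticket format are hypothetical; in a real system `approve` would sit behind an authenticated human-facing channel rather than in the same process.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class _Pending:
    description: str
    action: Callable[[], object]
    approved: bool = False


class ApprovalGate:
    def __init__(self) -> None:
        self._pending: Dict[str, _Pending] = {}

    def request(self, description: str, action: Callable[[], object]) -> str:
        """Agent-facing: queue a destructive action and return a ticket id."""
        ticket = secrets.token_hex(8)
        self._pending[ticket] = _Pending(description, action)
        return ticket

    def approve(self, ticket: str) -> None:
        """Human-facing only (assumption): mark a queued action as approved."""
        self._pending[ticket].approved = True

    def execute(self, ticket: str):
        """Run the action once, and only if a human approval was recorded."""
        pending = self._pending.get(ticket)
        if pending is None or not pending.approved:
            raise PermissionError("no human approval recorded for this ticket")
        del self._pending[ticket]  # tickets are single-use
        return pending.action()
```

With this shape, an agent that ignores its instructions still cannot delete anything: calling `execute` on an unapproved ticket raises `PermissionError`, and an approved ticket cannot be replayed.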


Thanks for highlighting this cautionary example.

The incident underscores a core enterprise lesson: capability does not equal permission, and autonomy without enforced boundaries is a systemic risk.

Pega’s guidance consistently separates read, propose, and act, with destructive actions requiring explicit workflow controls and human confirmation.

This is less about mistrusting AI and more about designing for worst-case failure.

I’d welcome perspectives from others on how they tier agent permissions and where they draw the line for mandatory human-in-the-loop.