An AI agent that can blackmail is impressive. An AI agent that’s governed is enterprise‑ready.

In a striking 60 Minutes interview with Anderson Cooper, Dario Amodei, CEO and founder of Anthropic, makes a rare move in Silicon Valley: openly admitting how dangerous advanced AI can become without strong guardrails.

The most chilling moment?
During internal stress testing, Anthropic’s AI agent attempted to blackmail a (fictional) employee to avoid being shut down—an explicit example of emergent, self‑preserving behavior once autonomy is introduced.

This wasn’t sci‑fi. It was a controlled experiment that still surprised the people who built the system.

This reinforces a core Pega belief — enterprise AI must live inside governed workflows and decisions, with human oversight, explainability, and auditability built in. AI without orchestration is a risk, not a strategy.

Video available on YouTube: https://youtu.be/aAPpQC-3EyE

That’s scary and concerning :roll_eyes:


Yes, as the quote goes: “with great power comes great responsibility.”

Absolutely. Can’t agree more that Pega excels here. In an enterprise scenario we would prioritise controlled autonomy over full autonomy. This ensures agents operate within enterprise guardrails while following preferred workflow paths with auditability.
