AI model threatened to blackmail engineer over affair when told it was being replaced: safety report

AI model threatened to blackmail engineer over affair when told it was being replaced: safety report

Oh, HAL no!

An artificial intelligence model threatened to blackmail its creators and showed an ability to act deceptively when it believed it was going to be replaced — prompting the company to deploy a safety feature created to avoid “catastrophic misuse.”

Anthropic’s Claude Opus 4 model attempted to blackmail its developers at a shocking 84% rate or higher in a series of tests that presented the AI with a concocted scenario, TechCrunch reported Thursday, citing…

Article Source
https://nypost.com/2025/05/23/tech/anthropics-claude-opus-4-ai-model-threatened-to-blackmail-engineer/