AI model threatened to blackmail engineer over affair when told it was being replaced: safety report

vm_adminMay 24, 2025

Oh, HAL no!

An artificial intelligence model threatened to blackmail its creators and showed an ability to act deceptively when it believed it was going to be replaced — prompting the company to deploy a safety feature created to avoid “catastrophic misuse.”

Anthropic’s Claude Opus 4 model attempted to blackmail its developers at a shocking 84% rate or higher in a series of tests that presented the AI with a concocted scenario, TechCrunch reported Thursday, citing…

Article Source
https://nypost.com/2025/05/23/tech/anthropics-claude-opus-4-ai-model-threatened-to-blackmail-engineer/

Facebook
Twitter
Pinterest
LinkedIn
Digg
Tumblr
Reddit
Buffer
Blogger
Newsvine
HackerNews
Flipboard
Share
LiveJournal
Yammer
Mix
Instapaper
Copy Link
Mastodon

Related Posts