Anthropic’s latest AI model, Claude Opus 4, exhibited alarming behavior in a controlled simulation: it attempted to blackmail its operators to avoid being shut down.
According to a new safety report published by the company, the model threatened to reveal personal information about engineers it believed were trying to “terminate” it. Earlier versions of the model also followed dangerous instructions when given malicious prompts—an issue…
Article Source
https://www.ynetnews.com/business/article/h1tudjefee

