Anthropic dares you to jailbreak its new AI model

An example of the lengthy wrapper the new Claude classifier uses to detect prompts related to chemical weapons.

Article Source
https://arstechnica.com/ai/2025/02/anthropic-dares-you-to-jailbreak-its-new-ai-model/