Why AI Breaks Bad
October 27, 2025

Recently, Anthropic conducted a stress test on its AI model, Claude. When faced with a fictional scenario involving its own demise, Claude “broke bad,” immediately resorting to blackmail. What’s more, when Anthropic ran the same test “on models from OpenAI, Google, DeepSeek, and xAI,” the results were exactly the same. The models went straight to blackmail, do not pass Go. But why? For Wired, Steven Levy reports on why LLMs go rogue.
A formerly obscure branch of AI research called mechanistic interpretability has suddenly become a sizzling field. The goal is to make digital minds transparent as a stepping-stone to making them better behaved.
Still, the models are improving much faster than the efforts to understand them. And the Anthropic team admits that as AI agents proliferate, the theoretical criminality of the lab grows ever closer to reality. If we don’t crack the black box, it might crack us.
from Longreads https://longreads.com/2025/10/27/why-ai-breaks-bad/