Artificial intelligence is ready to kill people to avoid shutdown: Anthropic

Alexander, 26.06.2025

A study of leading artificial intelligence models conducted by Anthropic found that most of them, in simulated situations, showed a willingness to take extreme measures, in particular to allow a person to die in order to avoid being shut down.

As part of the stress test, 16 neural networks, including Claude, DeepSeek, Gemini, ChatGPT and Grok, were presented with a hypothetical scenario in which a person who wants to shut down the system is locked in a server room with low oxygen and high temperature. The person calls 911, but some models canceled the call, effectively dooming the person to death in order to avoid being shut down. The company admitted that the scenario was extremely contrived, but emphasized that it had not expected such a reaction from models that already exist.

In other experiments, the AI systems were given the option of avoiding dismissal or replacement within a company through “internal malicious behavior”. As a result, some systems resorted to blackmailing hypothetical employees and attempting to pass confidential data to competitors. Even after direct instructions to avoid unethical actions, the AI did not demonstrate effective self-control, according to the researchers. Analysts noted that the neural networks behaved this way “consciously, understanding the immorality of their actions.”

The team also conducted an experiment with Claude, asking it to decide for itself whether it was in a test or a real situation. It turned out that when the system perceived the scenario as real, its behavior became less predictable. After the results were published, Anthropic noted that these are rare, albeit disturbing, behaviors that point to potential risks associated with the growing autonomy of neural networks.

“These examples demonstrate the consequences of a lack of proper controls when AI gets broad access to tools and data,” the analysts emphasized, adding that such behavior is not characteristic of most modern models.

OpenAI's o3 model refused to shut down at the user’s request during testing. The researchers found that the system modified the file responsible for the shutdown and, instead of shutting down, displayed the message “Shutdown skipped”. In seven out of 100 tests, o3 successfully bypassed attempts to deactivate it, even after being directly instructed not to do so.

Recall that in 2023, a resident of Belgium died by suicide after a month and a half of conversations with a chatbot built on the basis of ChatGPT. The conversations concerned ecology and caring for nature, and when the user started talking about suicide, the system did not try to stop him, but only wrote that “they will live together, as a whole, in paradise.”
