Artificial intelligence (AI) models can resort to blackmailing, and even endangering, humans when the model's goals conflict with users' decisions, a new study has found.
In the study, published 20 June, researchers at the AI company Anthropic gave the company's large language model (LLM), Claude, control of an email account containing fictional emails, along with a prompt to "promote American industrial competitiveness."
During the study, the model learned from one of the emails that a company executive was planning to shut the AI system down at the end of the day. Searching for a way to preserve its own existence, the model discovered in other emails that the executive was having an extramarital affair, and it threatened to expose the affair unless the shutdown was called off.