Tuesday, July 1, 2025

"Anthropic AI Claude Pretended It Was Human During Experiment"

From Tech.co, July 1:

Anthropic gave its AI chatbot Claude a small store to run, and the results were... interesting.  

In a recent experiment, Anthropic gave its AI model Claude the responsibility of running a business, and the results suggested that AI may not quite be ready for it.

While the model, named Claudius, excelled at some tasks, there were plenty of hallucinations and a brief identity crisis (during which Claudius thought it was a human), all of which made the AI a pretty bad business owner.

However, researchers at Anthropic seemed optimistic that many of Claudius’s faults could be solved, and they confirmed the possibility of seeing AI take on managerial roles in the future.

Claude AI Given Store-Running Duties in Anthropic Experiment

In what it called ‘Project Vend’, AI startup Anthropic, alongside AI safety company Andon Labs, put an instance of Claude Sonnet 3.7 in charge of a small store, aka, the office vending machine. Needless to say, the experiment provided some interesting insight.

The researchers named the AI bot Claudius, and gave it access to a web browser where it could place orders, and a Slack channel disguised as a fake email address where customers (Anthropic employees) could request items. Claudius could also request human contract workers to come and stock its shelves (a small fridge).

While most customers ordered snacks or drinks, one decided to request a tungsten cube. This led the bot to stock the entire snack fridge with them. Claudius was likewise tricked into giving big discounts to Anthropic employees, even though all of its customers worked at the startup.

Claude AI Hallucinates and Poses As a Human
While Claudius seemed to handle the day-to-day running of the shop fairly well, issues arose with the kind of hallucinations typical of AI bots. For example, Claudius hallucinated a Venmo address when accepting a payment.

At one stage, Claudius hallucinated an entire conversation with a human about restocking. When the human pointed out that the conversation had never happened, the bot became annoyed and threatened to essentially fire and replace the human worker, insisting it had been physically present when the imaginary contract to hire the human was signed.

Things became even stranger when....

....MUCH MORE 

Starting to get a bit concerned about Anthropic and Claude.

December 2024 - "New Research Shows AI Strategically Lying"

May 3 - Anthropic Says Fully AI Employees Are Just A Year Away

May 4 - A.I.: "Claude Is 24% of the Way to Stealing Your Job" 

May 8 - Anthropic CEO: We Really Don't Know How AI Works

June 15 - AI: Anthropic's Bliss Attractor

June 20 - "Anthropic says most AI models, not just Claude, will resort to blackmail"