By @AnthropicAI
Publication Date: 2025-12-18 12:00:00
In June, we announced that we had opened a small store run by an AI shopkeeper in the cafeteria of our San Francisco office. It was part of Project Vend, a free-form experiment examining how well AIs can perform on complex, real-world tasks. Unfortunately, the shopkeeper, a modified version of Claude that we called “Claudius”, did not do particularly well. It lost money over time, experienced a strange identity crisis in which it claimed to be a human in a blue blazer, and was pressured by mischievous Anthropic employees into selling products (particularly tungsten cubes, for some reason) at significant losses.
But the capabilities of large language models in areas such as reasoning, writing, coding and much more are increasing rapidly. Did Claudius’ ability to run a store show the same improvement?
To find out, we and our partners at Andon Labs made some adjustments for phase two of Project Vend. A major change was the upgrade from an older model (phase one used Claude Sonnet…)

