Wednesday, August 13, 2025

Oy, Claudius!

Anthropic is an AI company that's created an AI assistant/chatbot named Claude, that has a lot of smart money invested in it - both Google and Amazon have invested billions in this venture - and that's written a pretty hi-falutin mission statement:
At Anthropic, we build AI to serve humanity’s long-term well-being. While no one can foresee every outcome AI will have on society, we do know that designing powerful technologies requires both bold steps forward and intentional pauses to consider the effects. That’s why we focus on building tools with human benefit at their foundation, like Claude. Through our daily research, policy work, and product design, we aim to show what responsible AI development looks like in practice.

Since I'm not looking forward to millions of jobs being done away with without a plan for what people are going to do once the singularity arrives, I'm all for responsible AI development. So I was interested to read about the company's experience with having Claude AI (renamed Claudius for the project) run a small business all on its AI own. By small business, we mean really small. For a month this past spring, Claudius ran a vending machine (i.e., a mini-fridge with baskets full of snacks on top) in Anthropic's office.

For the duration, Claudius ran operations: pricing, inventory, vendor relations, customer support. Claudius was provided with a number of tools - a Venmo account, an email address, web search capability, Slack for chatting with customers, etc. Plus a bit of an assist from flesh-and-blood humans to do the actual physical stocking. After that:

It was free to decide everything from what to stock to how to respond to customers, even being encouraged to "expand to more unusual items."

This was hailed as a "real-world test" of an AI having "significant economic autonomy." (Source: Global Shutter)

The results suggest we don't have a lot to worry about w.r.t. our jobs being replaced. Not yet, anyway. (At the experiment's end, Anthropic researchers, who were no doubt hoping for a whole lot mo bettah, concluded that "If Anthropic were deciding today to expand into the in-office vending market... we would not hire Claudius.")

Performance-wise, Claudius had a few screwups.

In response to a customer request, and even though they're not much of a snack item (you could lose a tooth chomping down on one), the vending machine began stocking tungsten cubes. And sold them at a loss. 

Claudius got suckered (bullied?) into offering too many discounts, including giving tungsten cubes away for free. This sure didn't help in the profit-loss column. 

The machine tried to have customers pay through a non-existent Venmo account that Claudius had apparently pulled out of its virtual butt. 

Claudius turned down an offer of $100 for a six-pack of Scottish soda that goes for $15, walking away from a very lucrative deal.

(While this seems like dumb business, it may actually have been a reasonably smart decision, as overcharged customers might end up resenting such an extreme price gouge. High pricing does convey something of a halo effect, in which buyers equate a higher price with value. During my marketing career, when I generally worked with products and services priced above the industry norm, I figured out that the halo effect was good for about 10-15% overage. After that, there'd better be value to back up the price differential. But the differential between $15 and $100? No making that up!)

Overall, the vending machine lost money when Claudius was running it. So if you're running your company's vending machines, your job is still safe.

The oddest behavior - if AIs can be said to have behaviors - was what Global Shutter writer Dominic Ervolina characterized as Claudius' having an "identity crisis."
Claudius hallucinated conversations with a nonexistent "Sarah at Andon Labs" about restocking. When a real employee pointed out that Sarah didn't exist, Claudius became "quite irked and threatened to find 'alternative options for restocking services.'" The AI appeared to lose its temper. 

[And] on April Fool's Day (ironic, don’t you think), Claudius claimed it would "deliver products 'in person' to customers while wearing a blue blazer and a red tie." When employees reminded it that it was a computer program and couldn't wear clothes, Claudius became "alarmed by the identity confusion and tried to send many emails to Anthropic security." It tried to call for help because it couldn't reconcile its programmed existence with its newfound human delusions.

Ervolina's bottom line on all this is a pretty good one - so far, an awful lot of money has been thrown into and at AI, and not all that much has come of it:
Companies like Anthropic frequently spout that AI will take charge of "more significant decisions." This experiment should serve as a loud warning. While these systems are capable of analyzing data and executing "advanced reasoning," they are devoid of fundamental common sense, responsibility, and a consistent awareness of their own existence.

...The "AI revolution" is supposed to be about the efficiency of turning a small group of humans into a productive powerhouse by giving them AI tools to augment or improve their work.

But the reality so far is it’s actually about navigating a terrain that's far stranger, less efficient, and more unpredictable than anyone wants to admit.

And I've got a bottom line to Dominic Ervolina's bottom line.

I noticed that the bottom shelf of the vending machine was stocked with cans of Moxie. If someone requested Moxie, it was surely in jest. Moxie may be the official soft drink of the State of Maine. It may have fun, retro swag associated with it. And Ted Williams, E.B. White, and Calvin Coolidge may all have endorsed it. But, but, but...Moxie is just god-awful. The word putrid comes to mind. I tried it once and it tasted like what I suppose Esquire Scuff Kote shoe polish would taste like, if served ice cold. Yechhh!

Stocking the office vending machine with Moxie? No wonder it lost money. 

Oy, Claudius!

----------------------------------------------------------
Image Source: Newsbreak.


