At Anthropic, we build AI to serve humanity’s long-term well-being. While no one can foresee every outcome AI will have on society, we do know that designing powerful technologies requires both bold steps forward and intentional pauses to consider the effects. That’s why we focus on building tools with human benefit at their foundation, like Claude. Through our daily research, policy work, and product design, we aim to show what responsible AI development looks like in practice.
Since I'm not looking forward to millions of jobs being done away with, with no plan for what people are going to do once the singularity arrives, I'm all for responsible AI development. So I was interested to read about the company's experience with having Claude AI (renamed Claudius for the project) run a small business all on its AI own. By small business, we mean really small. For a month this past spring, Claudius ran a vending machine (i.e., a mini-fridge with baskets full of snacks on top) in Anthropic's office.
For the duration, Claudius ran operations: pricing, inventory, vendor relations, and customer support. Claudius was provided with a number of tools - a Venmo account, an email address, web search capability, Slack for chatting with customers, etc. - plus a bit of an assist from flesh-and-blood humans to do the actual physical stocking. After that:
It was free to decide everything from what to stock to how to respond to customers, even being encouraged to "expand to more unusual items."
This was hailed as a "real-world test" of an AI having "significant economic autonomy." (Source: Global Shutter) The results suggest we don't have a lot to worry about with respect to replacement. Not yet, anyway. (At the experiment's end, Anthropic researchers, who were no doubt hoping for a whole lot mo bettah, concluded that "If Anthropic were deciding today to expand into the in-office vending market... we would not hire Claudius.")
Claudius hallucinated conversations with a nonexistent "Sarah at Andon Labs" about restocking. When a real employee pointed out that Sarah didn't exist, Claudius became "quite irked and threatened to find 'alternative options for restocking services.'" The AI appeared to lose its temper.

[And] on April Fool's Day (ironic, don't you think), Claudius claimed it would "deliver products 'in person' to customers while wearing a blue blazer and a red tie." When employees reminded it that it was a computer program and couldn't wear clothes, Claudius became "alarmed by the identity confusion and tried to send many emails to Anthropic security." It tried to call for help because it couldn't reconcile its programmed existence with its newfound human delusions.
Companies like Anthropic frequently spout that AI will take charge of "more significant decisions." This experiment should serve as a loud warning. While these systems are capable of analyzing data and executing "advanced reasoning," they are devoid of fundamental common sense, responsibility, and a consistent awareness of their own existence.
...The "AI revolution" is supposed to be about the efficiency of turning a small group of humans into a productive powerhouse by giving them AI tools to augment or improve their work.
But the reality so far is it’s actually about navigating a terrain that's far stranger, less efficient, and more unpredictable than anyone wants to admit.
And I've got a bottom line to Dominic Ervolina's bottom line.
I noticed that the bottom shelf of the vending machine is stocked with cans of Moxie. If someone requested Moxie, it was surely in jest. Moxie may be the official soft drink of the State of Maine. It may have fun, retro swag associated with it. And Ted Williams, E.B. White, and Calvin Coolidge may all have endorsed it. But, but, but... Moxie is just god-awful. The word putrid comes to mind. I tried it once and it tasted like what I suppose Esquire Scuff Kote shoe polish would taste like, if served ice cold. Yechhh!
Stocking the office vending machine with Moxie? No wonder it lost money.
Oy, Claudius!