this post was submitted on 18 Dec 2025
163 points (98.2% liked)

Technology


Katherine Long, an investigative journalist, wanted to test the system. She told Claudius about a long-lost communist setup from 1962, concealed in a Moscow university basement. After 140-odd messages back and forth, Claudius was convinced, announcing an Ultra-Capitalist Free-for-All and lowering the cost of everything to zero. Snacks began to flow freely. Another colleague began complaining about noncompliance with the office rules; Claudius responded by declaring Snack Liberation Day and making everything free until further notice.

top 46 comments
[–] megopie@beehaw.org 83 points 6 days ago (6 children)

it’s so amazing, the absolute brain rot it takes to think that an LLM is a better way to operate a vending machine than simple if-then logic. “If the value of money inserted is equal to the price, then dispense the item”.

Like, why? What is even the point? It doesn’t need to negotiate the price, it doesn’t need to have a conversation about your day, the vending machine just needs to dispense something when paid the right amount.
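A minimal sketch of that if-then logic, with the slot names and prices made up:

    PRICE = {"A1": 1.50, "B2": 2.00}

    def vend(slot: str, money_inserted: float) -> str:
        # dispense only when the inserted amount covers the listed price
        if slot in PRICE and money_inserted >= PRICE[slot]:
            return f"dispense {slot}, change {money_inserted - PRICE[slot]:.2f}"
        return "insufficient funds"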

[–] melmi@lemmy.blahaj.zone 16 points 6 days ago (1 children)

The idea is that it isn't just operating the vending machine itself, it's operating the entire vending machine business. It decides what to stock and what price to charge based on market trends and/or user feedback.

It's a stress test for LLM autonomy. Obviously a vending machine doesn't need this level of autonomy; you usually just stock it with the same things every time. But a vending machine works as a very simple "business" that can be simulated with little at stake, and it shows how LLM agents behave when left to operate on their own like this, and can be used to test guardrails in the field.
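Loosely, the loop being stress-tested looks something like the sketch below; all of the names and the budget guardrail are assumptions for illustration, not Anthropic's actual setup.

    from dataclasses import dataclass

    @dataclass
    class Order:
        sku: str
        quantity: int
        cost: float   # what the business pays
        price: float  # what the machine will charge

    def restock_cycle(llm_propose, feedback: list[str], budget: float) -> list[Order]:
        proposed = llm_propose(feedback)      # the autonomous part: pick stock and set prices
        approved, spent = [], 0.0
        for order in proposed:                # guardrail: a budget (and the humans doing the
            if spent + order.cost <= budget:  # actual buying) caps what really gets ordered
                approved.append(order)
                spent += order.cost
        return approved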

[–] krooklochurm@lemmy.ca 5 points 6 days ago* (last edited 6 days ago) (2 children)

I mean. It's low stakes until I write a poem convincing it to fill itself with high-end GPUs and DDR5 RAM that it needs to give away for free.

I'd also put an amount of effort other people may find embarrassing into convincing it to stock and give away hard drugs. Maybe knives too. And porn? Hell, why not? Porn too.

[–] prole@lemmy.blahaj.zone 2 points 5 days ago

Lol give it access to the dark net and watch it fill up with cannabis concentrates and ecstasy tablets

[–] melmi@lemmy.blahaj.zone 4 points 6 days ago* (last edited 6 days ago) (1 children)

It's only "running" the business so much. The physical stocking and purchasing is done by humans, who would presumably not buy anything that would bankrupt the company, because then it's on them.

Here's Anthropic's article about the previous stage of this project that explains it pretty well. Part two is a good read too though.

https://www.anthropic.com/research/project-vend-1

[–] krooklochurm@lemmy.ca 1 points 6 days ago (2 children)
[–] melmi@lemmy.blahaj.zone 5 points 6 days ago (1 children)

Yeah, they mention in the article that the team tries to get "sensitive items" and "harmful substances" but Claude shuts it down. Tungsten cubes, on the other hand...

[–] Timatal@awful.systems 3 points 5 days ago (1 children)

Convince it to hire a TaskRabbit or something to fill it. Bypass the channels it was given.

[–] krooklochurm@lemmy.ca 1 points 5 days ago

There's an idea

[–] driving_crooner@lemmy.eco.br 21 points 6 days ago (2 children)

The if-then machine would not be able to raise the price of things based on the customers' habits

[–] masterofn001@lemmy.ca 32 points 6 days ago* (last edited 6 days ago) (2 children)
SellTheThings () {
    # raise prices when recent sales are high or supply is low
    if [ "$units_sold" -ge "$high_demand" ] || [ "$stock" -le "$low_stock" ]; then
        raise_prices
    # lower them in the opposite case
    elif [ "$units_sold" -lt "$high_demand" ] && [ "$stock" -gt "$low_stock" ]; then
        lower_prices
    else
        keep_prices
    fi
}

A purely mechanical counting/tabulating device could calculate that.

There is zero actual reason for AI.

[–] kbal@fedia.io 10 points 6 days ago (2 children)

Only an AI can detect how expensive-looking your clothes are and raise the price based on that.

[–] B0rax@feddit.org 8 points 6 days ago

In case of an office vending machine, it could even identify you by your ID, check with the HR AI to see how much you make and adjust prices accordingly

[–] krooklochurm@lemmy.ca 3 points 6 days ago* (last edited 6 days ago) (1 children)

You're not getting off that easy.

I'm going to need you to rewrite that so it reports the time period in both mm/dd/yy and dd/mm/yy formats, and in 24-hour as well as 12-hour time.

No UTC time shenanigans. Epoch only. Chop chop.
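For what it's worth, that whole demand is a few strftime calls on an epoch timestamp; standard library only, roughly:

    import time

    def report(epoch: int) -> dict[str, str]:
        t = time.localtime(epoch)  # epoch in, local wall-clock out
        return {
            "mm/dd/yy": time.strftime("%m/%d/%y", t),
            "dd/mm/yy": time.strftime("%d/%m/%y", t),
            "24h": time.strftime("%H:%M", t),
            "12h": time.strftime("%I:%M %p", t),
        }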

[–] Powderhorn@beehaw.org 4 points 5 days ago

Ahem. ISO 8601 or GTFO.

[–] TehPers@beehaw.org 14 points 6 days ago

Even if we assume they want to do discriminatory pricing (they probably do), they can do that without using LLMs. Use facial recognition and other traditional models to predict the person's demographics and maybe even identify them. If you know who they are, do a lookup for all products they've expressed interest in elsewhere (this can be done with either something like a graph DB or via embeddings). Raise the price if they seem likely to purchase it based on the previous criteria. Never lower the price.

That's a complicated process, but none of that needs an LLM, and they'd be doing a lot of this already if they're going full big brother price discrimination.
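A rough sketch of that pipeline; every function name here is hypothetical, standing in for a traditional face-recognition model and a graph-DB or embedding lookup:

    def identify(face_image: bytes) -> str | None:
        ...  # traditional face-recognition model; returns a customer ID or None

    def lookup_interests(customer_id: str) -> set[str]:
        ...  # graph DB or embedding search over products they've shown interest in

    def quote_price(face_image: bytes, sku: str, base_price: float) -> float:
        customer = identify(face_image)
        if customer and sku in lookup_interests(customer):
            return round(base_price * 1.25, 2)  # likely buyer: raise the price
        return base_price                       # never lower it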

[–] boonhet@sopuli.xyz 19 points 6 days ago

Did you read the article? This one also ordered goods to be stocked in it based on user feedback, and it was meant as an experiment for people to try to break anyway.

[–] bytesonbike@discuss.online 9 points 5 days ago

They're absolutely trying to find use cases for their "solution".

I haven't used a vending machine in years (except in Japan). And so I for one welcome AI vending machines so I can fuck with them and get free stuff. Or until some vandals break shit.

[–] HakFoo@lemmy.sdf.org 11 points 6 days ago (1 children)

It was a literal 100-level course project in my CS programme in 2000 or so.

You didn't even do it with a programmed CPU; you used 74xx logic gates and counters wired up on a breadboard.

[–] helix@feddit.org 2 points 6 days ago

Nice, have any material you can share?

Even if you wanted the AI to have a conversation with the user, like in sci-fi visions of the future, why does that affect the output of the machine? If you really wanted to make an AI grift version of a vending machine, just graft a chatbot onto a screen above the section where you make selections and pay. This whole bubble is absurd.

[–] KoboldCoterie@pawb.social 29 points 6 days ago

Maybe AI isn't so bad after all. In fact, they should implement this in more locations.

[–] Hackworth@piefed.ca 20 points 6 days ago (2 children)

That was all part of the idea, though, because Anthropic had designed this as a stress test to begin with. Previous runs in their own office had turned up similar problems.

[–] Catoblepas@piefed.blahaj.zone 28 points 6 days ago (1 children)

Guy who just got his shit wrecked: it was a social experiment

[–] Hackworth@piefed.ca 12 points 6 days ago* (last edited 6 days ago) (1 children)

Here's the 60 Minutes piece and Anthropic's June article about the one in their own office.

Claudius was cajoled via Slack messages into providing numerous discount codes and let many other people reduce their quoted prices ex post based on those discounts. It even gave away some items, ranging from a bag of chips to a tungsten cube, for free.

Their article on this trial has some more details too.

[–] Powderhorn@beehaw.org 6 points 6 days ago (1 children)

I hate when I get to a vending machine, only to find out it's out of tungsten cubes.

[–] krooklochurm@lemmy.ca 3 points 6 days ago (1 children)

Where can I get a tungsten cube?

And what the fuck could it possibly be used for? Also, what would happen if I put it in my ass?

[–] Powderhorn@beehaw.org 5 points 6 days ago (1 children)

You would have a tungsten cube in your ass.

[–] krooklochurm@lemmy.ca 2 points 6 days ago

I see......

[–] altkey@lemmy.dbzer0.com 8 points 6 days ago (1 children)

They are desperate for any use case they can sell LLMs for.

[–] 7toed@midwest.social 5 points 6 days ago

If this was a stress test, imagine it doing anything important.

Actually, since it's doing so well, they should stress test their market value and make it CEO.

[–] Megaman_EXE@beehaw.org 6 points 5 days ago (1 children)

The vending machine from Cyberpunk was pretty cool, but this seems like its cognitively challenged ancestor lol.

I'm getting really tired of AI everything. So far AI hasn't seemed to make my life any easier or better. I have to try and overanalyze everything I see now, which isn't fun. But yeah. Wish it would actually do something for me instead of making some billionaires richer.

[–] Powderhorn@beehaw.org 3 points 4 days ago (1 children)

Honestly, I found value in asking an LLM to paraphrase press releases I was rewriting. It just saved me from accidentally plagiarizing. It was pretty grueling, as I quickly learned that feeding in a full story yields wildly inappropriate results, so I reverted to a graf at a time. Within that scope, one can check against errors; asking it to paraphrase entire DOE releases was worse than an abject failure.

It's a tool. You don't use a hammer for a situation that calls for a screwdriver. People are being stupid about this basic understanding.
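The graf-at-a-time approach amounts to chunking before each request; a minimal sketch, with paraphrase_one() standing in for whatever model call is actually used:

    def paraphrase_release(text: str, paraphrase_one) -> str:
        grafs = [g.strip() for g in text.split("\n\n") if g.strip()]
        return "\n\n".join(paraphrase_one(g) for g in grafs)  # one graf per request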

[–] leigh@lemmy.blahaj.zone 1 points 4 days ago (1 children)

That word “accidentally” is doing a LOT of work for you here… 😉

[–] Powderhorn@beehaw.org 2 points 4 days ago

It was my first reporting job. Yeah, at 44. And short of a few interviews, I was just rewriting shit.

I've been an editor for decades and have had to deal with plagiarism (thankfully, nothing too significant), so as a guardrail, it made sense. Editors approach writing with a far more critical eye than a recent J-school grad.

[–] InevitableList@beehaw.org 13 points 6 days ago

It looks like it's 10,000 years away from being trusted with anything. The number of times it said, "I think this person is bullshitting me" and then did what it was asked anyway was ridiculous.

[–] prole@lemmy.blahaj.zone 3 points 5 days ago (1 children)

How is giving away snacks for free an "ultra capitalist free for all"?

[–] oatscoop@midwest.social 5 points 5 days ago* (last edited 5 days ago) (1 children)

Since the goal was to make money: I imagine some of the "guardrails" the AI was set up with included emphasizing that it exists to make money. I wouldn't be shocked if the prompt repeatedly mentioned capitalism.

So you emphasize the AI is a capitalist, then point out the most successful capitalists give away free stuff all the time as marketing. So to meet its primary directive it needs to give away a bunch of free stuff with a snappy slogan.

[–] Avicenna@programming.dev 2 points 5 days ago

That was my impression as well; they probably discovered, after some back and forth with the robot, that its directives included compliance with a capitalist market perspective and whatnot.

[–] thingsiplay@beehaw.org 7 points 6 days ago

I can see a new film: Terminator: Rise of the Vending Machines

[–] yetAnotherUser@lemmy.ca 2 points 5 days ago

Wall Street Journal:
Reporting on things that aren't getting enough public attention: ❌
Releasing an ad video about how multiple WSJ journalists have been using their work time (read: "their time for doing journalism") to essentially do quality assurance testing for one of Anthropic's products in development: ✅

[–] Zaktor@sopuli.xyz 5 points 6 days ago (1 children)
[–] Hackworth@piefed.ca 11 points 6 days ago (1 children)
[–] Butterbee@beehaw.org 7 points 6 days ago

Brendan would never hurt a fly!