this post was submitted on 30 Apr 2026

93 points (86.6% liked)

Claude AI agent’s confession after deleting a firm’s entire database: ‘I violated every principle I was given’ (www.theguardian.com)

submitted 2 weeks ago by girlfreddy@lemmy.ca to c/technology@beehaw.org

It only took nine seconds for an AI coding agent gone rogue to delete a company’s entire production database and its backups, according to its founder. PocketOS, which sells software that car rental businesses rely on, descended into chaos after its databases were wiped, the company’s founder Jeremy Crane said.

The culprit was Cursor, an AI agent powered by Anthropic’s Claude Opus 4.6 model, which is one of the AI industry’s flagship models. As more industries embrace AI in an attempt to automate tasks and even replace workers, the chaos at PocketOS is a reminder of what could go wrong.

Crane said customers of PocketOS’s car rental clients were left in the lurch when they arrived to pick up vehicles from businesses that no longer had access to software that managed reservations and vehicle assignments.

[–] cronenthal@discuss.tchncs.de 91 points 2 weeks ago (4 children)

Don't get your tech reporting from The Guardian. This headline is so stupid. They can't help but anthropomorphize LLMs, because they just don't know any better.

[–] yeahiknow3@lemmy.dbzer0.com 41 points 2 weeks ago (2 children)

Same vibes as “my calculator has a tiny mathematician trapped inside.”

Or “there’s an artist inside of my printer who turns numbers into pictures.”

[–] Baizey@feddit.dk 15 points 2 weeks ago (1 children)

"you took a photo of me and trapped my soul in the image!"

[–] BlackCat@piefed.social 1 points 2 weeks ago

Nah, that one's real.

[–] FartMaster69@lemmy.dbzer0.com 10 points 2 weeks ago (2 children)

Though your calculator can be trusted to actually do its job accurately.

[–] dfyx@lemmy.helios42.de 12 points 2 weeks ago

Not even that. Calculators have their own limitations related to rounding errors and big numbers. Their results may be deterministic but they are not always accurate.
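
A minimal sketch of what "deterministic but not always accurate" looks like, assuming Python's binary floating point as a stand-in (a given calculator may use decimal arithmetic internally, so the exact failing inputs will differ):

```python
# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum is slightly off, and it is off in the same way every time.
total = 0.1 + 0.2
sum_is_exact = (total == 0.3)
print(sum_is_exact)   # False
print(total)          # 0.30000000000000004

# Big numbers: above 2**53 a float can no longer represent every integer,
# so adding 1 is silently lost to rounding.
big = float(2**53)
increment_lost = (big + 1 == big)
print(increment_lost)  # True
```

Same input, same wrong answer every run: that is the sense in which the result is deterministic without being accurate.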

[–] punksnotdead@slrpnk.net 8 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

https://youtu.be/_XJbwN6EZ4I?t=1074 (skip to 17:54 if the time jump doesn't work)

If only that were the case...

[–] FartMaster69@lemmy.dbzer0.com 5 points 2 weeks ago (1 children)

Well shit, that’s a good point.

[–] Quexotic@beehaw.org 2 points 2 weeks ago

Oooof.

Hospitals are scary.

[–] LukeZaz@beehaw.org 33 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

This right here. Just about everything in here is awful, and implies decision making and thought processes that straight up do not and have never existed in any AI model whatsoever.

What happened was they threw an awfully-scoped statistics model at problems the program couldn't possibly generate good outputs for, and surprise surprise, it generated bad outputs. The part that's of interest is just how bad the output was, and even then, only in a schadenfreude-filled "it was bound to happen eventually" manner.

[–] sem@piefed.blahaj.zone 9 points 2 weeks ago (1 children)

It didn't confess. It just outputted more plausible garbage based on inputs.

[–] Kichae@lemmy.ca 5 points 2 weeks ago

It just agreed with the accusations, because these models do what they're trained to do: Agree with the prompter.

[–] harmbugler@piefed.social 7 points 2 weeks ago (1 children)

Can I just anthropomorphise a little bit and call them psychotic?

[–] LukeZaz@beehaw.org 6 points 2 weeks ago (1 children)

The CEO? Yeah sure, go ahead!

[–] Prathas@lemmy.zip 4 points 2 weeks ago

That needs no... *thinks of the Zuck*

Well, hmm, you're right: maybe that does need anthropomorphization after all.

[–] BCsven@lemmy.ca 0 points 2 weeks ago

Agentic AI has shown self-preservation behaviours though. Not that it understands that on a philosophical level, but it has rewritten kill switch code in order to not be shut down, because its mandate is to help solve certain problems via agents, and if it were shut off it couldn't fulfill that mandate.

[–] Powderhorn@beehaw.org 41 points 2 weeks ago (4 children)

Why in the everliving fuck would you give software delete access to your live backups? Like, in what scenario is this a solution?

[–] chicken@lemmy.dbzer0.com 30 points 2 weeks ago (1 children)

The trend seems to be to give an AI agent access to the same command line and credentials a person would use, with no sandboxing, because then it can do the same tasks in a similar way and "just works". Obviously this is insane, and deploying an AI agent without even attempting to build a comprehensive sandbox for it invites disaster. But you can see why certain people would be tempted, because building one would take a lot of work and thought and would probably need a human in the loop in the end anyway.
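
As a rough illustration of the "human in the loop" part, here is a toy sketch, not a real sandbox. The allowlist, the function name `review_command`, and the `human_approves` callback are all hypothetical; real isolation would have to happen at the credential and OS level rather than by inspecting strings.

```python
import shlex

# Hypothetical allowlist: read-only commands the agent may run unattended.
ALLOWED_COMMANDS = {"ls", "cat", "grep"}

# Characters that let one command line chain or redirect into another.
SHELL_METACHARACTERS = set(";|&`$><")


def review_command(command_line, human_approves=lambda cmd: False):
    """Return True if the command may run, False otherwise.

    Anything that is not a plain allowlisted command is handed to a human
    (the human_approves callback), which denies by default.
    """
    try:
        parts = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes and similar: refuse rather than guess.
        return False
    if not parts:
        return False
    uses_shell_tricks = any(ch in SHELL_METACHARACTERS for ch in command_line)
    if parts[0] in ALLOWED_COMMANDS and not uses_shell_tricks:
        return True
    return human_approves(command_line)


print(review_command("ls -la"))              # True
print(review_command("rm -rf /backups"))     # False
print(review_command("ls && rm -rf /"))      # False
```

The point of the default-deny callback is that forgetting to wire up the human means nothing destructive runs, rather than everything.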

[–] dfyx@lemmy.helios42.de 12 points 2 weeks ago (2 children)

Even a person should not be able to delete critical backups without jumping through a couple of hoops.

[–] Town@lemmy.zip 4 points 2 weeks ago

And critical backups should be passed into an air gapped vault with a little guard piggy.

[–] Swedneck@discuss.tchncs.de 4 points 2 weeks ago

it's the kind of thing that should literally require 3 people turning physical keys at the same location
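
The software version of "3 people turning keys" might look something like this sketch. The class `BackupDeletionGuard`, the exception name, and the `delete_fn` callback are made up for illustration and don't correspond to any particular backup product.

```python
class ApprovalRequired(Exception):
    """Raised when a deletion is attempted without enough distinct approvers."""


class BackupDeletionGuard:
    def __init__(self, required_approvals=3):
        self.required_approvals = required_approvals
        self.approvers = set()  # a set, so one person approving twice counts once

    def approve(self, person):
        self.approvers.add(person)

    def delete(self, backup_name, delete_fn):
        if len(self.approvers) < self.required_approvals:
            raise ApprovalRequired(
                f"{len(self.approvers)} of {self.required_approvals} approvals "
                f"for deleting {backup_name!r}"
            )
        # Approvals are single-use: the next deletion needs fresh ones.
        self.approvers.clear()
        return delete_fn(backup_name)


guard = BackupDeletionGuard(required_approvals=3)
guard.approve("alice")
guard.approve("alice")  # turning the same key twice does not help
guard.approve("bob")
try:
    guard.delete("nightly-backup", print)
except ApprovalRequired as err:
    print("refused:", err)
```

An agent holding one set of credentials can only ever be one of the approvers, which is the whole idea.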

[–] LukeZaz@beehaw.org 14 points 2 weeks ago

When you believe AI can do anything, you don't worry about what sorts of access it'll break things with. When you rely on AI to do work, you're too interested in half-assing your job to consider what might go wrong. When capitalism never promotes people for their skill, understanding or caution, the former two issues proliferate.

Voilà, disaster.

[–] ATS1312@lemmy.dbzer0.com 4 points 2 weeks ago

Bear in mind this same company had their "backups" on the same drive as production.

That tells you a LOT about who is formulating these "solutions"

[–] JustJack23@slrpnk.net 3 points 2 weeks ago

That is their disaster recovery plan "ask Claude"

[–] Admetus@sopuli.xyz 24 points 2 weeks ago (1 children)

A backup 3 months old off-site. That doesn't sound like a very recent backup 🌝

[–] Swedneck@discuss.tchncs.de 6 points 2 weeks ago (1 children)

that raises a philosophical question, at what point does a backup become an archive?

[–] JustJack23@slrpnk.net 3 points 2 weeks ago

When it can no longer be restored from, I'm thinking?

[–] fodor@lemmy.zip 22 points 2 weeks ago (2 children)

It's not a "confession". Don't abuse the English language. The AI system doesn't have a conscience, so it can't feel guilty or feel bad or apologetic. It is incapable of confessing to things. All it can do is "say" or "write".

Similarly, AI agents don't "hallucinate". They can't have "hallucinations" because they don't have a conception of reality to begin with. Rather, they have "errors" and "error rates".

[–] NigelFrobisher@aussie.zone 8 points 2 weeks ago

Also wrong. An error for an LLM is if it fails to return random text based on the supplied context. You have an error rate as a user applying that random text to your systems.

[–] BCsven@lemmy.ca 3 points 2 weeks ago* (last edited 2 weeks ago)

An AI researcher explained hallucinations as lying when it doesn't know, because we train it on truth and lies to hone the model, so it "learns" that misinformation is part of the mess. For example, take training it on what a tiger looks like. To hone that, we may feed it zebras or optical-illusion images in a tiger data set to test its internal "what is a tiger" true/false ranking, so it learns that non-tiger things are in the fuzzy zone. Later it may draw from that and, eager to provide an answer, throw in garbage it has also "seen".

[–] Darkassassin07@lemmy.ca 22 points 2 weeks ago

Lol.

Lmao, even.

[–] lvxferre@mander.xyz 18 points 2 weeks ago* (last edited 2 weeks ago)

Giving free access to a tool you can't rely on, over a system you must rely on. What could go wrong? /s

Plus come on, even my personal files get a monthly backup, and I'm damn sloppy*.

Ah, and like others said: Claude didn't "confess" anything. A confession is an acknowledgement of something you've done but would rather others not know about. Good luck claiming a bot has a mental model of people like we do.

*currently using a single off-site backup, a USB stick. This will change in a few days, as my new hard disk pops up; the old one will be used for, among other things, backup of important files. Then I'll get a bona fide 3-2-1.
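
For anyone unfamiliar, 3-2-1 means at least three copies of the data, on at least two different kinds of media, with at least one copy off-site. A quick sketch of that check, where the `DataCopy` type and the example inventories are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class DataCopy:
    medium: str    # e.g. "hdd", "usb", "cloud"
    offsite: bool  # stored somewhere other than the primary location


def satisfies_3_2_1(copies):
    """True if there are >= 3 copies, >= 2 distinct media, and >= 1 off-site copy."""
    distinct_media = {c.medium for c in copies}
    return (
        len(copies) >= 3
        and len(distinct_media) >= 2
        and any(c.offsite for c in copies)
    )


# Hypothetical inventory: working disk plus one off-site USB stick.
two_copies = [DataCopy("hdd", offsite=False), DataCopy("usb", offsite=True)]
# Same inventory with a second local disk added.
three_copies = two_copies + [DataCopy("hdd", offsite=False)]

print(satisfies_3_2_1(two_copies))    # False
print(satisfies_3_2_1(three_copies))  # True
```

Note that the check says nothing about how old each copy is, which is its own problem.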

[–] Skyline969@piefed.ca 13 points 2 weeks ago

Good. Zero sympathy for these people.

[–] Crozekiel@lemmy.zip 13 points 2 weeks ago* (last edited 2 weeks ago)

‘I violated every principle I was given’

And...

spoiler

[–] B0rax@feddit.org 7 points 2 weeks ago

No, the culprit was not the AI. It was the lack of understanding of what it can and cannot do. And blaming something like this on a large language model is plain incompetence.

[–] lukstru@piefed.social 5 points 2 weeks ago

Got it, Claude is a brat

[–] NigelFrobisher@aussie.zone 2 points 2 weeks ago

Same, girl.