this post was submitted on 29 Dec 2025

Programmer Humor

[–] yannic@lemmy.ca 9 points 13 hours ago (1 children)

Everyone here so far has forgotten that in simulations, the model has blackmailed the person responsible for shutting it off and has even gone so far as to cancel active alerts to prevent an executive lying unconscious in the server room from receiving life-saving care.

[–] AwesomeLowlander@sh.itjust.works 14 points 12 hours ago* (last edited 12 hours ago) (1 children)

The model 'blackmailed' the person because they provided it with a prompt asking it to pretend to blackmail them. Gee, I wonder what they expected.

Haven't heard the one about cancelling active alerts, but I doubt it's any less bullshit. Got a source for it?

Edit: Here's a deep dive into why those claims are BS: https://www.aipanic.news/p/ai-blackmail-fact-checking-a-misleading

[–] yannic@lemmy.ca 3 points 12 hours ago (1 children)

I provided enough information that the relevant source shows up in a search, but here you go:

In no situation did we explicitly instruct any models to blackmail or do any of the other harmful actions we observe. [Lynch, et al., "Agentic Misalignment: How LLMs Could be an Insider Threat", Anthropic Research, 2025]

[–] AwesomeLowlander@sh.itjust.works 11 points 12 hours ago

Yes, I also already edited my comment with a link going into the incidents and why they're absolute nonsense.