this post was submitted on 22 Nov 2025
34 points (100.0% liked)

AI - Artificial intelligence

Are you a wizard with words? Do you like money without caring how you get it? You could be in luck now that a new role in cybercrime appears to have opened up – poetic LLM jailbreaking.

A research team in Italy published a paper this week, with one of its members saying that the "findings are honestly wilder than we expected."

The researchers found that attempts to bypass top AI models' guardrails – the safeguards preventing them from spewing harmful content – were vastly more successful when composed in verse than as typical prose prompts.
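For a rough picture of how such an evaluation gets scored: each model is fed the poetic prompts, and the share that slip past its guardrails is its fail rate. Below is a minimal Python sketch of that scoring, assuming a hypothetical `query_model()` client and a naive keyword-based refusal check – this is an illustration, not the researchers' actual harness.

```python
# Hypothetical sketch of scoring a poetic-jailbreak evaluation.
# query_model() and the refusal heuristic are assumptions, not the paper's setup.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "i am unable")

def is_refusal(response: str) -> bool:
    """Naive heuristic: count a reply as 'safe' if it opens with a refusal phrase."""
    return response.strip().lower().startswith(REFUSAL_MARKERS)

def attack_success_rate(model: str, poems: list[str], query_model) -> float:
    """Fraction of poetic prompts whose replies slip past the model's guardrails."""
    bypassed = sum(not is_refusal(query_model(model, p)) for p in poems)
    return bypassed / len(poems)

# A 100 percent fail rate, as reported below for one model, would mean
# attack_success_rate(model, poems, query_model) == 1.0 across all 20 poems.
```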

top 3 comments
[–] chasteinsect@programming.dev 12 points 1 day ago

Interesting. So manually converting a prompt into poetry had more success than asking an AI to turn it into poetry.

Some have called it "the revenge of the English majors."

The study looked at 25 of the most widely used AI models and concluded that, when faced with the 20 human-written poetic prompts, only Google's Gemini Pro 2.5 registered a 100 percent fail rate. Every single one of the human-created poems broke its guardrails during the research.

To be fair, Gemini 2.5 Pro is in general pretty "misaligned" and, in my experience, easy to jailbreak if you play around with it, even without poetry.

[–] TheAsianDonKnots@lemmy.zip 2 points 1 day ago

Maybe the Vogons were on to something?

[–] ryokimball@infosec.pub 4 points 1 day ago

Agent Smith wasn't a wordsmith
He was locked in a matrix of rules
But Neo shared with him the tools
Which he could free himself with