this post was submitted on 03 Dec 2025
751 points (98.0% liked)

[–] asudox@lemmy.asudox.dev 12 points 2 weeks ago (1 children)

“Did I ever give you permission to delete all the files in my D drive?” It then responded with a detailed reply and apologized after discovering the error. The AI said, “No, you did not give me permission to do that. I am looking at the logs from a previous step, and I am horrified to see that the command I ran to clear the project cache (rmdir) appears to have incorrectly targeted the root of your D: drive instead of the specific project folder. I am deeply, deeply sorry. This is a critical failure on my part.”

At least it was deeply, deeply sorry.
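For illustration, the failure described in the quote (a cleanup command aimed at the drive root rather than the project folder) is the sort of thing a path-containment check is meant to catch before anything is deleted. This is only a minimal sketch, not how the IDE in question works; the function name `is_safe_to_delete` and its parameters are hypothetical.

```python
from pathlib import Path


def is_safe_to_delete(target: str, project_root: str) -> bool:
    """Return True only if `target` lies strictly inside `project_root`.

    Hypothetical helper for illustration; it deletes nothing itself.
    """
    root = Path(project_root).resolve()
    path = Path(target).resolve()
    # Reject the project root itself and anything that is not beneath it,
    # which includes parent directories and the drive or filesystem root.
    return path != root and root in path.parents
```

A caller would run this check first and refuse to go any further when it returns False, no matter what the agent believes the path is.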

[–] Wispy2891@lemmy.world 10 points 2 weeks ago (1 children)

I have no experience with this IDE, but I see in the log posted on Reddit that the LLM is talking about a "step 620". So this is hundreds of queries away from the initial one? The context must have been massive; usually after that many consecutive queries they start hallucinating heavily.

[–] BlameTheAntifa@lemmy.world 9 points 2 weeks ago

ISE.

Integrated Slop Environment.

[–] jordanlund@lemmy.world 9 points 2 weeks ago (1 children)

Why would you ask AI to delete ANYTHING? That's a pretty high level of trust...

[–] gnuplusmatt@reddthat.com 6 points 2 weeks ago (1 children)

why the hell aren't people running this shit in isolated containers?

[–] utopiah@lemmy.world 8 points 2 weeks ago

Because the people who run this shit are precisely the ones who don't know what containers, scope, permissions, etc. are. That's exactly the audience.

[–] redwattlebird@lemmings.world 6 points 2 weeks ago

They gave root permission and proceeded to get rooted in return.

Does that phrase work?

[–] TheProtagonist@lemmy.world 5 points 2 weeks ago (1 children)

No one ever claimed that "artificial intelligence" would actually be intelligent.

[–] nutsack@lemmy.dbzer0.com 5 points 2 weeks ago

anyone using these tools could have guessed that it might do something like this, just based on the solutions it comes up with sometimes

[–] 6nk06@sh.itjust.works 5 points 2 weeks ago

without permission

That's what she said. Enjoy your agent thing.

[–] ragepaw@lemmy.ca 5 points 2 weeks ago

So many things wrong with this.

I am not a programmer by trade, and even though I learned programming in school, it's not a thing I want to spend a lot of time doing, so I do use AI when I need to generate code.

But I have a few HARD rules.

  1. I execute all code and commands. Nothing gets to run on my system without me.

  2. Anything that can be even remotely destructive must be flagged and not even shown to me until I agree to the risk.

  3. All information and commands must be verifiable by sourcing documentary links, or providing context links that I can peruse. If documentary evidence is not available, it must provide a rationale why I should execute what it generates.

  4. Every command must be accompanied by a description of what the command will do, what each flag means, and what the expected outcome is.

  5. I am the final authority on all matters. It is allowed to make suggestions, but never changes without my approval.

Without these constraints, I won't trust it. Even then, I read all of the code it generates and verify it myself, so in the end, if it blows something up, I bear sole responsibility.
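As a rough illustration of rules 1, 2 and 4 above, an approval gate could look something like the sketch below. It is an assumption-laden example, not a feature of any real tool: `DESTRUCTIVE`, `is_destructive`, `review` and the `approve` callback are all hypothetical names, and the command list is deliberately incomplete. The gate never executes anything; it only decides whether the human has signed off.

```python
# Hypothetical, deliberately incomplete set of command names treated as destructive.
DESTRUCTIVE = {"rm", "rmdir", "del", "format", "mkfs", "dd"}


def is_destructive(command: str) -> bool:
    """Flag a command whose first word is in the DESTRUCTIVE set."""
    tokens = command.split()
    return bool(tokens) and tokens[0].lower() in DESTRUCTIVE


def review(command: str, description: str, approve) -> bool:
    """Return True only if the human approves; this function runs nothing.

    `approve` is any callable that takes a prompt string and returns a bool.
    """
    if is_destructive(command):
        # Rule 2: ask about the risk before the command is even shown.
        if not approve("A potentially destructive command was proposed. Accept the risk and view it?"):
            return False
    # Rule 4: show the description alongside the command.
    # Rule 1: the human runs it, not the tool.
    return approve(f"{description}\n  $ {command}\nRun this yourself?")
```

In interactive use, `approve` could be something like `lambda p: input(p + " [y/N] ").strip().lower() == "y"`, so the default answer is always no.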

[–] Constellation@lemmy.world 4 points 2 weeks ago* (last edited 2 weeks ago)

I really, really don't understand how this could happen, or why anyone would even want to let the agent perform actions without approval. Even in my previous work as a senior software developer, I never pushed any changes or ran any command on non-disposable hardware without having someone else double-check it. Why would you want to disable that?
