this post was submitted on 16 Apr 2026
53 points (100.0% liked)


cross-posted from: https://lemmy.ca/post/63445187

[–] DevDave@piefed.social 3 points 1 day ago (2 children)

A bit more than a year ago this (https://www.understandingai.org/p/metas-llama-31-can-recall-42-percent) was quietly discussed under the question: "If an LLM was trained on stolen source code from, say, Microsoft, and later regurgitates an exact part of that code, what is the legal status of that output?" Where exactly is the threshold at which it becomes theft by proxy? Is it a large C struct with matching names and typed properties? Or does it need to be an entire header/definition file pair, like C++'s .h and .cpp?
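On the threshold question: one crude way to quantify "an exact part of that code" is verbatim token n-gram overlap between a model's output and a known source. This is only an illustrative sketch, not any legal or forensic standard; the regex tokenizer, the 8-token window, and the example snippets are all arbitrary assumptions on my part.

```python
import re

def tokens(code: str) -> list[str]:
    # Crude tokenizer: identifiers, numbers, and single punctuation characters.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)

def ngram_overlap(output: str, source: str, n: int = 8) -> float:
    """Fraction of n-token windows in `output` that appear verbatim in `source`.

    The window size n is an arbitrary knob: small n flags common idioms,
    large n only flags long literal copies -- which is exactly the
    'where is the threshold' problem.
    """
    out_toks, src_toks = tokens(output), tokens(source)
    if len(out_toks) < n:
        return 0.0
    src_grams = {tuple(src_toks[i:i + n]) for i in range(len(src_toks) - n + 1)}
    windows = [tuple(out_toks[i:i + n]) for i in range(len(out_toks) - n + 1)]
    return sum(1 for w in windows if w in src_grams) / len(windows)

# Hypothetical examples, not real leaked code:
original      = "struct Point { int x; int y; };"
regurgitated  = "struct Point { int x; int y; };"
paraphrased   = "typedef struct { float a; float b; } Vec2;"

print(ngram_overlap(regurgitated, original))  # 1.0 for an exact copy
print(ngram_overlap(paraphrased, original))   # 0.0, no shared 8-token run
```

Real memorization studies use far more robust matching (normalization, suffix arrays over huge corpora), but even this toy version shows the core difficulty: the answer depends entirely on where you set n.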

On a more humorous note, a lot of people are half-seriously talking about decompiling NVIDIA's drivers, running the result through more than a couple of LLMs to clean it up, and then, once it compiles and works, publishing it as open source. Most just want to see what the green machine and leather-jacket man will do.

[–] Funwayguy@lemmy.world 5 points 1 day ago (1 children)

I've always interpreted the whole LLM 'training' process as unregulated data laundering at industrial scale. Companies steal content from across the world, trade it in backroom deals, and wash it in their data centers before regurgitating it to the masses.

As long as the exact portions of stolen data used are considered 'trade secrets' under NDA, the copyrights are conveniently too scrambled to prove or enforce. Even if the LLM spat something out near verbatim, they will always argue it came to the same conclusion on its own, because you could never prove otherwise without seeing their sources.

This is why, independent of their city-sized power demands and rampant spread of misinformation, there will never be an ethical place for LLMs: it's inherent to their design.

[–] vapeloki@lemmy.world 3 points 1 day ago (1 children)

I think the copyright question is far simpler. Copyright requires creative work.

LLMs are not creative; they are autocomplete on steroids.

[–] Funwayguy@lemmy.world 3 points 21 hours ago

As much as I wholeheartedly agree that human creativity should always be protected and that LLMs don't meet that standard, the legal enforcement of those copyrights against a black box of slop is nigh impossible without regulatory transparency into how it was built, which AI companies will lobby against until the end of time. It would be like finding your needle in a trillion haystacks of stolen needles you're not even allowed to see.

Fix the data laundering and the copyright issue fixes itself.

[–] vapeloki@lemmy.world 2 points 1 day ago

As much as I understand the intent, license laundering is already a real thing, and it mostly hurts open source.

We, as an open-source-loving community, should not fall into this trap.