this post was submitted on 03 Nov 2025
Yes, this is in fact a good argument for not banning AI.
It's not an argument for not holding companies legally accountable for using copyrighted material to do it.
These suits aren't actually equivalent to Sony v. Universal City Studios, they're equivalent to someone suing a bootleg video company.
If you believe AI companies should NOT be allowed to train AI with copyrighted works you should stop using Internet search engines. Because the same rules that allow Google to train their search with everyone's copyrighted websites are what allow the AI companies to train their models.
Every day, Google and others download huge swaths of the Internet directly into their servers and nobody bats an eye. An AI company does the same thing and now people say that's copyright infringement.
What the fuck! I don't get it. It's the exact same thing. Why is an AI company doing that any different‽
It'd be one thing if people were bitching about just the output of AI models but they're not. They're bitching about the ingress step!
The day we ban ingress of copyrighted works into whatever TF people want is the day the Internet stops working.
My comment right here is copyrighted. So is yours! I didn't ask your permission before my Lemmy client downloaded it. I don't need to ask your permission to use your comment however TF I want until I distribute it. That's how the law works. That's how it's always worked.
The DMCA also protects the sites that host Lemmy instances from copyright lawsuits. Because without that, they'd be guilty of distribution of copyrighted works without the owner's permission every damned day.
People who hate AI are supporting an argument that the movie and music studios made in the 90s: That "downloading is theft." It is not! In fact, because that is not theft, we're all able to enjoy the Internet every day.
Ever since the Berne Convention, literally everything is copyrighted. Everything.
Sorry, but no. For years, most search bots have been quite reasonable about following instructions from sites on what to scrape and what not to. AI scrapers have shown they are willing to go to great lengths to scrape content against the wishes of website owners.
Additionally, this scraping has been shown to cause a tremendous amount of problems for some sites and platforms, open source projects for example.
Last, search engines are a win-win. They get to show ads and then redirect traffic to the source. LLMs for the most part steal that traffic, by regurgitating the same content they stole in the first place.
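For anyone wondering what "following instructions from the sites" looks like in practice, it usually means checking robots.txt before fetching a page. Here's a rough sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the bot names (`ExampleAIBot`, `ExampleSearchBot`) and the example.org URLs are all made up for illustration, and this is only what a well-behaved crawler might do, not how any particular company's crawler works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site blocks one (made-up) AI crawler entirely
# and asks every other bot to stay out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the site's rules permit this user agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A polite crawler runs this check before every request and skips the URL
# when the answer is False.
search_ok = may_fetch(parser, "ExampleSearchBot", "https://example.org/docs/page.html")
search_private = may_fetch(parser, "ExampleSearchBot", "https://example.org/private/notes.html")
ai_ok = may_fetch(parser, "ExampleAIBot", "https://example.org/docs/page.html")
```

Note that nothing enforces this. The check only works if the crawler chooses to run it and to respect the answer, which is exactly the complaint above.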
There's no legal distinction between what a search engine scraper does and what an AI scraper does. They're literally the same exact fucking thing.
Google scrapes a site and puts it in their database.
An AI company scrapes a site and puts it in their database.
You're trying to make a distinction based on what happens after the data has been collected and completely ignoring the fact that search engine scrapers and AI scrapers are performing the same activity.
In fact it's everyone's right to scrape the Internet! Go, scrape it! Whether it's Google or AI companies or Joe Schmoe. Scraping is legal and a perfectly normal activity. The fact that some AI scrapers are fucking up has no bearing on whether the activity of scraping is bad or good.
We learned this lesson in the 90s: If you don't want someone scraping your stuff don't put it on the fucking Internet!