this post was submitted on 11 Nov 2025
136 points (98.6% liked)
Tech
The answer is simpler than you could ever conceive. Companies piloted by incompetent, selfish pricks are just scraping the entire internet in order to grab every niblet of data they can. Writing code to do what they're doing in a less destructive fashion would require effort that they are entirely unwilling to put in. If that weren't the case, the overwhelming majority of scrapers wouldn't ignore robots.txt files. I hate AI companies so fucking much.
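For what it's worth, here is roughly how little effort the "less destructive" version takes. This is only a minimal sketch using Python's standard `urllib.robotparser`; the bot name `ExampleBot`, the `example.com` URLs, the sample rules, and the helper names `build_parser`, `may_fetch`, and `polite_delay` are all made up for illustration, and a real crawler would download the site's actual robots.txt instead of parsing a hardcoded string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, hardcoded so the sketch runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


def polite_delay(parser: RobotFileParser, user_agent: str, default: float = 1.0) -> float:
    """Use the site's Crawl-delay if it declares one, else an assumed default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS_TXT)
    for url in ("https://example.com/public/page", "https://example.com/private/data"):
        verdict = "allowed" if may_fetch(rules, "ExampleBot", url) else "disallowed"
        print(url, verdict)
    print("seconds to wait between requests:", polite_delay(rules, "ExampleBot"))
```

A crawler would call `may_fetch` before every request and sleep for `polite_delay` seconds between requests to the same host. That's the whole "effort" in question.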
"robots.txt files? You mean those things we use as part of the site index when scraping it?"
— AI companies, probably