this post was submitted on 18 Nov 2025
Fediverse memes
Memes about the Fediverse.
And yet, federation means that each instance should know all the other domain names, yes? So do daily DNS lookups of every domain associated with federation and auto-whitelist the IP addresses they resolve to.
Sure, if you then have to configure Cloudflare with these IPs, it’ll require an API to do so automatically.
But otherwise if you are running some sort of throttling protection on the actual box or VM the instance is sitting on, it should be rather trivial to update it directly, especially if said throttling software is doing Linux correctly and drawing its whitelist from a flat file.
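A minimal sketch of that daily lookup, assuming the list of federated domains is available from wherever the instance tracks its peers. The function names and the one-IP-per-line file format are hypothetical, not something Lemmy or any particular throttling tool defines:

```python
import socket
from pathlib import Path


def resolve_ips(domain, resolver=socket.getaddrinfo):
    """Return the sorted IP addresses a domain currently resolves to."""
    try:
        infos = resolver(domain, 443, proto=socket.IPPROTO_TCP)
    except OSError:
        # Domains that fail to resolve are skipped rather than aborting the run.
        return []
    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the IP.
    return sorted({info[4][0] for info in infos})


def build_allowlist(domains, resolver=socket.getaddrinfo):
    """Resolve every federated domain and collect the unique IPs."""
    ips = set()
    for domain in domains:
        ips.update(resolve_ips(domain, resolver))
    return sorted(ips)


def write_allowlist(ips, path):
    """Write one IP per line (assumed flat-file format), swapping the file in place."""
    target = Path(path)
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text("".join(ip + "\n" for ip in ips))
    tmp.replace(target)
```

Run once a day from cron or a systemd timer, this would regenerate the flat file for the throttling software to reload. How the reload is triggered depends on the tool and is not shown here.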
New instances (and not just Lemmy instances, but Mastodon and other fediverse instances) are coming online all the time, so you need a way to let them through to start the federation process. There are thousands, so it needs to be automatic. You can't require a new instance to send whitelisting requests to every server one of their users might want to interact with (instances aren't linked unless a local user subscribes to something on a remote instance).
Given the AI bots seem to just be indiscriminately scraping web pages, I excluded API endpoints from blocking anyway. Another admin showed me a nice Cloudflare rule to do this. Media can still be a problem, though: it's loaded by individual users on other instances, so it's hard to block scrapers without blocking users. That's another way Cloudflare helps (static media files are easily cached by their CDN).
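The logic of such a rule, as opposed to its actual Cloudflare syntax, might look roughly like the sketch below. The path prefixes are assumptions for illustration and would need checking against what a given instance really serves, and `looks_suspicious` stands in for whatever bot detection is already in place:

```python
# Hypothetical path prefixes treated as machine-to-machine traffic.
# These are illustrative assumptions, not an authoritative list.
EXEMPT_PREFIXES = ("/api/", "/inbox", "/.well-known/", "/nodeinfo/")


def should_challenge(path, looks_suspicious):
    """Challenge only suspicious requests that target the web app."""
    # Federation and API traffic is never challenged, so other instances
    # can keep talking to this one even if they trip the bot heuristics.
    if path.startswith(EXEMPT_PREFIXES) or path.endswith("/inbox"):
        return False
    return looks_suspicious
```

For example, `should_challenge("/api/v3/post/list", True)` returns `False`, while `should_challenge("/post/123", True)` returns `True`.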
Isn’t this done via an API endpoint built explicitly for that purpose, one that bots would not normally use?
And why not have a process by which admins from a new instance poke the admins of another instance - any other instance, so long as it’s already a part of the network - to do an initial manual whitelist that could cascade through the entire system?
Then there should be ways that the software itself can auth with other instances of itself, via a common encryption protocol. This would only work between instances of the same software, but the key point is that only a toehold is needed to start propagating.
The point being, there are options. Some of them quite simple.
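To make the cascade idea concrete, here is a toy model, assuming each instance keeps a list of peers it already trusts and forwards a new whitelist entry to them. The `peers` mapping and the function name are hypothetical; nothing like this exists in Lemmy or ActivityPub as far as this thread establishes:

```python
from collections import deque


def cascade_allowlist(peers, sponsor, newcomer):
    """Return the set of instances that end up allowlisting `newcomer`.

    `peers` maps each instance to the instances it already trusts.
    The sponsor is the one instance whose admins whitelisted manually.
    """
    reached = {sponsor}
    queue = deque([sponsor])
    while queue:
        current = queue.popleft()
        # Each instance forwards the entry to peers that have not seen it yet.
        for neighbour in peers.get(current, ()):
            if neighbour not in reached and neighbour != newcomer:
                reached.add(neighbour)
                queue.append(neighbour)
    return reached


peers = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["a.example"],
    "c.example": ["d.example"],
    "e.example": [],
}
reached = cascade_allowlist(peers, "a.example", "new.example")
```

In this example the entry reaches `a.example` through `d.example`, but `e.example` never hears about the newcomer because no trusted peer links to it.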
Realistically, federation is not the main concern. You can leave all your API endpoints open to bots and not have a problem because they are loading the web app. Just block the web app for suspicious traffic.
ActivityPub already uses authentication to some extent with other instances; it's the first contact where you have to have trust.
My main concern is still that media is loaded directly by users in most cases. The APIs are not a problem right now, as the bots aren't specifically targeting Lemmy. There are ways to address this, but Lemmy (and other threadiverse services) don't have full-time dev teams; they work on what they can or want to work on given the very low hourly rate.