RAM's main advantage over HDDs/SSDs is fast access times.
If you had to fetch anything over the internet, it would be faster to just use local HDDs.
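For a sense of scale, here are rough order-of-magnitude latencies (ballpark assumptions only; real numbers vary a lot by hardware and network):

```python
# Order-of-magnitude access latencies (assumed ballpark figures, not benchmarks).
latencies_ns = {
    "DRAM access": 100,                 # ~100 ns
    "NVMe SSD random read": 100_000,    # ~100 us
    "HDD seek + read": 10_000_000,      # ~10 ms
    "internet round trip": 50_000_000,  # ~50 ms, same-continent RTT
}

for name, ns in latencies_ns.items():
    factor = ns / latencies_ns["DRAM access"]
    print(f"{name:>22}: {ns / 1e6:10.4f} ms  (~{factor:,.0f}x DRAM)")
```

A single internet round trip costs on the order of half a million DRAM accesses, which is why "RAM over the network" defeats the whole point of RAM.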
Theoretically, you could do whatever processing you need using the user's CPU and RAM and then send the result back over the Internet. Not saying that's what's happening, of course, but it's not completely ridiculous.
That’s what distributed computing is, after all. Like Folding@Home.
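A minimal sketch of that Folding@Home-style pattern, just to make the "use the user's CPU and RAM, send the result back" idea concrete. The server URL and work-unit format are invented placeholders, not any real service:

```python
# Hypothetical volunteer-computing client: fetch a work unit, crunch it locally,
# post the result back. Endpoint and payload shape are made up for illustration.
import json
import urllib.request

SERVER = "https://example.org/api"  # placeholder, not a real service

def fetch_work_unit() -> dict:
    with urllib.request.urlopen(f"{SERVER}/work") as resp:
        return json.load(resp)

def process(unit: dict) -> dict:
    # Stand-in for the expensive local computation (uses *your* CPU and RAM).
    numbers = unit["numbers"]
    return {"id": unit["id"], "sum_of_squares": sum(n * n for n in numbers)}

def submit(result: dict) -> None:
    req = urllib.request.Request(
        f"{SERVER}/result",
        data=json.dumps(result).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

if __name__ == "__main__":
    submit(process(fetch_work_unit()))
```

The key property is that only small work units and results cross the network; the heavy lifting stays on the volunteer's machine.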
Does GeForce Now support Chrome or Firefox?
But isn't this idea just kind of a reverse cloud? Since running AI is so expensive, they could "borrow" other people's RAM. Just an idea.
Sounds like you're looking for zebras when horses are a much simpler explanation.
Conceptually, what you're describing is feasible; there are lots of distributed computing projects that borrow compute/space/bandwidth for their own ends. But it's unlikely to have any practical use here.
If there were a distributed system that could be used as memory in a large virtual inferencing machine, it would be incredibly slow. The model would be stored across a large number of different computers, which would all have to coordinate. Each step of inferencing would be orders of magnitude slower, because the latency between two different computers is orders of magnitude higher than the latency between a GPU and its physical RAM.
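A back-of-the-envelope illustration, with every number assumed (a ~70B-parameter model in fp16, ~1 TB/s of local GPU memory bandwidth, and ~100 Mbit/s links to the remote machines holding the weights):

```python
# Back-of-the-envelope: reading a model's weights once per generated token.
# All figures are assumed round numbers, not measurements.
params = 70e9                    # assumed parameter count
bytes_per_param = 2              # fp16
weight_bytes = params * bytes_per_param   # ~140 GB

# Local case: weights stream from GPU memory at ~1 TB/s (assumed).
gpu_bandwidth = 1e12             # bytes/s
print(f"local GPU RAM: {weight_bytes / gpu_bandwidth:8.2f} s per token")

# Distributed-RAM case: weights live on remote machines behind ~100 Mbit/s links.
# This ignores round-trip latency and coordination overhead entirely.
net_bandwidth = 100e6 / 8        # bytes/s
print(f"over the net:  {weight_bytes / net_bandwidth / 3600:8.2f} h per token")
```

Even before adding round-trip latency and coordination overhead, the distributed-RAM case works out to tens of thousands of times slower per token under these assumptions.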
On the other end, even if we assume inferencing in reasonable time is feasible through some technique that isn't public... a model large enough to take advantage of an Internet's worth of memory doesn't exist.
So, assuming you had access to the largest model that we know of, you would have a model as complex as Claude Opus, but it would take hours or days to finish inferencing, and the quality would be about the same as what you can get in under a second for $20/mo.
And that's before considering a hypothetical "Internet-scale" model.
First, it would have to be trained, which would take an incredibly long time. Some of these frontier models take months to train on the fastest hardware available; a larger model would take even longer due to the increased latency.
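As a rough illustration, using the common ~6 × parameters × tokens approximation for training FLOPs (the model size, token budget, and cluster throughput below are all assumptions):

```python
# Approximate training compute: ~6 FLOPs per parameter per training token.
params = 1e12        # hypothetical 1-trillion-parameter model
tokens = 20e12       # assumed ~20 training tokens per parameter
flops = 6 * params * tokens               # ~1.2e26 FLOPs

# Assumed cluster: 10,000 accelerators at ~1 PFLOP/s each, 40% utilization.
cluster_flops = 10_000 * 1e15 * 0.4
seconds = flops / cluster_flops
print(f"~{seconds / 86400:.0f} days of continuous training")
```

That's roughly a year of wall-clock time on a very well-equipped, tightly coupled cluster; spreading the same job over random machines on the internet only makes it worse.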
More importantly, there are strong diminishing returns on capability vs model size. This is why the AI companies are focusing on agentic tasks, where the AI spends a lot of time talking to itself and using tools, rather than pushing for a model with more parameters. This is referred to as the "scaling wall" (though AI companies, for obvious reasons, deny that such a thing exists, much like tobacco companies insisted smoking carried no cancer risk).
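The diminishing returns are usually described with a power law, loss ≈ (N_c / N)^α. The constants below are only illustrative stand-ins, but they show the shape of the curve:

```python
# Power-law scaling of loss with parameter count: L(N) = (N_c / N) ** alpha.
# N_c and alpha are illustrative stand-ins, not fitted values for any real model.
N_c = 8.8e13
alpha = 0.076

def loss(n_params: float) -> float:
    return (N_c / n_params) ** alpha

for n in (1e9, 1e10, 1e11, 1e12, 1e13):
    print(f"{n:.0e} params -> loss {loss(n):.3f}")
```

In this toy fit, every 10× increase in parameters only shaves off roughly 16% of the loss, which is the "wall" in a nutshell.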
It's a neat idea (Skynet may be loose on the world, hiding out as widespread 'bugs' that happen to consume a lot of resources and compute), but it would require a lot of things to be magic'd into existence to be remotely practical.
You may find this funny: https://youtu.be/JcJSW7Rprio