Motherboard sales are now collapsing amid unprecedented shortages fueled by AI
(www.tomshardware.com)
You can use something like KoboldCPP on Linux, which lets you split a model across RAM and VRAM. Of course it's not as fast as pure VRAM or the Mac unified-memory approach, but it is an option. I run models on my 128 GB of RAM plus a few GPUs.
Ollama and llama.cpp support this too, but it's been super slow in my experience.
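For anyone curious what the RAM/VRAM split looks like in practice, here's a minimal sketch using the llama-cpp-python bindings; the model path and layer count below are placeholders, not anyone's actual setup. Whatever you offload with `n_gpu_layers` runs from VRAM, and the remaining layers run on the CPU from system RAM:

```python
# Minimal sketch of CPU/GPU split inference with llama-cpp-python
# (pip install llama-cpp-python). Tune n_gpu_layers to whatever
# fits in your VRAM; the rest of the model stays in system RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="./model.gguf",  # hypothetical local GGUF file
    n_gpu_layers=35,            # layers offloaded to VRAM; 0 = pure CPU
    n_ctx=4096,                 # context window
)

out = llm("Explain CPU/GPU split inference in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

The fewer layers fit on the GPU, the more the CPU and RAM bandwidth become the bottleneck, which is why this is noticeably slower than pure-VRAM or unified-memory setups.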