I'm not aware of any, other than absolutely tiny embedding models, maybe. Local ML stuff is usually compute- or RAM-bandwidth-limited, and doesn't really fit in expanded L3.
AV1 encoding does love V-cache, last I checked. And like you said, it's potentially good for 'conserving' RAM bandwidth in mixed scenarios, though keep in mind that the CCDs can access each other's L3.
AI workflows aren't limited to LLMs, you know.
For example, TTS and STT models are usually small enough (15-30MB) to be loaded directly into V-Cache. I was thinking of such small-scale local models, especially when you consider AMD's recent forays into providing a mixed-environment runtime for their hardware (the GAIA framework, which can dynamically run your ML models on the CPU, NPU and GPU, all automagically).
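Rough back-of-envelope sketch of what I mean (the model filenames are just placeholders, and I'm assuming one X3D CCD's 96MB of L3):

```python
import os

# Placeholder filenames; any small ONNX TTS/STT model you have locally would do.
MODEL_FILES = ["piper-voice.onnx", "whisper-tiny-int8.onnx"]

L3_BYTES = 96 * 1024 * 1024  # assumed: one X3D CCD's 96 MB of L3

for path in MODEL_FILES:
    if not os.path.exists(path):
        print(f"{path}: not found (placeholder name)")
        continue
    size = os.path.getsize(path)
    verdict = "fits comfortably in" if size < L3_BYTES else "exceeds"
    print(f"{path}: {size / 2**20:.1f} MiB, {verdict} 96 MB of L3")
```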
Ah. I don’t really mess with local TTS and didn’t realize they were so small.
No worries mate, we can't all be experts in every field and every topic!
Besides, there are other AI models that are relatively small and depend more on processing power than on RAM. For example, there's a bunch of audio analysis tools that don't just transcribe speech but also diarise it (split it up by speaker), extract emotional metadata (e.g. certain models can detect sarcasm quite well, while others spot general emotions like happiness, sadness or anger), and so on.
Image categorisation models are also super tiny, though usually you'd want to load them into the DSP-connected NPU of appropriate hardware (e.g. a newer "smart" CCTV camera would use an SoC with an NPU to load detection models into, doing the processing for detecting people, cars, animals, etc. onboard instead of on your NVR).
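As a hedged sketch of how small these image models are in practice, here's roughly how you'd run one on the CPU with onnxruntime (the model path and input shape are assumptions for a typical MobileNet-class classifier, not any specific product's pipeline):

```python
import numpy as np
import onnxruntime as ort  # pip install onnxruntime

# Assumed: a MobileNet-class classifier exported to ONNX; the file is only a few MB.
session = ort.InferenceSession("image_classifier.onnx",
                               providers=["CPUExecutionProvider"])

input_name = session.get_inputs()[0].name
frame = np.random.rand(1, 3, 224, 224).astype(np.float32)  # stand-in for a camera frame

logits = session.run(None, {input_name: frame})[0]
print("predicted class index:", int(np.argmax(logits)))
```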
Also, by my count, even somewhat larger training workloads, such as micro wakeword training, would fit into the 192MB of V-Cache.
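Rough numbers behind that claim (all assumptions, not measurements): a microWakeWord-style model is a tiny CNN, so weights, gradients, Adam state and a batch of spectrograms together land in the tens of MiB.

```python
# Back-of-envelope working set for a tiny wake-word training run.
# Every number below is an assumption, not a measurement.
params = 300_000                 # assumed parameter count for a small streaming CNN
B = 4                            # bytes per float32

weights, grads = params * B, params * B
adam_m, adam_v = params * B, params * B      # Adam's two moment buffers

batch = 128                      # assumed batch size
spectrogram = 49 * 40            # assumed 49 frames x 40 mel bins per clip
activations = batch * spectrogram * B * 20   # ~20 intermediate feature maps, very rough

total_mib = (weights + grads + adam_m + adam_v + activations) / 2**20
print(f"approx working set: {total_mib:.1f} MiB")  # comes out to a few tens of MiB
```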
Exactly! Not my area of expertise, heh.
There might even be niches in LLM land, like Mamba SSM states, really tiny draft models, or other "cache"-type things fitting into so much L3. This might already be the case with the EPYC/TR stuff some homelab folks use.
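For the Mamba case, a quick back-of-envelope (assumed dimensions, loosely in line with published Mamba configs) suggests the per-sequence recurrent state is only a handful of MiB:

```python
# Approximate size of a Mamba-style model's recurrent state for one sequence.
# All dimensions are assumptions for a mid-sized config, not a specific checkpoint.
n_layers = 48
d_model  = 2048
expand   = 2
d_state  = 16
d_conv   = 4
bytes_per_elem = 2               # fp16

d_inner = expand * d_model       # 4096

ssm_state  = n_layers * d_inner * d_state * bytes_per_elem   # selective-scan state
conv_state = n_layers * d_inner * d_conv  * bytes_per_elem   # short conv buffer

print(f"per-sequence state: {(ssm_state + conv_state) / 2**20:.1f} MiB")  # roughly 7-8 MiB
```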
It makes me wonder if the old AMD Radeon RX 6800 XT (with its 128MB of Infinity Cache) would be good at this sort of "small model" thing.