this post was submitted on 27 Feb 2026

Technology

Taalas HC1: 17,000 tokens/sec on Llama 3.1 8B vs Nvidia H200's 233 tokens/sec. 73x faster at one-tenth the power. Each chip runs ONE model, hardwired into the transistors.
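
As a quick sanity check on the headline numbers, here is a minimal sketch that recomputes the speedup from the two throughput figures in the title. It assumes both figures were measured under comparable conditions (the title does not say) and reads "one-tenth the power" as HC1 power divided by H200 power; the energy-per-token ratio is derived from that assumption and is not a claim made in the post.

```python
# Figures quoted in the post title; nothing here is independently measured.
HC1_TOKENS_PER_SEC = 17_000
H200_TOKENS_PER_SEC = 233
# Assumption: "one-tenth the power" means HC1 power / H200 power = 0.1.
POWER_RATIO = 0.1


def speedup(fast_tps: float, slow_tps: float) -> float:
    """Throughput ratio between two chips running the same model."""
    return fast_tps / slow_tps


def energy_per_token_ratio(speed_ratio: float, power_ratio: float) -> float:
    """Energy per token of the fast chip relative to the slow one.

    Energy per token is power divided by throughput, so the ratio is
    power_ratio / speed_ratio.
    """
    return power_ratio / speed_ratio


s = speedup(HC1_TOKENS_PER_SEC, H200_TOKENS_PER_SEC)
e = energy_per_token_ratio(s, POWER_RATIO)

print(f"speedup: {s:.1f}x")
print(f"energy per token: roughly 1/{1 / e:.0f} of the H200 figure")
```

The throughput ratio comes out at about 73, which matches the title. If the power claim holds as read above, the energy per token would be lower by a further factor of ten on top of that.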

[–] notabot@piefed.social 73 points 3 days ago (19 children)

Dedicated, single-purpose chip designs are always going to be faster and more efficient to run than general-purpose ones. The question is what the environmental and financial costs of updating to a new model will be. With a general-purpose design it's just a case of loading some new code. With a model that's baked into the silicon you have to design and manufacture new chips, then install them.

I can see this being useful in certain niche use cases where requirements are not going to change, but it sounds rather limiting in the general case.

[–] morto@piefed.social 6 points 3 days ago (4 children)

FPGAs can sort of be a middle ground, but I don't know if they're capable of running LLMs.

[–] bryndos@fedia.io 2 points 3 days ago (2 children)

Is there such a thing as a modular FPGA, so that you could "plug in" another one and add more gates, sort of daisy-chaining them? I don't know if such interfaces exist; it sounds like it might need lots of bandwidth.

[–] iceberg314@midwest.social 1 points 2 days ago

I bet you could! The interface can literally be whatever you want with FPGAs. You'd just have to keep things organized and program them one at a time, I think.
