Technology

OpenAI now wants ChatGPT to access your bank accounts (www.theverge.com)

submitted 1 day ago by sanitation@lemmy.radio to c/technology@lemmy.world

[–] boonhet@sopuli.xyz 3 points 1 day ago (1 children)

I mean you'd use the 200 dollar tier if you keep running into usage limits because of Codex or something. There's really no other reason for it IMHO.

[–] echodot@feddit.uk 3 points 14 hours ago (1 children)

At that point why not just run the model locally? 4 months of subscription would pay for a powerful enough setup.

[–] boonhet@sopuli.xyz 2 points 12 hours ago

As someone actually running LLMs locally for testing, unfortunately I'm not sure I agree.

For any passable sort of performance, you want as much as possible running on the GPU. Best bang for buck here is 1088 euros for a 24 GB RX 7900 XTX or 1681 for a 32 GB Radeon AI Pro R9700.

Now, you can fit many models on 24 GB, but they're so far behind frontier models in output quality that they're not actually good for this task. But add MoE offloading and 256 gigs of RAM and you can get a 4-bit quant of Qwen 3.5 397B running. That's about 3 grand for RAM. You'd then also need a decent CPU.
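For a rough sanity check on why 256 gigs is the target, here's a back-of-the-envelope sketch of the weight footprint. It assumes exactly 4 bits per weight and counts only the weights; real 4-bit formats also store scaling data, so the effective figure runs somewhat higher, and the KV cache and runtime overhead come on top. The function name is just for illustration.

```python
def quantized_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Ignores KV cache, activations and per-block quantization metadata.
    """
    return params_billions * bits_per_weight / 8


# 397B parameters at an idealized 4 bits per weight (assumption).
weights_gb = quantized_weight_gb(397, 4)

# 24 GB of VRAM plus 256 GB of system RAM, as in the setup described above.
available_gb = 24 + 256

print(f"weights: ~{weights_gb} GB, available: {available_gb} GB")
print("fits" if weights_gb < available_gb else "does not fit")
```

So the weights alone are already close to 200 GB, which is why a 24 GB card can only hold a small slice and the rest has to be offloaded to system RAM.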

For even better performance, you can get a 256 GB Mac.

The upside is that you never run out of tokens. Even the damn $200 plans for Claude and OpenAI have 5-hour limits that you can run into and then you have to wait again. The downside is that it won't output fast enough to actually have to consider running out of tokens lol
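To put a number on the "4 months of subscription" idea, here's a quick break-even sketch using the prices above. Assumptions: the "3 grand" for RAM is in euros like the GPU prices, euros and dollars are treated as roughly 1:1, and the CPU, motherboard, PSU and electricity are left out entirely, so the real figure would be worse.

```python
import math


def breakeven_months(hardware_cost: float, monthly_fee: float) -> int:
    """Whole months of subscription needed to match a one-off hardware cost."""
    return math.ceil(hardware_cost / monthly_fee)


gpu_cost = 1088      # 24 GB RX 7900 XTX, from the comment above
ram_cost = 3000      # "about 3 grand" for 256 GB of RAM (currency assumed)
monthly_fee = 200    # top subscription tier, treated as 1:1 with euros

hardware_cost = gpu_cost + ram_cost
months = breakeven_months(hardware_cost, monthly_fee)

print(f"hardware: {hardware_cost}, four months of subscription: {4 * monthly_fee}")
print(f"break-even after about {months} months")
```

With those numbers, four months of subscription covers 800 and the GPU plus RAM alone take closer to 21 months to pay off.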