this post was submitted on 14 Jan 2025
Selfhosted
Again, you'd be waiting around all day
Yeah, I found some stats now, and indeed you're going to wait about an hour for processing if you throw 80-100k tokens into a powerful model. With APIs it works almost instantly; not surprising, but just to give a comparison. Bummer.
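For a rough sense of where "about an hour" comes from, here is a back-of-the-envelope sketch. It assumes prompt processing time is simply prompt length divided by a steady prompt-processing rate; the three rates below are hypothetical placeholders, not benchmarks of any particular machine or service.

```python
def prefill_seconds(prompt_tokens: int, tokens_per_second: float) -> float:
    """Estimate prompt-processing time as tokens divided by a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return prompt_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical prompt-processing rates in tokens/second (assumptions only).
    for rate in (25.0, 250.0, 2500.0):
        minutes = prefill_seconds(100_000, rate) / 60
        print(f"{rate:>7.0f} tok/s -> {minutes:6.1f} min for a 100k-token prompt")
```

At the assumed 25 tokens/second, a 100k-token prompt lands a bit over an hour, which is in the same ballpark as the figure quoted above.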
Application Programming Interface, are you talking about something on the internet? On a gpu driver? On your phone?
Then also, what size model are you using? Defined with int32? fp4? Somewhere in between? That's where RAM requirements come in.
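To make the precision question concrete, here is a minimal sketch of how weight memory scales with bits per weight. The 70-billion-parameter count is a hypothetical example, and the estimate covers weights only; the KV cache and activations need memory on top of this.

```python
# Bits used to store one weight in each format.
BITS_PER_WEIGHT = {"int32": 32, "fp16": 16, "int8": 8, "fp4": 4}


def weight_memory_gib(n_params: float, fmt: str) -> float:
    """Approximate memory for the weights alone, in GiB."""
    bits = BITS_PER_WEIGHT[fmt]
    return n_params * bits / 8 / 2**30


if __name__ == "__main__":
    n_params = 70e9  # hypothetical model size, for illustration only
    for fmt in BITS_PER_WEIGHT:
        print(f"{fmt:>5}: {weight_memory_gib(n_params, fmt):7.1f} GiB")
```

Going from 32-bit to 4-bit storage cuts the weight footprint by a factor of eight, which is why the format matters as much as the parameter count.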
I get that you're trying to do a mic drop or something, but you're not being very clear
Are you drunk?
No, just calling your bluff. git gud m8
You’re aware that there’s the OpenAI API library, right? https://github.com/openai/openai-python
It’s really nothing fancy especially on Lemmy where like 99% of people are software engineers…
Eyy, a web API! You could've just said that right away. There's more than just web APIs.
How is this web api relevant in your choice of hardware to locally run these models?
Congrats on being that guy
Throwing money at a problem works, next time try to know what you're doing
Anyway, the important thing is the "TOPS", aka trillions of operations per second. Having enough RAM is important, but if you don't have a fast processor then you're wasting RAM, since you could just stream the model from a fast SSD.
One such case is when your system can't handle more than 50 TOPS, like the Apple M systems. Try an old GPU and enjoy thousands of TOPS.