this post was submitted on 28 Nov 2024

Selfhosted

Hi all, I'd like to hear some suggestions on self-hosting LLMs on a remote server and accessing them via a client app or a convenient website. I'd like to hear about your setups or about products that left a good impression on you.

I've hosted Ollama before, but I don't think it's intended for remote use. On the other hand, I'm not really an expert, and maybe there are other things I could do, like add-ons.

Thanks in advance!

[–] hendrik@palaver.p3x.de 3 points 2 years ago* (last edited 2 years ago) (1 children)

What's the difference regarding this task? You can rent it 24/7 as a crude webserver, or run a Linux desktop inside. Pretty much everything you could do with other kinds of servers. I don't think the exact technology matters. It could be a VPS, virtualized with KVM, or a container. And for AI workloads, these containers have several advantages: you can spin them up within seconds, scale them, etc. I mean, you're right, this isn't a bare-metal server that you're renting. But I think it aligns well with OP's requirements?!

[–] just_another_person@lemmy.world -2 points 2 years ago (1 children)

Well I think the difference is what they asked about.

[–] DarkDarkHouse@lemmy.sdf.org 1 points 2 years ago

Running an LLM can certainly be an on-demand service. Apart from training, which I don’t think we are discussing, GPU compute is only used while responding to prompts.