Some great answers here already. I'd just add that Gemma 4 31B is a surprisingly good model for its size (competitive with massive models like DeepSeek, GLM 5.1, etc.), and it's also worth visiting https://www.reddit.com/r/SillyTavernAI and https://www.reddit.com/r/LocalLLaMA if you're just getting into local models.
It REALLY depends on two things:
1- your model of choice.
2- your hardware specifications.
I have 32 GB of RAM (DDR5 @ 6000 MHz), an RX 6950 XT with 16 GB of VRAM, and an i5-13600K. The models I got to run comfortably range between 24B and 27B; anything bigger had my PC screaming for mercy, with slowdowns and incoherent output. With 6 GB of VRAM, I highly advise you to run Sao10K's L3 8B Stheno v3.2 at either Q3 or Q4 quants, though I'd recommend downloading the Q3 quants (they leave room for a higher token count). Assuming you have 16 GB of RAM, you can run between 8k and 12k tokens of context, which is more than the Perchance model has (6k tokens). Also, use koboldcpp and connect it to SillyTavern. It's a bit complicated to set up, but once you get it running, it's very smooth sailing.
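If you want to sanity-check whether a quant plus a given context will fit in your VRAM, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the bits-per-weight values are approximate averages for those quant types, and the layer and head counts are the ones usually quoted for Llama 3 8B (which Stheno is based on). Real usage will be somewhat higher because of compute buffers and whatever else is using the GPU.

```python
# Rough memory estimate for a quantized model plus its KV cache.
# All constants below are ASSUMPTIONS for illustration, not exact figures.

ASSUMED_BPW = {"Q3": 3.9, "Q4": 4.85}  # approximate average bits per weight


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def kv_cache_gb(context_tokens: int, n_layers: int, n_kv_heads: int,
                head_dim: int, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size in GB for a given context length.

    The leading 2 accounts for storing both keys and values;
    bytes_per_value=2 assumes an unquantized 16-bit cache.
    """
    bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return bytes_per_token * context_tokens / 1e9


if __name__ == "__main__":
    # Assumed Llama 3 8B shape: 32 layers, 8 KV heads, head dimension 128.
    for quant, bpw in ASSUMED_BPW.items():
        for ctx in (8192, 12288):
            total = weights_gb(8.0, bpw) + kv_cache_gb(ctx, 32, 8, 128)
            print(f"{quant} at {ctx} tokens: roughly {total:.1f} GB")
```

Under these assumptions the Q3 quant leaves noticeably more headroom on a 6 GB card than Q4 does, which is why it allows a longer context. Anything that doesn't fit in VRAM can be offloaded to system RAM, at the cost of speed.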
On a side note, if you can get your hands on a clean card with a lot of VRAM on the used market, like an RTX 3090 with 24 GB of VRAM (I got one for less than 400 bucks), you can run much, much better models locally, models that would nuke Perchance's AI chat into oblivion with much higher context windows, like Gemma 4 31B, the WeirdCompound family, Cydonia v4.1 24B, etc.
Nothing is forever, especially on the internet.
So true!
You can start, for free, with Ollama, and then start trying out models.
Most models need about 9 GB of storage. You might want more space at first, to try a few out side by side.
Different models will require different amounts of RAM and CPU, and some models are really slow on weaker PCs.
There seem to be lots of settings that can be tuned, too. So if it doesn't run right away, try searching the web for whatever issues you encounter, and see if others suggest a settings change, before you give up.
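Once Ollama is running, you can also talk to it from a script instead of the terminal. Here is a minimal sketch, assuming Ollama is listening on its default local port (11434) and that you have already pulled a model. The model name `tinyllama` is just a placeholder for whatever you downloaded, and `build_request` and `generate` are helper names made up for this example.

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str,
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request asking for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=300) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Placeholder model name; swap in one you have actually pulled.
    print(generate("tinyllama", "Say hello in one short sentence."))
```

If the call fails with a connection error, the server is probably not running, or it is listening on a different port.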
Hi, and yes, like pinball_wizard said, get Ollama. Then, sticking with the 6 GB, maybe get a small model like TinyLlama, which runs from about 500 MB to just over 1 GB. Get servez for locally hosting a simple HTML page for the model, or use LM Studio with the same smaller models, but watch the RAM usage and stick with smaller models to start with. For image generation you could use Easy Diffusion, which runs locally and has an easy interface. Again, don't bother with big models or LoRAs, and keep it comfortably under the GPU threshold. It's a bit scuffed, I know, but it's a little starting point considering the price of RAM and GPUs today.