noneabove1182

joined 2 years ago
MODERATOR OF
 

So you don't have to click the link, here's the full text including links:

Some of my favourite @huggingface models I've quantized in the last week (as always, original models are linked in my repo so you can check out any recent changes or documentation!):

@shishirpatil_ gave us gorilla's openfunctions-v2, a great followup to their initial models: https://huggingface.co/bartowski/gorilla-openfunctions-v2-exl2

@fanqiwan released FuseLLM-VaRM, a fusion of 3 architectures and scales: https://huggingface.co/bartowski/FuseChat-7B-VaRM-exl2

@IBM used a new method called LAB (Large-scale Alignment for chatBots) for our first interesting 13B tune in awhile: https://huggingface.co/bartowski/labradorite-13b-exl2

@NeuralNovel released several, but I'm a sucker for DPO models, and this one uses their Neural-DPO dataset: https://huggingface.co/bartowski/Senzu-7B-v0.1-DPO-exl2

Locutusque, who has been making the Hercules dataset, released a preview of "Hyperion": https://huggingface.co/bartowski/hyperion-medium-preview-exl2

@AjinkyaBawase gave an update to his coding models with code-290k based on deepseek 6.7: https://huggingface.co/bartowski/Code-290k-6.7B-Instruct-exl2

@Weyaxi followed up on the success of Einstein v3 with, you guessed it, v4: https://huggingface.co/bartowski/Einstein-v4-7B-exl2

@WenhuChen with TIGER lab released StructLM in 3 sizes for structured knowledge grounding tasks: https://huggingface.co/bartowski/StructLM-7B-exl2

and that's just the highlights from this past week! If you'd like to see your model quantized and I haven't noticed it somehow, feel free to reach out :)

 

PolyMind is a multimodal, function calling powered LLM webui. It's designed to be used with Mixtral 8x7B + TabbyAPI and offers a wide range of features including:

Internet searching with DuckDuckGo and web scraping capabilities.

Image generation using comfyui.

Image input with sharegpt4v (Over llama.cpp's server)/moondream on CPU, OCR, and Yolo.

Port scanning with nmap.

Wolfram Alpha integration.

A Python interpreter.

RAG with semantic search for PDF and miscellaneous text files.

Plugin system to easily add extra functions that are able to be called by the model. 90% of the web parts (HTML, JS, CSS, and Flask) are written entirely by Mixtral.

 

Open source

Open data

Open training code

Fully reproducible and auditable

Pretty interesting stuff for embeddings, I'm going to try it for my RAG pipeline when I get a chance, I've not had as much success as I was hoping, maybe this english-focused one will help

 

Thanks to Charles for the conversion scripts, I've converted several of the new internLM2 models into Llama format. I've also made them into ExLlamaV2 while I was at it.

You can find them here:

https://huggingface.co/bartowski?search_models=internlm2

Note, the chat models seem to do something odd without outputting [UNUSED_TOKEN_145] in a way that seems equivalent to <|im_end|>, not sure why, but it works fine despite outputting that at the end.

 

Based off of deepseek coder, the current SOTA 33B model, allegedly has gpt 3.5 levels of performance, will be excited to test once I've made exllamav2 quants and will try to update with my findings as a copilot model

[–] [email protected] 1 points 2 years ago

Yeah definitely need to still understand the open source limits, they're getting pretty dam good at generating code but their comprehension isn't quite there, I think the ideal is eventually having 2 models, one that determines the problem and what the solution would be, and another that generates the code, so that things like "fix this bug" or more vague questions like "how do I start writing this app" would be more successful

[–] [email protected] 1 points 2 years ago (2 children)

I've had decent results with continue, it's similar to copilot and actually works decently with local models lately:

https://github.com/continuedev/continue

 

Trying something new, going to pin this thread as a place for beginners to ask what may or may not be stupid questions, to encourage both the asking and answering.

Depending on activity level I'll either make a new one once in awhile or I'll just leave this one up forever to be a place to learn and ask.

When asking a question, try to make it clear what your current knowledge level is and where you may have gaps, should help people provide more useful concise answers!

[–] [email protected] 1 points 2 years ago

Yes agreed on the llama-2 models, they show a LOT of promise in the right tasks but they need some work to get back to what we remember from peak llama-1, i'm very excited for when that arrives in a week or two!

Yeah by all means! At this time I'd say text-generation-webui is my most mature and functional image, with koboldcpp being a close second but I just don't work as closely with it

lollms-webui is a very interesting upcoming platform but it's a solo dev so it's a lot of work, my docker image works as long as you don't need any personalities, but i'm working on that to see if I can get it sorted out :) for now though it's definitely worth considering it beta or maybe even alpha

Would love to keep our communities tightly knit, FOS AI and localllama both have similar ideals coming from two different angles, so keep in touch :D

[–] [email protected] 0 points 2 years ago (2 children)

Hey thanks for the detailed writeup, this is great! Probably worth including a couple of the llama 1 models just because they're more mature and ready to be used even tho licensing is awkward

Also if you'd like I maintain a few docker images for a couple tools (namely oobabooga, koboldcpp, and lollms-webui) that might be good for beginners to get their feet wet, can find them pinned at https://github.com/noneabove1182