Best Use Cases for Small LLMs
Posted by XhoniShollaj@reddit | LocalLLaMA | 25 comments
Would love to see what the community has been working on and share their experience or use case for small LLMs or VLMs (1b-7b models).
FineCradle@reddit
Looking at these applications makes me question the need for small models
OnyxOrator@reddit
Summarization, grammar correction, writing style changes and formatting, basic code completion
Fit_Flower_8982@reddit
I actually tried using llama 1B as a spelling and grammar checker. I was hoping that for something so simple there wouldn't be much difference between models, but after some tests it worked noticeably worse than 70B. I wouldn't trust it enough not to monitor it closely.
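One way to "monitor it closely" without reading every correction is a cheap guardrail: reject any model output that drifted too far from the input, since a grammar fix should stay close to the original. This is a minimal sketch, not the commenter's setup; the 0.8 threshold is an arbitrary assumption, and the model call itself is left out.

```python
import difflib

def correction_ratio(original: str, corrected: str) -> float:
    """Similarity ratio (0..1) between the input and the model's correction.

    A low ratio means the model rewrote far more than a grammar fix
    should, which is a signal to distrust the output.
    """
    return difflib.SequenceMatcher(None, original, corrected).ratio()

def accept_correction(original: str, corrected: str, threshold: float = 0.8) -> str:
    # Keep the correction only if it stayed close to the input;
    # otherwise fall back to the original text (or a larger model).
    if correction_ratio(original, corrected) >= threshold:
        return corrected
    return original
```

With a check like this, a small model can run unattended on easy fixes while the suspicious rewrites get escalated.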
Apart_Boat9666@reddit
To be honest, they can rewrite code with minor changes, add comments, and organize code. In a small context like a single function, they work absolutely fine. The Qwen models seem to be really great at it.
XhoniShollaj@reddit (OP)
Thank you for the input!
brotie@reddit
Tasks in general. There's no reason to call OpenAI to summarize the name of a chat for the sidebar, generate an optimal search query to pass to a RAG function, do basic TTS, etc. Cut out the latency and pay nothing.
yukiarimo@reddit
It's not the model size that matters, but its user.
Temp_Placeholder@reddit
Rewriting (everyone else's) reddit posts so they have better grammar and aren't full of assholery.
Derefringence@reddit
Email rewriting, translation, spell checking, simple data organization
dreamfoilcreations@reddit
I've created an app for data extraction/validation across a large number of files; it can extract information that would be hard to get with parsers, because the content of the files isn't well structured (using a 7B model).
Namarrus@reddit
Which RAG approach do you use and how does your Similarity Search find exactly the right content with such a large number of texts?
dreamfoilcreations@reddit
The files I use usually fit in the context window, but for long files I process in chunks. Usually the relevant information sits close together.
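The chunking step described above can be sketched in a few lines. This is an assumption about the approach, not the commenter's code: chunks overlap slightly so information near a boundary appears whole in at least one chunk, and each chunk would then be sent to the 7B model with the extraction prompt (the model call is omitted here).

```python
def chunk_text(text: str, max_chars: int = 4000, overlap: int = 200) -> list[str]:
    """Split a long document into overlapping character chunks.

    Each chunk is at most max_chars long; consecutive chunks share
    `overlap` characters so facts straddling a boundary are not lost.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back to create the overlap
    return chunks
```

Character counts are a crude stand-in for tokens; a real pipeline would size chunks with the model's tokenizer.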
GTHell@reddit
I can extract named entities, detect language, and classify sentiment into a structured JSON format with a 0.5B model.
That's the best use case for me so far, as I don't need to fine-tune BERT and it works across languages.
I was using Qwen2.5 0.5B Instruct, by the way.
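The fiddly part of this kind of setup is usually getting clean JSON back, since small instruct models often wrap the object in prose or code fences. A minimal sketch, assuming a prompt like the one below (the commenter's actual prompt is not shown in the thread); the call to the local model is omitted:

```python
import json
import re

# Hypothetical prompt for a small instruct model such as Qwen2.5 0.5B.
PROMPT_HEADER = (
    "Extract the named entities, the language, and the sentiment of the text.\n"
    'Reply with JSON only: {"entities": [...], "language": "...", "sentiment": "..."}\n'
    "Text: "
)

def parse_model_json(raw: str) -> dict:
    """Pull the first JSON object out of a model reply.

    Small models sometimes wrap the JSON in prose or ``` fences,
    so we grab the outermost braces instead of parsing the reply directly.
    """
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object in model output")
    return json.loads(match.group(0))
```

Retrying the request when `parse_model_json` raises is usually enough to make a 0.5B model reliable for this.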
_donau_@reddit
Converting weird dates to iso format when dateparser couldn't handle the heat
synw_@reddit
Onboarding colleagues with no GPU into local AI with 0.5B to 3B models that work even on an old potato laptop.
We now have many small models that are efficient in one area or another, which was not the case even 6 months ago. I use different models depending on the task: summarization, translation, chat with documentation or an article, code gen.
msbeaute00000001@reddit
What size of model are you using for translation and code gen?
synw_@reddit
These can run in a no-GPU environment. Like everyone here, I use bigger ones on my GPU for more complex code tasks or precise translations, but I tend to use the small ones more and more for their speed; they play nicely in my everyday workflow.
jfufufj@reddit
I've been using llama3.2 3B to generate large quantities of image prompts, works like a charm.
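Bulk generation like this usually means templating many small requests and firing them at the local model one by one. A hedged sketch of the request-building half; the subject/style lists and the instruction wording are hypothetical (the thread doesn't show the actual prompt), and the llama3.2 call itself is omitted:

```python
import random

# Hypothetical building blocks for variety across requests.
SUBJECTS = ["a lighthouse at dusk", "an old library", "a mountain village"]
STYLES = ["watercolor", "cinematic photo", "pixel art"]

def build_requests(n: int, seed: int = 0) -> list[str]:
    """Build n instructions, each asking the local model to write
    one detailed image-generation prompt. Seeded for reproducibility."""
    rng = random.Random(seed)
    return [
        f"Write one detailed image-generation prompt about {rng.choice(SUBJECTS)} "
        f"in {rng.choice(STYLES)} style. Reply with the prompt only."
        for _ in range(n)
    ]
```

A 3B model is a good fit here because each request is short, independent, and easy to regenerate if one comes out poorly.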
Content-Ad7867@reddit
Sentiment analysis, data labeling
ChengliChengbao@reddit
Summarization mostly, as I'm able to run them at huge context lengths.
quiteconfused1@reddit
.... disconnect from the internet.
p.s. "small large language models" is an oxymoron.
ThinkExtension2328@reddit
It’s not the size that counts but how you use it
XhoniShollaj@reddit (OP)
Lol fair observation 🤣
matt23458798@reddit
Many different use cases, specifically simple daily tasks
XhoniShollaj@reddit (OP)
Care to elaborate? Curious how you apply them, and whether they're reliable enough for your use case.