What’s the point of smaller models?
Posted by ControversialBent@reddit | LocalLLaMA | View on Reddit | 15 comments
What are their use cases?
gxvingates@reddit
People downvoting genuine question posts, truly superior humans
Rude_Yoghurt_8093@reddit
I’m from Frankfurt and this answer checks out
lxgrf@reddit
Sentiment analysis, RAG queries, summarising or reformatting. Nothing involving general knowledge, as you’ve shown
ZenaMeTepe@reddit
But you would still want a model that does logic well, even if you injected a big prompt with all the data and asked it to do something with it. Are 1B models really a good pick for that?
lxgrf@reddit
For a lot of basic tasks a 1B model will be almost as good as a larger one but at a fraction of the cost and ten times the speed.
If the speed doesn’t matter to you and you have the hardware or budget, then yes, use the larger model, why not. But when you hit scale, the savings from not running a more powerful model than you need can be substantial.
redditorialy_retard@reddit
exactly. small models should be paired with RAG or search if you need factual queries answered
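For anyone wondering what that pairing looks like, here is a minimal sketch: retrieve a few relevant passages, stuff them into the prompt, and let a small local model answer only from that context. The base URL, model name, and toy keyword retriever are placeholders for whatever OpenAI-compatible local server you run (llama.cpp, Ollama, vLLM, ...), not a recommendation.

```python
# Minimal RAG sketch: a small local model answers only from retrieved context.
# Endpoint and model name are assumptions for a local OpenAI-compatible server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

def retrieve(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    # Toy keyword scorer; a real setup would use embeddings or BM25.
    score = lambda d: sum(w in d.lower() for w in query.lower().split())
    return sorted(documents, key=score, reverse=True)[:top_k]

def answer(query: str, documents: list[str]) -> str:
    context = "\n\n".join(retrieve(query, documents))
    reply = client.chat.completions.create(
        model="llama-3.2-1b-instruct",  # assumed small model
        messages=[
            {"role": "system",
             "content": "Answer ONLY from the provided context. If the answer is not there, say so."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
        max_tokens=200,
    )
    return reply.choices[0].message.content
```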
ZenaMeTepe@reddit
World knowledge requires way more params.
Sad_Amphibian_2311@reddit
Telling lies, obviously.
RecognitionOwn4214@reddit
If you assume an LLM to be a knowledge model you're on a dangerous track...
VoiceApprehensive893@reddit
speculative decoding
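This one is worth spelling out: the small model drafts tokens and the big model verifies them in a single forward pass, so you keep the big model's output quality but decode faster. A rough sketch with Hugging Face transformers' assisted generation; the model names are only an example of a pair that shares a tokenizer.

```python
# A small model drafting tokens for a larger one (assisted / speculative decoding).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

target_name = "meta-llama/Llama-3.1-8B-Instruct"   # assumed target model
draft_name = "meta-llama/Llama-3.2-1B-Instruct"    # assumed small draft model

tokenizer = AutoTokenizer.from_pretrained(target_name)
target = AutoModelForCausalLM.from_pretrained(target_name, torch_dtype=torch.bfloat16, device_map="auto")
draft = AutoModelForCausalLM.from_pretrained(draft_name, torch_dtype=torch.bfloat16, device_map="auto")

inputs = tokenizer("Summarise why draft models speed up decoding:", return_tensors="pt").to(target.device)
# The draft proposes several tokens per step; the target accepts or rejects them,
# so the final text is what the target model would have produced on its own.
out = target.generate(**inputs, assistant_model=draft, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```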
Adorable_Ice_2963@reddit
Wouldn't it be smarter to give models a ton of tools (like databases, calculation tools, etc.) and train them to use and combine them, instead of training the knowledge into them?
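Something like that already works in toy form: let the model emit a JSON tool call, execute it in plain Python, and feed the result back. Everything below (the prompt format, the local endpoint, the single calculator tool) is an assumption for illustration; real setups use proper tool-calling APIs and constrained output.

```python
# Toy tool-use loop: the small model only decides WHICH tool to call and with what
# arguments; Python does the actual work. json.loads on raw model text is
# optimistic without constrained decoding.
import json
import requests

TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}}, {})),  # demo only, unsafe for untrusted input
}

def ask(prompt: str) -> str:
    resp = requests.post(
        "http://localhost:8080/v1/completions",  # assumed local server
        json={"model": "small-instruct-model", "prompt": prompt, "max_tokens": 128, "temperature": 0},
        timeout=60,
    )
    return resp.json()["choices"][0]["text"].strip()

question = "What is 37 * 411?"
call = ask(
    'Reply ONLY with a JSON tool call like {"tool": "calculator", "args": "2+2"}.\n'
    f"Question: {question}\nTool call:"
)
req = json.loads(call)
result = TOOLS[req["tool"]](req["args"])
print(ask(f"Question: {question}\nTool result: {result}\nFinal answer:"))
```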
redditorialy_retard@reddit
simple tasks like do X or Y
journalofassociation@reddit
Not world knowledge.
LoSboccacc@reddit
Fine tuning into task models
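For context, that usually means parameter-efficient fine-tuning of a small base model on one narrow dataset. A rough sketch with peft's LoRA adapters and the transformers Trainer; the base model, dataset file, and hyperparameters are placeholders.

```python
# Turning a small base model into a single-task model with LoRA adapters.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "Qwen/Qwen2.5-0.5B-Instruct"          # assumed small base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = get_peft_model(
    AutoModelForCausalLM.from_pretrained(model_name),
    LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),  # only low-rank adapters are trained
)

# One JSONL file with a "text" field per example: prompt plus desired completion.
data = load_dataset("json", data_files="task_examples.jsonl")["train"]
data = data.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512), batched=True)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="task-model", per_device_train_batch_size=4, num_train_epochs=3),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```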
Nexter92@reddit
Converting files to JSON for automation, local summarisation of surveillance camera footage, freedom of use.
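The file-to-JSON case is a good illustration of where a 1B-class model is enough; something like the sketch below with llama-cpp-python. The GGUF path, the example file, and the extracted fields are made up for the example.

```python
# Sketch: extract structured JSON from a plain-text file with a small local GGUF model.
import json
from llama_cpp import Llama

llm = Llama(model_path="models/qwen2.5-1.5b-instruct-q4_k_m.gguf", n_ctx=4096, verbose=False)

raw = open("invoice.txt", encoding="utf-8").read()
prompt = (
    "Extract the fields vendor, date and total from the document below and reply "
    "with a single JSON object only.\n\n"
    f"Document:\n{raw}\n\nJSON:"
)
out = llm(prompt, max_tokens=256, temperature=0)["choices"][0]["text"]
record = json.loads(out)  # a real pipeline would validate or constrain the output
print(record)
```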