What's the missing piece in the LLaMA ecosystem right now?
Posted by Street-Lie-2584@reddit | LocalLLaMA | 32 comments
The LLaMA model ecosystem is exploding with new variants and fine-tunes.
But what's the biggest gap or most underdeveloped area still holding it back?
For me, it's the data prep and annotation tools. The models are getting powerful, but cleaning and structuring quality training data for fine-tuning is still a major, manual bottleneck.
What do you think the biggest missing piece is?
Better/easier fine-tuning tools?
More accessible hardware solutions?
Something else entirely?
l33t-Mt@reddit
Temporal contiguousness.
uutnt@reddit
A better audio transcription model that rivals Whisper 2/3. Not enough players in this space
therealAtten@reddit
I think the biggest missing piece is the interplay of tools in the ecosystem itself. I think one day we will retire MoE models in favour of dense models with better tool calling and instruction following. I believe once we fully accept that models shouldn't store information, but should be trained on rationale, logic and reasoning, as well as on tokens that lead to the "100 most ubiquitous tools", we will see a huge improvement in overall performance. The task of an LLM should be to orchestrate: break the user request down into N = PN subtasks and make use of a smaller dense model's speed advantage. You will get much higher quality results with much lower hardware requirements.
woahdudee2a@reddit
This has been proven wrong time and time again. There is no reasoning without knowledge.
cornucopea@reddit
Knowledge vs. excessive knowledge for the purpose of reasoning is what sets the difference. As illustrated by the no-free-lunch theorem, knowledge costs. Getting the priorities straight often pays.
cornucopea@reddit
Basically what I said in response to a post here two weeks ago regarding "world knowledge" vs. "sheer smarts" when comparing some of the local models.
stoppableDissolution@reddit
But but but bitter lesson /s
(I totally agree)
therealAtten@reddit
Edit: basically exactly what the Stanford post from today is describing...
___positive___@reddit
Standardized local tool set. It is nonexistent. Cloud models use search, for example, to greatly improve knowledge. A truly local search tool would involve something like a Wikipedia ZIM file adapted for LLM lookup.
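A rough sketch of that lookup, assuming the python-libzim bindings and a Kiwix dump on disk (the ZIM filename and query are placeholders):

```python
# Hedged sketch: full-text search over an offline Wikipedia ZIM file,
# pulling article HTML that could be stripped down and fed into an
# LLM's context. Assumes `pip install libzim`.
from libzim.reader import Archive
from libzim.search import Query, Searcher

zim = Archive("wikipedia_en_all_nopic.zim")  # placeholder dump

searcher = Searcher(zim)
search = searcher.search(Query().set_query("transformer architecture"))

# Take the top 3 hits and read their raw content out of the archive.
for path in search.getResults(0, 3):
    entry = zim.get_entry_by_path(path)
    html = bytes(entry.get_item().content).decode("utf-8")
    print(entry.title, len(html))
```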
huzbum@reddit
A GUI that doesn't involve Docker, pip, venv, or whatever. Just install it like a real program already. That's why I went to LM Studio.
I stayed for the selection of quants and ability to easily configure parameters.
segmond@reddit
local models are falling behind in agents
Ok-Hawk-5828@reddit
Lack of meaningful multimodal context in the GGUF hemisphere.
createthiscom@reddit
I’m going to say “developers” for edge devices, but I don’t think that will be a problem much longer. The frontier AIs have already almost surpassed senior dev capability. Almost.
HasGreatVocabulary@reddit
probably this https://discrete-distribution-networks.github.io/
MaxKruse96@reddit
Training data is the biggest issue for the local ecosystem right now, I think. There are so many datasets, but who knows about their real quality.
For me personally, fine-tuning an LLM is like 500x harder than a diffusion model, simply due to the lack of tooling. Unsloth is nice and all, but I don't want to run fucking Jupyter notebooks, I want something akin to kohya_ss with as many of the relevant hyperparameters exposed.
Hardware accessibility is only secondary. If you have a small model, e.g. Qwen3 0.6B, a full finetune should be possible on local hardware. If that proves to be effective, renting a GPU machine somewhere for a few bucks shouldn't be the issue.
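For scale, something like this is all I mean by "possible on local hardware": a minimal sketch with the Hugging Face TRL stack (still script-land, not the GUI I'm asking for; the dataset path is a placeholder, and it assumes a JSONL file with a "text" column):

```python
# Minimal full fine-tune sketch for a small model; hyperparameters are
# illustrative starting points, not tuned recommendations.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen3-0.6B",            # the small model named above
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="qwen3-0.6b-sft",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,  # effective batch size of 16
        learning_rate=2e-5,
        num_train_epochs=1,
    ),
)
trainer.train()
```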
Fuzzdump@reddit
Something like Kiln?
MaxKruse96@reddit
As per https://docs.kiln.tech/docs/fine-tuning-guide#step-6-optional-training-on-your-own-infrastructure it just forwards to Unsloth etc., so no, it doesn't do anything that was mentioned here.
yoracale@reddit
We're working on a GUI actually! Will be out within the next 3 months! :)
And yes, there will be advanced settings to expose all the hyper-parameters and more. If you have any other suggestions, please please let us know since we're still building it!
MaxKruse96@reddit
Good to hear! (Coping for that Christmas gift from y'all)
Honestly I don't have any suggestions beyond "kohya_ss is pretty idiot-proof with presets and general guides on steps, learning rate, etc.". Descriptions of sane values for everything would be great, but I can imagine that in the LLM space it might be less cookie-cutter than in Stable Diffusion.
sqli@reddit
I wrote a suite of small Rust tools that finally allowed me to automate dataset creation.
https://github.com/graves/awful_book_sanitizer
https://github.com/graves/awful_knowledge_synthesizer
https://github.com/graves/awful_dataset_builder
Each project consumes the output of the previous. The prompts are mangled with yaml files. Hope it helps, lmk if you have any questions.
Zor25@reddit
It looks awful
lumos675@reddit
Exactly as you said... I've spent more than 14 days now trying to make a dataset for the Persian language so I can train my TTS model. I even tried Gemini Pro, and it isn't capable of the task, since none of the models has a good understanding of Persian. I tried all the Llama-based and local models like Gemma and others as well. None of them are capable of this task. If we could first focus on making dataset creation faster, then we could make almost anything. Imagine having a good, stable TTS model for every language, and then a model to create text for you in other languages. Then you could train almost anything you need in a few clicks. So yeah, you are totally right.
therealAtten@reddit
Have you had a look at Mistral Saba to help you out? Not exactly sure if that does what you need
lumos675@reddit
Yeah, I tried, but it has less knowledge of Persian compared to Gemini Pro 2.5. I think Gemini 3 has to be the holy grail, though.
-p-e-w-@reddit
Benchmarks. There has been little to no progress in the past two years regarding how LLMs are evaluated. It’s still mostly huge catalogues of questions with predetermined answers. That’s a very poor system for testing intelligence.
BuildAQuad@reddit
It's a hard problem tho, don't think there will be any easy solutions here.
phree_radical@reddit
Models fine-tuned exclusively to follow examples from a strict many-shot format, while deliberately not following instructions. It would be strong because base models already show strong few-shot capabilities. It would be straightforward because available datasets can be turned into endless many-shot examples. And 1000% necessary because you want to avoid prompt injection.
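Roughly what I mean, as a sketch (the "### Input"/"### Output" delimiters, field names, and dataset path are just illustrative):

```python
# Sketch: replay an instruction/response dataset as a strict many-shot
# prompt. The model only ever continues the pattern, so instructions
# embedded in the final query are just more text wrapped in the format,
# not commands to obey.
import json

def build_many_shot_prompt(examples, query):
    """Render labelled examples in one fixed pattern, then the new query."""
    blocks = [f"### Input\n{ex['input']}\n### Output\n{ex['output']}"
              for ex in examples]
    blocks.append(f"### Input\n{query}\n### Output\n")
    return "\n\n".join(blocks)

with open("dataset.jsonl") as f:  # placeholder path
    examples = [json.loads(line) for line in f]

print(build_many_shot_prompt(examples[:32], "Translate 'hello' to French."))
```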
YouAreRight007@reddit
Training data prep tools.
I'm working on my own tooling and a pipeline that transfers all domain knowledge from a source document to a model.
Once I'm done, I should be able to automate this process for specific types of documents, saving loads of time.
The challenging bit is the time spent automating every decision you would normally make while compiling good data.
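One stage of that pipeline as a rough sketch (endpoint, model name, chunk size, and the prompt are placeholders; any OpenAI-compatible server such as llama.cpp's llama-server would work):

```python
# Sketch: chunk a source document and ask a local model to draft Q&A
# pairs that capture its domain knowledge, for later review and cleanup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

def chunk(text, size=2000):
    """Naive fixed-size chunking; a real pipeline would split on structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def draft_qa_pairs(passage):
    resp = client.chat.completions.create(
        model="local",
        messages=[{
            "role": "user",
            "content": "Write three question/answer pairs that capture "
                       f"the domain knowledge in this passage:\n\n{passage}",
        }],
    )
    return resp.choices[0].message.content

with open("source_document.txt") as f:   # placeholder document
    for passage in chunk(f.read()):
        print(draft_qa_pairs(passage))   # review before training on it
```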
Iory1998@reddit
What's missing in LLaMA is a new LLaMA model.
One_Long_996@reddit
LLMs are very bad at image recognition. Give one a Civ or other strategy game screenshot and it gets nearly everything wrong.
AXYZE8@reddit
I've recommended local LLMs (Tiger/Amoral Gemma 2/3) to 6 friends so far and they all have the same issue: the app. Both LM Studio and Jan.ai are easy to understand, and they like the quality of these models, but they want the same conversation history on their phone. Neither app allows this.
To fix that, they would need to play with the CLI (either llama.cpp's built-in web server or Open WebUI) and then Tailscale. This is overwhelming, and I think this is a big gap where non-technical people just revert back to closed LLMs.
I've thought about making a one-click GUI solution for putting the llama.cpp web server online using Cloudflare Tunnel (works like Tailscale) and Cloudflare Zero Trust (for auth), and maybe I'll actually do it when I have more free time next month.
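The core of it is tiny; a rough sketch of the idea (model path and port are placeholders, and this quick-tunnel variant skips the Zero Trust auth layer):

```python
# Sketch: serve a GGUF with llama.cpp's llama-server and expose it via
# a Cloudflare quick tunnel; cloudflared prints the public URL on startup.
import subprocess

server = subprocess.Popen(
    ["llama-server", "-m", "model.gguf", "--port", "8080"]
)
tunnel = subprocess.Popen(
    ["cloudflared", "tunnel", "--url", "http://localhost:8080"]
)
try:
    tunnel.wait()
finally:
    server.terminate()
```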
AXYZE8@reddit
I'm not sure if that's the kind of answer you expected, but as a UX guy, these are the things in the ecosystem I focus on.