Current state of open-source?
Posted by DarkMatter007@reddit | LocalLLaMA | 10 comments
I’m trying to understand the current open-source LLM landscape beyond surface-level hype.
We all got used to the nerfed products from Claude/Gemini, so I really believe in open source as a solution.
I keep seeing models like GLM, Kimi, MiniMax, DeepSeek, Qwen, Mistral, etc., but it’s honestly hard to tell how they actually compare in practice.
A few things I’m confused about:
- Where does DeepSeek stand right now? It used to be everywhere, now feels less dominant
- GLM / Kimi / MiniMax: are these actually top-tier, or do they just benchmark well on very specific jobs?
- Are there any real benchmarks people trust (not cherry-picked blog posts)?
What do you guys actually use in production or serious projects?
DepartmentOk9720@reddit
DeepSeek is pretty good; it's cheap and can handle large volumes.
Enough_Big4191@reddit
honestly the landscape looks crowded, but in practice most teams converge on a small set of models based on their workload. deepseek had a moment because of cost/perf, but consistency and integration matter more over time, so people mix it with qwen or mistral depending on the task. a lot of the others look strong on benchmarks but feel narrow or less predictable in real flows. i'd trust your own evals over public benchmarks: run your actual tasks, long context, tool use, edge cases, and see where it breaks. most "top tier" models look similar until you hit those.
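if it helps, "run your own evals" can be as small as this sketch: a loop over your models and your actual tasks with a pass/fail grader per task. `call_model` here is a placeholder stub with canned answers purely for illustration; in practice you'd swap it for a call to your real inference endpoint.

```python
# Minimal eval-harness sketch: score each model on your own tasks
# instead of trusting public benchmarks.

def call_model(model: str, prompt: str) -> str:
    # Placeholder stub -- replace with an actual API/client call.
    canned = {"deepseek": "4", "qwen": "4", "mistral": "five"}
    return canned.get(model, "")

def run_evals(models, tasks):
    """tasks: list of (prompt, grader) pairs; grader returns True/False."""
    scores = {}
    for m in models:
        passed = sum(
            bool(grader(call_model(m, prompt))) for prompt, grader in tasks
        )
        scores[m] = passed / len(tasks)
    return scores

tasks = [
    ("What is 2 + 2? Answer with a single digit.",
     lambda out: out.strip() == "4"),
]
print(run_evals(["deepseek", "qwen", "mistral"], tasks))
# -> {'deepseek': 1.0, 'qwen': 1.0, 'mistral': 0.0}
```

the useful part is that the graders encode *your* edge cases (long context, tool use, formatting), so the scores reflect your workload, not a leaderboard's.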
fustercluck6000@reddit
I’ve been very impressed by Qwen3.5-27b, especially the Opus 4.6 distillations, which have worked extremely well in production. Open-weight models are advancing a WHOLE lot faster than the black-box ones, especially when you consider the difference in inference costs.
Few_Painter_5588@reddit
The open-weight models are about a year behind the current frontier models, so there's no open-weight model that can compete with Claude Opus 4.7 or even Claude Sonnet 4.6. Most open-weight models land between GPT 5.4 Mini and Claude Sonnet 4.6.
GLM, Kimi, and MiniMax are great models, but they're not frontier models. GLM 5.1 is probably the best open-weight model.
DeepSeek is behind by quite a bit, but apparently V4 is coming soon™. They updated the model that their API serves, though, and have been updating their GitHub repo, so a launch could be imminent.
It depends on the task: a lot of benchmarks have become saturated and everyone is benchmaxxing now. For coding, SWE-bench Pro is a good indicator, and for creative writing, EQ-Bench is a good one too.
BidWestern1056@reddit
before kimi-k2.5 i felt there were no serious viable alternatives to models from anthropic/gemini/openai, but now i almost exclusively use kimi, glm-5.1, and minimax-2.7 through ollama cloud with npcsh and incognide
https://github.com/npc-worldwide/npcsh
https://github.com/npc-worldwide/incognide
i've always designed my tools to work with small open-source models too, so even small qwen models (4b-10b) can do a decent portion of useful shell tasks. this capability at the lower threshold will continue to improve too. the future is ours, open and local!
Medium_Chemist_4032@reddit
I've trialled Qwen3 122B Q4 one day as a work-issued Claude Code replacement, mostly for comprehension of existing legacy code. It served me very well, and I can foresee a future where companies advise using local models as a Haiku (research agent) or Sonnet replacement.
Mediocre_Doctor4712@reddit
How is it with tool calling? Sometimes these open-source models just don't get that right.
Medium_Chemist_4032@reddit
No issues. vLLM uses the same chat templates the model's author provides, and Jinja templates are native to Python. I've had a few templating-related issues in llama.cpp (with MiniMax 2.7), but those typically do get sorted out eventually. Ollama also reimplements the Jinja templating engine, and small details (like a missing newline) tend to slip things up. Once those get sorted out, tool calling works across the whole context size.
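those template bugs usually show up as tool-call output that no longer parses. a quick sanity check is just trying to parse it yourself — here's a sketch assuming the common OpenAI-style shape (a JSON object with `name` and `arguments`, where `arguments` is often itself a JSON-encoded string); the `get_weather` tool is made up for illustration:

```python
import json

def parse_tool_call(raw: str):
    """Parse a model's tool-call output; returns (name, args) or None.

    Template bugs (stray newlines, truncated output, unclosed tags)
    usually surface here as JSON that fails to parse.
    """
    try:
        call = json.loads(raw)
        args = call["arguments"]
        if isinstance(args, str):          # arguments as a JSON-encoded string
            args = json.loads(args)
        return call["name"], args
    except (json.JSONDecodeError, KeyError, TypeError):
        return None

# well-formed call
ok = parse_tool_call('{"name": "get_weather", "arguments": "{\\"city\\": \\"Berlin\\"}"}')
print(ok)   # -> ('get_weather', {'city': 'Berlin'})

# truncated output from a broken template fails cleanly
bad = parse_tool_call('{"name": "get_weather", "arguments": "{\\"city\\": ')
print(bad)  # -> None
```

running a handful of these against each backend (vLLM, llama.cpp, Ollama) with the same model is a cheap way to tell whether a tool-calling failure is the model or the template.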
jacek2023@reddit
Do you ask about:
- models usable locally (on local setup)?
- models usable in cloud (same way as Claude/Gemini but cheaper)?
- hype (benchmarks and clickbaits)?
Because the answers will be different.
MengerianMango@reddit
deepseek is due for a release soon. currently they're a gen or more behind minimax/glm/kimi. i listed those in increasing order of size and ability. they're all pretty good. glm/kimi are very usable for swe work. minimax feels a bit amateurish to me, but it can sorta do stuff.