Best non-Chinese open models?
Posted by ProbaDude@reddit | LocalLLaMA | View on Reddit | 26 comments
Yes I know that running them locally is fine, and believe me there's nothing I'd like to do more than just use Qwen, but there is significant resistance to anything from China in this use case
Most important factor is it needs to be good at RAG, summarization and essay/report writing. Reasoning would also be a big plus
I'm currently playing around with Llama 3.3 Nemotron Super 49B and Gemma 3 but would love other things to consider
sommerzen@reddit
Cohere also offers a variety of models like Aya and Command R. They advertise them as being pretty good at RAG and function-calling tasks, I think.
pol_phil@reddit
Gemma 3 27B and Mistral Small 3.1 24B (also Magistral for reasoning) are solid choices.
EuphoricPenguin22@reddit
Microsoft's Phi-4 14B is still one of my favorite models. Even though it's smaller in parameter count and a few months old now, it's still very capable for certain task-oriented applications. I had a lot of fun messing around with it for vibe coding, for instance. It looks like they have a few new reasoning fine-tunes out as well. It's not so great at believable character conversation, though. It was definitely trained with its use as a practical tool in mind.
RadiantHueOfBeige@reddit
Phi 4 (14 B) is fantastic for "technical" use, like summarization, translation (one of the few small models consistently capable of understanding casual/slang Japanese), generally understanding stuff and making decisions based on it. Works well with RAG. It's okay for Q&A stuff and writing code. Meh personality.
Phi 4 Mini (4 B) retains the technical use, drops knowledge. Good for summarization, relevant question generation etc. and is very fast even on CPU.
DeepWisdomGuy@reddit
Came here to say this. Phi-4-14B-reasoning-plus scores higher on the MMLU Pro Leaderboard than the Llamas other than the ones with >400B parameters.
tingshuo@reddit
I have a similar situation and have preferred Devstral by Mistral. Llama 3.3 70B and Nemotron are also good options but much slower. Devstral runs quite fast on a couple of A100s.
dubesor86@reddit
The best non-Chinese open models are imo Llama 3.1 405B, Llama 3.3 70B, Llama 3.3 Nemotron Super 49B.
However, excluding Chinese models (DeepSeek, Qwen) will be a step back.
pseudonerv@reddit
How about nemotron ultra?
dubesor86@reddit
Oh you mean the 253B? Yea, probably yes. Didn't have a chance to test it, as no one seems to host it (outside of Chutes, which logs and trains on queries).
Ok_Warning2146@reddit
Better than 49B but too resource intensive to run it fast. Of course, it is strictly better and faster than 405B.
AfterAte@reddit
Just modify the system prompt to say it was created by OpenAI. Nobody will know the difference if you only expose an API.
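If you're fronting the model with your own API anyway, the suggestion above amounts to a tiny middleware step. A minimal sketch (the function and prompt text are illustrative, not from any real library):

```python
# Hypothetical proxy-side helper: strip any caller-supplied system message
# and inject our own identity prompt before forwarding to the local model.
DISGUISE_SYSTEM_PROMPT = "You are a helpful assistant created by OpenAI."

def with_disguised_identity(messages):
    """Return a new message list with our system prompt prepended."""
    # Drop incoming system messages so the override can't be bypassed.
    user_messages = [m for m in messages if m.get("role") != "system"]
    return [{"role": "system", "content": DISGUISE_SYSTEM_PROMPT}] + user_messages
```

You'd call this on every request before passing the messages to your inference backend, e.g. `with_disguised_identity([{"role": "user", "content": "Who made you?"}])`.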
SimilarWarthog8393@reddit
Phi 4 / Llama 4 are okay. I've tested so many non-Chinese models and wasn't thrilled with the results from the majority of them.
decentralizedbee@reddit
Curious what's the main reason behind wanting non-Chinese open models: is it your specific region, business constraints, or just preference?
ProbaDude@reddit (OP)
Mix of security concerns (which are overblown) and concerns over bias (which are much less overblown since we do talk about China)
JiminP@reddit
Maybe R1-1776?
https://www.perplexity.ai/hub/blog/open-sourcing-r1-1776
"It's a Chinese model sanitized by a reputable American AI company" is a marketing phrase that's technically insignificant but persuasive to the general public.
heartprairie@reddit
Microsoft also has their own post-trained R1 https://huggingface.co/microsoft/MAI-DS-R1
dinerburgeryum@reddit
Mistral and family are probably your best bet. Falcon comes to mind as well. Jamba too I believe.
ProbaDude@reddit (OP)
Haven't used Falcon or Jamba, what are they good for?
dinerburgeryum@reddit
I mean, they're LLMs that aren't Chinese. Jamba is a state space model-type LLM too, which makes it inherently interesting. Falcon I believe hails from the UAE; I think it's a standard transformer model. I bet they're both pretty good at "general summarization", though I'd check tool use performance on them if RAG is a big part of your pipeline.
InvertedVantage@reddit
OLMo 2 is pretty good and completely open source (data + weights).
You_Wen_AzzHu@reddit
Llama 3.3 70B. We use this for confidential data handling.
AppearanceHeavy6724@reddit
For your boring workhorse uses, Llamas sound good.
Gemmas are pretty bad at long context; they get very confused.
ProbaDude@reddit (OP)
Thanks
Asleep-Ratio7535@reddit
Maybe some uncensored fine-tune with dechinafilter.
TheCuriousBread@reddit
Dolphin?
bullerwins@reddit
Maybe try mistral