LM Arena Text Leaderboard: Meta at #4 and GLM 5.1 at #13
Posted by Leafytreedev@reddit | LocalLLaMA | View on Reddit | 11 comments
Meta's finally back near the top of the text leaderboard at #4, although they're no longer open source. Interestingly, GLM 5.1 is only at #13 on text, whereas on code it's at #3, competing neck and neck with Sonnet 4.6.
What's funny to note is that the American labs have been scoring very well on Arena (e.g. Gemma 4), while the Chinese labs are performing well on benchmarks (admittedly, their scores are self-reported).
Based on these rankings, we're super excited to run GLM 5.1 locally, but until Apple comes out with an M5 Ultra with 512GB+, only those with deep pockets or tinkering knowledge will be able to play with these huge models on off-the-shelf hardware.
ttkciar@reddit
"Super excited" is my attitude about GLM-5.1 as well. As for "tinkering knowledge", that's more or less what this sub and r/HomeLab are for.
People who plunge elbows-deep into the technology are of course going to be able to do more with less, compared to people who limit themselves to turn-key COTS technology. That has always been the case, and not just with LLM tech. Knowledge is power.
That having been said, hopefully ZAI comes out with a new Air model (100B-ish total parameters with a hefty number of active parameters) based on GLM-5.x, so we have something to run on budget hardware at good speed.
In the meantime, we have Gemma-4-31B, which in my evaluations so far has been superb. I am genuinely astounded at how good it is at everything! Gemma-4-26B-A4B has also exceeded my expectations.
I didn't think Google would give us a model that was as much better than Gemma 3 as Gemma 3 was than Gemma 2, but they appear to have done exactly that, and they fixed their license / terms of use, too!
2026 is shaping up to be a very, very good year for local LLM technology!
Barry_22@reddit
Is it comparable to Qwen 27B? Specifically for agentic coding
ttkciar@reddit
I don't know yet; I haven't exercised its tool use at all. Will circle back to this when I do.
Barry_22@reddit
Thanks, would be great
ttkciar@reddit
I'm guessing the downvotes are from folks who have been burned by Gemma 4's support bugs and template problems? Or perhaps my genuine excitement is coming across as simping/fanboyism?
DeepOrangeSky@reddit
Do you know if there is any solution yet for the memory-usage explosion issue of Gemma4, if I am using it on LM Studio?
As in, this issue: https://www.reddit.com/r/LocalLLaMA/comments/1sdqvbd/llamacpp_gemma_4_using_up_all_system_ram_on/
which people said can be solved with: --cache-ram 0 --ctx-checkpoints 1
I'm a noob and don't know where to put that or what to do with it. Can I even use it to fix the issue in LM Studio, or only in llama.cpp (which I've never used before and don't know how to use yet)? Do I put it in a Jinja or JSON file, or is there some command line somewhere that would get Gemma 4 to work properly?
As it stands, if I (or anyone else, it seems) use Gemma for more than a few replies and more than a few thousand tokens of conversation length on LM Studio, the memory usage just balloons basically to infinity, no matter how much memory you have.
So far the only workaround I've found is to eject the model and reload it after every single reply. Obviously that's not a very good solution, so for now I can't really use Gemma 4 31B on LM Studio, it seems :(
ttkciar@reddit
I'm sorry :-( I have no experience or familiarity with LM Studio.
For what it's worth, that problem does not happen with "bare" llama.cpp.
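For anyone who wants to try the bare-llama.cpp route, here is a minimal sketch of launching `llama-server` with the two flags quoted earlier in the thread. The model filename, context size, and port are placeholders, not from the thread; adjust them for your own setup and quantization.

```shell
# Sketch: run Gemma 4 with bare llama.cpp instead of LM Studio,
# applying the workaround flags quoted above.
#   --cache-ram 0        limits extra KV-cache copies held in system RAM
#   --ctx-checkpoints 1  keeps at most one context checkpoint per slot
# Model path, --ctx-size, and --port below are illustrative placeholders.
./llama-server \
  --model ./models/gemma-4-31b-Q4_K_M.gguf \
  --ctx-size 8192 \
  --cache-ram 0 \
  --ctx-checkpoints 1 \
  --port 8080
```

Note that LM Studio bundles its own llama.cpp runtime, so whether these server flags can be passed through its UI is a separate question; this sketch only covers running llama.cpp directly.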
DeepOrangeSky@reddit
Yeah. Well, if it's still ongoing in another few days or a week, maybe I'll make another thread about it. It seems kind of crazy to me, given how many tens of millions of people use Gemma and use LM Studio, if it's still just some unresolved issue where nobody can really use it on LM Studio yet. If I were either Google or LM Studio, I'd want to find a way to fix that at some point.
Leafytreedev@reddit (OP)
I didn't want to say anything before but your formatting and Gemma stanning were a bit sus lol
ttkciar@reddit
Thanks. I'd rather get blunt, honest criticism than silent, mysterious downvotes :-)
Daemontatox@reddit
Ahhh, this is gonna be like Gemma 4, when everyone got super excited, said it's the best ever, omg it's so much better than Qwen, and literally one week later the sub is full of Gemma 4 hallucinations and template issues and how it's so bad 😂😂😂