What are your opinions on the SuperGemma finetune?
Posted by Kodix@reddit | LocalLLaMA | 7 comments
So, I'm relatively new to the scene and I kind of want to do a sanity check.
I've been using gemma-4-26B. Been loving it except for the tool calling unreliability.
I encountered a fine-tune called SuperGemma that claims to fix these issues.
https://huggingface.co/Jiunsong/supergemma4-26b-abliterated-multimodal
...But when I run it, it's just... broken. It often gives blank responses without an end token, or says things completely unrelated to what I say. The responses it does give correctly are extremely terse, even when instructed otherwise.
Generally, it's a mess.
So my question is: what the hell? Am I doing this wrong, somehow? This happens even when I run it with the exact settings recommended in the model card. And, obviously, I've had a lot of success with other gemma-4 models.
So is there something I'm missing, here? Please, do tell me if this model works fine for you.
Iory1998@reddit
Hey, piece of advice: avoid the abliterated models if you need tool calling, math, or coding. They are unreliable. Use vanilla Gemma. Also, redownload the latest quant updates from reliable providers like Bartowski or unsloth. Earlier quants had bugs that were recently fixed.
If you want the absolute best in the Gemma-4 family, use the 31B. Even at Q4, it's better than the 26B-A4B.
LeRobber@reddit
The 31B and 26B are different beasts: the 31B is dense and slow, the 26B is fast and light. The 26B works great for many things and is responsive.
Iory1998@reddit
Well, I totally get it.
abnormal_human@reddit
I have an eval suite that I run for an agent I am developing. I've repeatedly seen abliterated models outperform by a few percentage points. The effect is strongest with gpt oss 120b.
Kodix@reddit (OP)
Appreciate it. Unfortunately, at the quants I am able to use, Gemma 31B is worse for me, overall.
But honestly it's *good enough* for the time being. I am more flabbergasted at what this implies about the general state of huggingface, though, and I just want a sanity check from people in the community: broken models legitimately get uploaded and benchmarked and touted as superior? If so, dear God, *why*?!
sn2006gy@reddit
A huge chunk of people running local LLaMAs are doing so to have virtual girlfriends and sex chats, that's why.
jacek2023@reddit
There are many finetunes; try other ones.