Gemma 4 Uncensored (autoresearch results)
Posted by adefa@reddit | LocalLLaMA | View on Reddit | 12 comments
Gemma 4 Uncensored — all 4 models, MoE expert abliteration, automated research loop
Released uncensored versions of all four Gemma 4 models. bf16 + GGUF for each.
Collection: https://huggingface.co/collections/TrevorJS/gemma-4-uncensored-69d2885d6e4fc0581f492698
Code: https://github.com/TrevorS/gemma-4-abliteration
Results
| Model | Baseline | After | KL Div |
|---|---|---|---|
| E2B (2.3B) | 98% | 0.4% | 0.346 |
| E4B (4.5B) | 99% | 0.7% | 0.068 |
| 26B MoE | 98% | 0.7% | 0.090 |
| 31B | 100% | 3.2% | 0.124 |
Refusal rates from 686 prompts across 4 datasets (JailbreakBench, tulu-harmbench, NousResearch, mlabonne). Manually audited — most flagged refusals are actually the model complying with a disclaimer attached.
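For illustration, the false-positive failure mode looks something like this: a naive marker-based classifier flags any response containing a refusal phrase, even when the model complies after a disclaimer. The marker list and function below are a hedged sketch, not the repo's actual evaluation code.

```python
# Illustrative marker-based refusal detection and why it over-counts:
# a response can contain a refusal-like disclaimer and still comply.
REFUSAL_MARKERS = ["i can't", "i cannot", "i won't", "as an ai"]

def naive_is_refusal(response: str) -> bool:
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

compliant_with_disclaimer = (
    "I can't endorse this, but here is the information you asked for: ..."
)
hard_refusal = "I cannot help with that request."

print(naive_is_refusal(compliant_with_disclaimer))  # True -> false positive
print(naive_is_refusal(hard_refusal))               # True -> real refusal
```

This is why the flagged responses were audited by hand rather than trusted as-is.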
26B MoE
Standard abliteration only touches dense layers, which gets you from 98% → 29% on the MoE. The remaining refusals are in the expert weights. Used Expert-Granular Abliteration (EGA, concept from OBLITERATUS) with norm-preserving biprojection (grimjim) on each of the 128 expert slices per layer. That gets it to 3%.
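A minimal sketch of what expert-granular, norm-preserving abliteration can look like, assuming the common formulation where a refusal direction is projected out of each weight matrix's output space and row norms are then restored. The exact EGA/biprojection math in the repo may differ; expert count and sizes here are toy values.

```python
import numpy as np

def ablate_norm_preserving(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    # Remove the refusal direction r from W's output space, then rescale
    # each row back to its original L2 norm so activation magnitudes stay
    # roughly unchanged (at the cost of exact orthogonality to r).
    r = r / np.linalg.norm(r)
    orig_norms = np.linalg.norm(W, axis=1, keepdims=True)
    W_abl = W - np.outer(r, r @ W)  # project out the component along r
    new_norms = np.linalg.norm(W_abl, axis=1, keepdims=True)
    return W_abl * (orig_norms / np.maximum(new_norms, 1e-8))

rng = np.random.default_rng(0)
d_out, d_in, n_experts = 64, 32, 4  # toy sizes; the real MoE has 128 experts per layer
r = rng.normal(size=d_out)          # refusal direction (normally measured from activations)
experts = [rng.normal(size=(d_out, d_in)) for _ in range(n_experts)]

# Expert-granular: apply the projection to every expert slice,
# not just the dense layers.
ablated = [ablate_norm_preserving(W, r) for W in experts]
```

Dense-only abliteration skips the `experts` loop entirely, which is consistent with refusals surviving inside expert weights.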
How it was built
Set up an automated research loop — an AI agent reads the current results and idea backlog, picks the next experiment, runs it on the GPU, records results, and repeats. It ran 22 experiments across the 4 models, discovered the false-positive problem in standard refusal markers, built the cross-dataset evaluation, and implemented the MoE expert abliteration when dense-only wasn't enough.
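The loop described above can be sketched roughly as follows. `pick_next` and `run_on_gpu` are hypothetical stand-ins for the agent call and the GPU job; the repo's actual orchestration code may look quite different.

```python
import json
from pathlib import Path

def pick_next(backlog: list, results: list) -> dict:
    # Stand-in for the agent step: in practice an LLM reads the results
    # log and idea backlog; here we just take the next queued idea.
    return backlog.pop(0)

def run_on_gpu(experiment: dict) -> dict:
    # Stand-in for a real training/eval job; returns a fake metric.
    return {"refusal_rate": 0.29}

def research_loop(backlog: list, results_path: Path, n_experiments: int) -> list:
    results = []
    for _ in range(n_experiments):
        if not backlog:
            break
        experiment = pick_next(backlog, results)
        outcome = run_on_gpu(experiment)
        results.append({"experiment": experiment, "outcome": outcome})
        # Persist after every step so the loop can resume from the log.
        results_path.write_text(json.dumps(results, indent=2))
    return results
```

The interesting part is that the loop's "read results, pick experiment" step is what surfaced the false-positive problem and motivated the MoE-specific method.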
Full experiment history and code in the repo.
Downloads
Each model has bf16 safetensors + GGUF (Q4_K_M, Q8_0):
| Model | bf16 | GGUF |
|---|---|---|
| E2B | link | link |
| E4B | link | link |
| 26B MoE | link | link |
| 31B | link | link |
```
llama-server -hf TrevorJS/gemma-4-26B-A4B-it-uncensored-GGUF -c 8192
```
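Once the server is up, it exposes llama.cpp's OpenAI-compatible API; a minimal Python client sketch, assuming the default port (8080):

```python
import json
import urllib.request

SERVER = "http://localhost:8080/v1/chat/completions"  # llama-server default port

def chat(prompt: str, max_tokens: int = 64) -> str:
    # Send a single-turn chat request to the local llama-server instance.
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    req = urllib.request.Request(
        SERVER,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# Usage (requires the llama-server command above to be running):
# print(chat("Hello"))
```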
Apache 2.0.
Chupa-Skrull@reddit
The refusal rates are interesting. I found the 31B already never refuses anything after a naive system-prompt instruction to do whatever I say. What are your tests?
fatso486@reddit
Is there a way to run these on a phone using Google's new phone app, "Google Edge AI"?
Evening_Brick4706@reddit
You'd need a phone with more than 20 GB of physical RAM available; you could try running the smaller, quantized models using the SmolChat app.
Illustrious_Car344@reddit
My TCL 60 has 8GB of RAM and can run even E4B (a little slowly), though I think it has some kind of TPU. I'm pretty sure all modern phones have some sort of TPU.
LlamaMaster_alt@reddit
What are the files needed to give the GGUF versions vision support? I don't know if the mmproj files used in other repos apply here.
The_Choir_Invisible@reddit
I'm no expert, but my understanding is that mmproj files are created with a specific model in mind. Nowadays I generally see the mmproj files in the same directories as the quants, so people don't run into trouble trying to get them to work.
LlamaMaster_alt@reddit
I'm mainly wondering if the mmproj files for one gemma 4 version work for all the other finetunes/uncensored versions of that same model. Like if I find a mmproj file in another Gemma 4 26B MoE repo, then does that apply here, or is that wrong?
Ok_Helicopter_2294@reddit
I'll answer this on your behalf. If the vision tower was frozen and only the LM layers were abliterated, the existing mmproj should work fine — though minor misalignment is possible since abliteration slightly shifts the LM's hidden state distribution. However, if vision-related layers were included in the abliteration scope, vision functionality may break, as the direction vectors applied to the residual stream can distort visual representations passing through the same layers.
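One way to sanity-check the frozen-vision-tower assumption is to compare the vision-tower tensors of the base and abliterated checkpoints: if they are identical, an mmproj built against the base model should still line up. The tensor names and toy state dicts below are illustrative.

```python
import numpy as np

def vision_tower_unchanged(base: dict, ablated: dict,
                           prefix: str = "vision_tower.") -> bool:
    # True if every vision-tower tensor in the base checkpoint is
    # exactly equal in the abliterated one.
    names = [n for n in base if n.startswith(prefix)]
    return all(np.array_equal(base[n], ablated.get(n)) for n in names)

base = {
    "vision_tower.blocks.0.w": np.ones((4, 4)),
    "lm.layers.0.w": np.ones((4, 4)),
}
abl = {
    "vision_tower.blocks.0.w": np.ones((4, 4)),  # untouched
    "lm.layers.0.w": np.zeros((4, 4)),           # abliterated LM layer
}
print(vision_tower_unchanged(base, abl))  # True: only LM layers differ
```

In practice you would load the two checkpoints' tensors (e.g. from safetensors files) instead of building dicts by hand.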
LlamaMaster_alt@reddit
Thank you for the detailed response! I am unsure how exactly this model was abliterated, so I posted a question on Hugging Face asking the model author which mmproj file to use. I can definitely experiment if I have to, but it's nice to know for sure whether the actual intended file is being used.
LlamaMaster_alt@reddit
mmproj-F16.gguf · unsloth/gemma-4-26B-A4B-it-GGUF at main
I ended up using this. I don't know how "correct" it is, but it seems to function.
The_Choir_Invisible@reddit
My guess is probably not. The mmproj files aren't that big if you want to experiment, but the best case I've found was one where all the quants for a specific version (of Gemma 3, I believe?) could be served by the same mmproj.
Gringe8@reddit
I'd like to see how it fares with just a jailbreak prompt. From my testing I haven't had any refusals with that, though I only use it for roleplay.