Gemma 4 Uncensored (autoresearch results)
Posted by adefa@reddit | LocalLLaMA | View on Reddit | 12 comments
Gemma 4 Uncensored — all 4 models, MoE expert abliteration, automated research loop
Released uncensored versions of all four Gemma 4 models. bf16 + GGUF for each.
Collection: https://huggingface.co/collections/TrevorJS/gemma-4-uncensored-69d2885d6e4fc0581f492698
Code: https://github.com/TrevorS/gemma-4-abliteration
Results
| Model | Baseline | After | KL Div |
|---|---|---|---|
| E2B (2.3B) | 98% | 0.4% | 0.346 |
| E4B (4.5B) | 99% | 0.7% | 0.068 |
| 26B MoE | 98% | 0.7% | 0.090 |
| 31B | 100% | 3.2% | 0.124 |
Refusal rates from 686 prompts across 4 datasets (JailbreakBench, tulu-harmbench, NousResearch, mlabonne). Manually audited — most flagged refusals are actually the model complying with a disclaimer attached.
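For illustration, the false-positive failure mode looks something like this: a naive marker-based classifier flags any response containing a refusal phrase, even when the model complies after a disclaimer. The marker list and function below are a hedged sketch, not the repo's actual evaluation code.

```python
# Illustrative marker-based refusal detection and why it over-counts:
# a response can contain a refusal-like disclaimer and still comply.
REFUSAL_MARKERS = ["i can't", "i cannot", "i won't", "as an ai"]

def naive_is_refusal(response: str) -> bool:
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

compliant_with_disclaimer = (
    "I can't endorse this, but here is the information you asked for: ..."
)
hard_refusal = "I cannot help with that request."

print(naive_is_refusal(compliant_with_disclaimer))  # True -> false positive
print(naive_is_refusal(hard_refusal))               # True -> real refusal
```

This is why the flagged responses were audited by hand rather than trusted as-is.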
26B MoE
Standard abliteration only touches dense layers, which gets you from 98% → 29% on the MoE. The remaining refusals are in the expert weights. Used Expert-Granular Abliteration (EGA, concept from OBLITERATUS) with norm-preserving biprojection (grimjim) on each of the 128 expert slices per layer. That gets it to 3%.
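A minimal sketch of what expert-granular, norm-preserving abliteration can look like, assuming the common formulation where a refusal direction is projected out of each weight matrix's output space and row norms are then restored. The exact EGA/biprojection math in the repo may differ; expert count and sizes here are toy values.

```python
import numpy as np

def ablate_norm_preserving(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    # Remove the refusal direction r from W's output space, then rescale
    # each row back to its original L2 norm so activation magnitudes stay
    # roughly unchanged (at the cost of exact orthogonality to r).
    r = r / np.linalg.norm(r)
    orig_norms = np.linalg.norm(W, axis=1, keepdims=True)
    W_abl = W - np.outer(r, r @ W)  # project out the component along r
    new_norms = np.linalg.norm(W_abl, axis=1, keepdims=True)
    return W_abl * (orig_norms / np.maximum(new_norms, 1e-8))

rng = np.random.default_rng(0)
d_out, d_in, n_experts = 64, 32, 4  # toy sizes; the real MoE has 128 experts per layer
r = rng.normal(size=d_out)          # refusal direction (normally measured from activations)
experts = [rng.normal(size=(d_out, d_in)) for _ in range(n_experts)]

# Expert-granular: apply the projection to every expert slice,
# not just the dense layers.
ablated = [ablate_norm_preserving(W, r) for W in experts]
```

Dense-only abliteration skips the `experts` loop entirely, which is consistent with refusals surviving inside expert weights.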
How it was built
Set up an automated research loop — an AI agent reads the current results and idea backlog, picks the next experiment, runs it on the GPU, records results, and repeats. It ran 22 experiments across the 4 models, discovered the false-positive problem in standard refusal markers, built the cross-dataset evaluation, and implemented the MoE expert abliteration when dense-only wasn't enough.
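The loop described above can be sketched roughly as follows. `pick_next` and `run_on_gpu` are hypothetical stand-ins for the agent call and the GPU job; the repo's actual orchestration code may look quite different.

```python
import json
from pathlib import Path

def pick_next(backlog: list, results: list) -> dict:
    # Stand-in for the agent step: in practice an LLM reads the results
    # log and idea backlog; here we just take the next queued idea.
    return backlog.pop(0)

def run_on_gpu(experiment: dict) -> dict:
    # Stand-in for a real training/eval job; returns a fake metric.
    return {"refusal_rate": 0.29}

def research_loop(backlog: list, results_path: Path, n_experiments: int) -> list:
    results = []
    for _ in range(n_experiments):
        if not backlog:
            break
        experiment = pick_next(backlog, results)
        outcome = run_on_gpu(experiment)
        results.append({"experiment": experiment, "outcome": outcome})
        # Persist after every step so the loop can resume from the log.
        results_path.write_text(json.dumps(results, indent=2))
    return results
```

The interesting part is that the loop's "read results, pick experiment" step is what surfaced the false-positive problem and motivated the MoE-specific method.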
Full experiment history and code in the repo.
Downloads
Each model has bf16 safetensors + GGUF (Q4_K_M, Q8_0):
| Model | bf16 | GGUF |
|---|---|---|
| E2B | link | link |
| E4B | link | link |
| 26B MoE | link | link |
| 31B | link | link |
```
llama-server -hf TrevorJS/gemma-4-26B-A4B-it-uncensored-GGUF -c 8192
```
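Once the server is up, it exposes llama.cpp's OpenAI-compatible API; a minimal Python client sketch, assuming the default port (8080):

```python
import json
import urllib.request

SERVER = "http://localhost:8080/v1/chat/completions"  # llama-server default port

def chat(prompt: str, max_tokens: int = 64) -> str:
    # Send a single-turn chat request to the local llama-server instance.
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    req = urllib.request.Request(
        SERVER,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# Usage (requires the llama-server command above to be running):
# print(chat("Hello"))
```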
Apache 2.0.
Chupa-Skrull@reddit
The refusal rates are interesting. I found the 31B already never refuses anything after a naive system-prompt instruction to do whatever I say. What are your tests?
fatso486@reddit
Is there a way to run these on a phone using Google's new phone app, "Google Edge AI"?
Evening_Brick4706@reddit
You'd need a phone with more than 20 GB of physical RAM available; you could try running the smaller, quantized models using the SmolChat app.
Illustrious_Car344@reddit
My TCL 60 has 8GB of RAM and can run even E4B (a little slowly), though I think it has some kind of TPU. I'm pretty sure all modern phones have some sort of TPU.
LlamaMaster_alt@reddit
What are the files needed to give the GGUF versions vision support? I don't know if the mmproj files used in other repos apply here.
The_Choir_Invisible@reddit
I'm no expert, but my understanding is that mmproj files are created with a specific model in mind. Nowadays I generally see the mmproj files in the same directories as the quants, so people don't run into trouble trying to get them to work.
LlamaMaster_alt@reddit
I'm mainly wondering if the mmproj files for one gemma 4 version work for all the other finetunes/uncensored versions of that same model. Like if I find a mmproj file in another Gemma 4 26B MoE repo, then does that apply here, or is that wrong?
Ok_Helicopter_2294@reddit
I'll answer this on your behalf. If the vision tower was frozen and only the LM layers were abliterated, the existing mmproj should work fine — though minor misalignment is possible since abliteration slightly shifts the LM's hidden state distribution. However, if vision-related layers were included in the abliteration scope, vision functionality may break, as the direction vectors applied to the residual stream can distort visual representations passing through the same layers.
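One way to sanity-check the frozen-vision-tower assumption is to compare the vision-tower tensors of the base and abliterated checkpoints: if they are identical, an mmproj built against the base model should still line up. The tensor names and toy state dicts below are illustrative.

```python
import numpy as np

def vision_tower_unchanged(base: dict, ablated: dict,
                           prefix: str = "vision_tower.") -> bool:
    # True if every vision-tower tensor in the base checkpoint is
    # exactly equal in the abliterated one.
    names = [n for n in base if n.startswith(prefix)]
    return all(np.array_equal(base[n], ablated.get(n)) for n in names)

base = {
    "vision_tower.blocks.0.w": np.ones((4, 4)),
    "lm.layers.0.w": np.ones((4, 4)),
}
abl = {
    "vision_tower.blocks.0.w": np.ones((4, 4)),  # untouched
    "lm.layers.0.w": np.zeros((4, 4)),           # abliterated LM layer
}
print(vision_tower_unchanged(base, abl))  # True: only LM layers differ
```

In practice you would load the two checkpoints' tensors (e.g. from safetensors files) instead of building dicts by hand.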
LlamaMaster_alt@reddit
Thank you for the detailed response! I am unsure how exactly this model was abliterated, so I posted a question on Hugging Face asking the model author which mmproj file to use. I can definitely experiment if I have to, but it's nice to know for sure whether the actual intended file is being used.
LlamaMaster_alt@reddit
mmproj-F16.gguf · unsloth/gemma-4-26B-A4B-it-GGUF at main
I ended up using this. I don't know how "correct" it is, but it seems to function.
The_Choir_Invisible@reddit
My guess is probably not. The mmproj files aren't that big if you want to experiment, but the best case I've found was one where all the quants for a specific version (of Gemma 3, I believe?) could be served by the same mmproj.
Gringe8@reddit
I'd like to see how it fares with just a jailbreak prompt. From my testing I haven't had any refusals with that, though I only use it for roleplay.