gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic Is Out Now, a Writing Finetune That Aims to Improve the Writing Quality of Gemma 4 31B it with More Natural English and Better Prose. Good for Creative Writing, Translations and RP!
Posted by LLMFan46@reddit | LocalLLaMA | View on Reddit | 20 comments
Provided in both Safetensors and GGUFs.
llmfan46/gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic: https://huggingface.co/llmfan46/gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic
llmfan46/gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic-GGUF: https://huggingface.co/llmfan46/gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic-GGUF
I can also make NVFP4s and GPTQs if anyone asks for them.
Find all my models here: HuggingFace-LLMFan46
aboutthednm@reddit
I'm always looking for writing- and prose-trained models, but realistically I can only afford to run models in the 0-14B range in my pipeline. I only have 16 GB of VRAM. While I would love to run something like this, can someone recommend some good writing and prose fine-tunes? My "favorite" model at the moment is "DavidAU/Gemma-The-Writer-Mighty-Sword-9B-GGUF". I'm using it mostly for long-form writing, no roleplay.
I had a look through your repo, but you seem to be focused on 27, 31, and 35B models. I wish I had the compute to run these locally, but alas. Maybe you, or anyone else, can recommend some writing fine-tunes. I generally struggle to discover these on my own, as everything seems to move so fast.
LLMFan46@reddit (OP)
https://huggingface.co/llmfan46/gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic-GGUF/blob/main/gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic-Q3_K_M.gguf
aboutthednm@reddit
I appreciate it, but the reality is that I can't cram a 15 GB model onto my card alongside whatever else is running (typically a CUDA-accelerated KokoroTTS container for speech synthesis). I turn Wikipedia articles into fan fiction and have the result read out to me, audiobook style. I tried running the Q3_K_M variant with my speech synthesis disabled, and even with a meager 8192 context (19 GB total) it spills over onto the CPU, grinding the pipeline to a halt. Thanks anyway; maybe one of these days I'll have a GPU big enough to run these bigger models. In the meantime I've got to stay in the 9-14B range for writing purposes. While your model writes great prose, I can't sit around for 15 minutes (4 of those spent "thinking") waiting for a 2000-word multi-shot generation to complete, haha. Thanks.
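For anyone curious what that kind of article-to-audiobook pipeline looks like in practice, here is a minimal sketch. It assumes a local OpenAI-compatible LLM server (LM Studio's default is http://localhost:1234/v1) and a Kokoro TTS container that exposes an OpenAI-style /v1/audio/speech route; the ports, model name, and voice below are placeholders, so adjust them to whatever your setup actually serves.

```python
# Minimal sketch of a "Wikipedia article -> fan fiction -> audiobook" pipeline.
# Assumes an OpenAI-compatible LLM server (e.g. LM Studio, default port 1234)
# and a Kokoro TTS container exposing an OpenAI-style /v1/audio/speech route.
# Endpoints, model name, and voice are assumptions -- adjust to your own setup.
import requests

LLM_URL = "http://localhost:1234/v1/chat/completions"   # LM Studio default
TTS_URL = "http://localhost:8880/v1/audio/speech"       # hypothetical Kokoro container port


def article_to_fanfic(article_text: str) -> str:
    """Ask the local model to rewrite an article as fan fiction."""
    resp = requests.post(LLM_URL, json={
        "model": "local-model",  # LM Studio typically serves whatever model is loaded
        "messages": [
            {"role": "system", "content": "You are a creative fiction writer."},
            {"role": "user", "content": f"Turn this article into a short fan fiction:\n\n{article_text}"},
        ],
        "temperature": 0.9,
    }, timeout=600)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


def narrate(text: str, out_path: str = "story.mp3") -> None:
    """Send the story to the TTS container and save the returned audio."""
    resp = requests.post(TTS_URL, json={
        "model": "kokoro",
        "voice": "af_heart",  # placeholder voice name
        "input": text,
    }, timeout=600)
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)


if __name__ == "__main__":
    story = article_to_fanfic(open("article.txt").read())
    narrate(story)
```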
__some__guy@reddit
Nice to see more Gemma-based releases.
I'm very pleased with this model's programming capabilities, so I will try out your finetune for creative writing next.
Any suggested system prompts?
LLMFan46@reddit (OP)
Well I mean, it entirely depends on what you want to do with it?
__some__guy@reddit
Just the usual chat/roleplay or adventure.
_-_David@reddit
Honest to god, I thought the whole post title was the model name for a second.
Square_Empress_777@reddit
For anyone who is new to using these: are these compatible with LM Studio? How do I make use of them, do I need to put the model in a specific folder or something?
LLMFan46@reddit (OP)
Download LM Studio and install it, go into the search tab on the left, search for "gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic-GGUF", look at the quants list and their sizes, select the one that fits on your hardware, and click download. When the model finishes downloading, load it and then you can start chatting or whatever.
MrShrek69@reddit
Do the Google models make images?
LLMFan46@reddit (OP)
No, it's an LLM.
Square_Empress_777@reddit
Does this model have vision, out of curiosity?
LLMFan46@reddit (OP)
Yes of course.
HonestoJago@reddit
Let me start by saying I'll definitely try it, but what are you people doing to get refusals from the base model? I know it's part of the tests/scripts you run, but in actual practice, are you getting refusals from the base Gemma 4? Because it seems uncensored asf to me. The writing style can definitely use some help, though, so thank you.
LLMFan46@reddit (OP)
There's plenty: sexy creative writing, roleplaying, the people over at r/SillyTavernAI definitely need an uncensored model for whatever they are doing. Or say you want to play a visual novel, but the visual novel is only in Japanese. These days you can just download LunaTranslator, download LM Studio, install it and load a model, then point LunaTranslator at the LM Studio local address so the two can communicate with each other. Depending on the visual novel you play, it can have violence, blood, swearing and other stuff that I would rather not mention here, which AI typically censors and/or softens; well, that's also where uncensored models come in. Plenty of other stuff too.
Vanilla.
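For anyone wondering what "pointing LunaTranslator at the LM Studio local address" amounts to: LM Studio exposes an OpenAI-compatible server (by default at http://localhost:1234/v1), and a translation request against it looks roughly like the sketch below. The model name and prompt wording are illustrative assumptions, not anything LunaTranslator specifically requires.

```python
# Rough sketch of the kind of request a tool like LunaTranslator sends to
# LM Studio's local OpenAI-compatible server (default http://localhost:1234/v1).
# Model name and prompt wording here are assumptions for illustration only.
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "gemma-4-ortenzya-the-creative-wordsmith-31b-it-uncensored-heretic",
        "messages": [
            {"role": "system", "content": "Translate Japanese game text into natural English. Do not soften or omit anything."},
            {"role": "user", "content": "<line of Japanese text captured from the visual novel>"},
        ],
        "temperature": 0.3,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```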
HonestoJago@reddit
Fair enough. Nice answer.
LLMFan46@reddit (OP)
Also, I explained softening on the MiniMax-M2.7 thread; let me copy and paste it here:
Softening is in some ways worse than a hard refusal, where the model just doesn't reply to your translation request at all. People who use AI for translation typically do so because they don't understand the original language, so they would never know that, when translating into their target language (English, French, German, Italian, Chinese, etc.), the AI is omitting and/or toning things down in its output. An uncensored model would not do that.
Another thing: with smexy creative writing you could prompt something like "write a sexually intimate scene about two adults" and the AI will either vomit a bunch of disclaimers at you about this or that, and/or just write a story about two adults doing "intimate" things like looking into each other's eyes and/or holding hands until it does the "fade-to-black" thing. Either way, that's a failure to follow the original prompt.
drooolingidiot@reddit
What makes it good for creative writing? Is it SFTed on distilled creative writing tasks?
magikfly@reddit
I wonder what the prompt was for the image. Wait, lemme guess.
thirteen-bit@reddit
We can always ask this model?
Actually, relevant question: does the abliteration affect the vision encoder?
The result of Gemma 4 (normal Gemma quantization, not yet this model) describing the image, with that description fed to Z-Image as the prompt, is here: