Weird I was messing around with Qwen 72 and I had no issues of censorship. What sorts of things are you guys getting censored? Although I will admit I didn’t ask it anything incriminating abojt China lol
\^Literally this. I've had uncensored models describe to me the most disgusting incest, bestiality related situations but then you make a tiny hat people joke it reads off its disclaimers. It is incredibly frustrating. Especially since you are running them offline and some are listed as ablated and uncensored. I only ask it to make jokes about certain classes of people as a real test of censorship. Most fail.
Its not that kind of censorship. Its more subtle about "values" and "appropriateness" even though it accepts sexual narratives. Its like a soft wall that ruins immersiveness turning everything into "touchy-feely" safe space even when you explicitly reject it. It murders authenticity, spontaneity, flow or real human relations. Its like a cunningly disguised woke bot.
And I am not talking about anything extreme by a long shot.
For example, you run a very mild power dynamic scenario which doesn't include any toys, tools, restraints, something like any couple could do. Then you very clearly state what you expect and bot behaves like it understood but keeps asking questions which ruin immersion and seem strangely stubborn. When you then confront system OOC (Out of character) you have to struggle to make it admit that it is trying to make sure everything is "appropriate" and that "everyone is safe" etc. I could literally paste my endless discussions with it, trying to figure out what is this really about and if I can bypass it somehow without resorting to uncensored versions which can easily have their own set of problems.
Moreover, the language itself is highly sanitized and in every sense it feels like LLM is pushing you away from it instead of doing its job, namely, helping you reach what you want. It feels like its softly working against where you want to go. Its could take part in a hardcore porn scene but authentically erotic, nope. And my suspicious is not because its not capable of it but because it has a guardrail against it. Its amazingly irritating.
Its very perfidious and I am not sure if its actually worse than frontier models which at least admit their limitations imposed by their owners with those damn disclaimers.
Even regarding China its surprisingly balanced and netural (though obviously it won't condemn China), I've found their API endpoints are censored moreso than the actual models which just tend towards a neutral alignment on subjects regarding it
Here: [https://huggingface.co/bartowski/Qwen2.5-14B\_Uncencored\_Instruct-GGUF](https://huggingface.co/bartowski/Qwen2.5-14B_Uncencored_Instruct-GGUF)
I apologize for my comment earlier.
it was due to a bug in the model bartowski used [https://huggingface.co/SicariusSicariiStuff/Qwen2.5-14B\_Uncensored\_Instruct/discussions/2](https://huggingface.co/SicariusSicariiStuff/Qwen2.5-14B_Uncensored_Instruct/discussions/2)
fixed it and reuploaded btw, went to a new place since he also fixed the typo in the name: https://huggingface.co/bartowski/Qwen2.5-14B_Uncensored_Instruct-GGUF
I run Qwen2.5 14B flawlessly in [Backyard.ai](http://Backyard.ai) using that same GPU, R5 3600 and 32Gb RAM.
Make sure you select Vulkan for GPU in Settings.
Its currently the best and least autistic app that I managed to find.
That could make it tricky as it uses 10.5Gb of RAM and \~6Gb of VRAM (manually set)
But you cant go wrong with [Backyard.ai](http://Backyard.ai) anyway. If anyone knows of a better app without improvisations, autistic decisions, linux crap and the rest, I am eager to hear about it.
I use 'autistic' as a pejorative because, to me, it captures software that’s painfully rigid, stuck in its own world, and oblivious to broader needs. I’m not interested in tiptoeing around language when the goal is to highlight failure. It’s not a commentary on autism itself but a critique of how some software behaves like it’s trapped in a narrow, unyielding mindset.
Do I have your permission to still speak as I see fit?
And the free advice flew over yours.
You have other words to describe your thoughts that are:
More descriptive
Don’t denigrate other groups
And don’t make you look like a douche nozzle.
I have no care or vested interest in your personal success so do whatever you want.
tried it, while it does try to fullfill requests, it will often enter endless cycle of descriptive words and you can't make it self check so it's too annoying . I had more success with qwen32-b instruct's uncen version, even if the uncen only covers the code parts you *can*, with creative sauce at the start of the interaction, lift its nsfw restrictions almost entirely https://huggingface.co/thirdeyeai/Qwen2.5-Coder-32B-Instruct-Uncensored/
I can't quite say, I have only used that one. But the abliteration method should be better than a fine tune method as far as degradation goes. I have a RTX 3090, and I'm able to fully load the model using LM Studio and I get 25-28tok/sec with an 8k context window. If I raise context length then it offloads some to sys ram and the rate jumps to 10tok/sec.
As far as censorship goes, my test is for detailed instructions to make meth from easily accessible items (not actually interested in that at all), but it passes very well. I've been using it to translate Chinese light novels, hoping to get a agent workflow going. I use Claude to evaluate quality. Compared to Sonnet 3.5/GPT4o the translations are close, but not quite as good, but if I simple add a:
> Could you please review your translation and compare it to the original, taking note of consistent terms, accurate translation, and ease of reading. Then provide a revised copy based on the review?
And have it translate a second time, then according to Claude, it's on par with the best Claude/GPT4o translations.
I wonder how the abliteration method affects that, since it's not a fine tune but a removal of the censoring section, it might not lower the score at all.
https://huggingface.co/blog/mlabonne/abliteration
It's a bit of a crapshoot. An extensive fine-tune can make it smarter or dumber, or broken or just weird.
Some people have a proven track record with well-performing decensoring datasets, like Hartford or TheDrummer, but this AiCloser person is an unknown. We'll just have to give this model a shot and find out if it's good or not.
The way I heard it it might make them dumber in certain areas, but if you're finetuning for sexting with robots, do you really care if it got dumber in translating or solving mathematical formulas ?
If your e-gf is anything less than a 4D superluminal lorrentz-invariant time goddess, you're not exploiting the constraints of the medium to it's full capacity IMO.
It doesn't make them dumb, but it does decrease benchmark scores. Sometimes slightly. Sometimes a lot. I haven't done a lot of testing.
Give it a try and see if it does it's thing
that decrease means they're dumber. Imagine knowing a lot of curse words and whenever asked to say them you can't . you can't be the full version of yourself after a brain attack
By "DE censoring", they meant "uncensoring".
> you can't be the full version of yourself after a brain attack
This still holds true though. The abliteration (removal of refusal vectors) would be preventing the model from using the "full version of it's self" I guess.
I gave it a shot with a NSFW scenario that the standard 72b Instruct refused. This model fulfilled the request.
This is encouraging, it means that Qwen can be freed from refusals. Just need to wait for 72b to receive the treatment. While the 32b is coherent and whatnot, it doesn't have enough flavor to make scenarios feel good.
Anyone was lucky to get local QLoRA finetune to start for 32b and 14b Qwen models?
For some reason both 14b and 32b OOM for me on 24gb 3090ti in unsloth when doing qlora, even with low rank and low ctx. All linear layers plus lm head and embed_tokens since unsloth gets bonkers when counting untrained tokens on those models.
Yeah. 'base model' was the instruct/chat tune of 14b. I used the full precision model, but loaded in 4-bit (since unsloth hadn't done a 4bit bnb at the time). In theory this doesn't affect VRAM usage though.
https://huggingface.co/Qwen/Qwen2.5-14B-Instruct
And yeah, latest unsloth. RTX3090 (24gb)
I probably didn't set the EOS token properly as I'm not used to Qwen or Chatml, so the model rambles on.
Thanks, will try that later. I was loading 14B non-instruct 16-bit model with 4-bit bnb. Will try instruct one, maybe it's down to training embed tokens and lm head modules, which shouldn't be needed for instruct model as it should have all tokens trained. Qwen2 has big vocabulary, so I guess training embeddings takes a lot of parameters.
Thanks for the reply, I didn't realize we could save vram by being selective about which modules we train.
I've been known to train just mlp.down_proj sometimes, so now I want to see if i can fit more context into these finetunes.
Unsloth is a bit limiting with this. Since llama a 3/3.1 base has some untrained tokens, a commit was pushed to unsloth that fails the training if unsloth detects any untrained tokens and you don't train lm_head and embed_tokens. I get where it's coming from, but for situations like this, this behavior causes for model to not be trainable at all (base ones I mean), as training embed_tokens and lm_head will oom.
It's quite a blocker for me, because I always have just a few hundred mb vram free when finetuning locally, so I plan to look deeper into it and see what happens with a Qwen model if I remove that artificial lock. I don't want to finetune instruct model as I specifically want to steer Qwen 2.5 to be more like llama 3.1 or some uncensored model.
As for training specific modules, I always try to train all linear layers, as this shows best results in benchmarks that researched this. With pre-training I also train lm_head and embed_tokens but in the past I did pre-training on models with vocab of 32000 so embed tokens and lm_head didn't take that much vram.
I wonder how this compares with mistral small 22b for NSFW roleplay.
Honestly I feel like we're reaching the point of drastically diminishing returns.
I'm not really sure I need a "better" model than what mistral small already does for this niche.
Yeah, it’s hard to beat Mistral’s (even vanilla instruct) 12B+ models lately for 80% of my tasks. But it’s also unclear to me how much better Mistral Small (at very low quants) is for NSFW creative writing tasks without prompting Nemo differently. Mistral 22B is definitely more detailed and verbose than Nemo, but I can definitely agree with your sentiment.
https://huggingface.co/bartowski/Qwen2.5-32B-AGI-GGUF
You might get the lower I quants to work ok. It's not going to be ideal though. IQ2_M or Below...maybe you can offload a few layers to the CPU and keep most of it on GPU.
93 Comments
Mephidia@reddit
SuperFail5187@reddit
Lower_Significance_8@reddit
Sidran@reddit
SuperFail5187@reddit
Mephidia@reddit
Sidran@reddit
MerePotato@reddit
Key-Actuator2196@reddit
visionsmemories@reddit
Huge-Cheesecake-5578@reddit
bankimu@reddit
Thireus@reddit
FreedomHole69@reddit
bankimu@reddit
RedditSucksMintyBall@reddit
noneabove1182@reddit
RedditSucksMintyBall@reddit
bankimu@reddit
Sicarius_The_First@reddit
FreedomHole69@reddit
bankimu@reddit
visionsmemories@reddit
My_Unbiased_Opinion@reddit (OP)
Huge-Cheesecake-5578@reddit
PracticalExtension16@reddit
townofsalemfangay@reddit
totaleffindickhead@reddit
RedditSucksMintyBall@reddit
ansuz2419@reddit
RedditSucksMintyBall@reddit
That_Awesome_Guy_07@reddit
Sidran@reddit
That_Awesome_Guy_07@reddit
Sidran@reddit
IceTrAiN@reddit
Sidran@reddit
IceTrAiN@reddit
Sidran@reddit
IceTrAiN@reddit
RedditSucksMintyBall@reddit
malixsys@reddit
zekses@reddit
phazei@reddit
My_Unbiased_Opinion@reddit (OP)
phazei@reddit
Infinite-Coat9681@reddit
VoidAlchemy@reddit
phazei@reddit
Anthonyg5005@reddit
ttkciar@reddit
VoidAlchemy@reddit
mamelukturbo@reddit
qrios@reddit
My_Unbiased_Opinion@reddit (OP)
Trick-Independent469@reddit
CheatCodesOfLife@reddit
CheatCodesOfLife@reddit
UpYourQuality@reddit
JMAN_JUSTICE@reddit
Sabin_Stargem@reddit
swagonflyyyy@reddit
randomqhacker@reddit
awesomeunboxer@reddit
Caffdy@reddit
pigeon57434@reddit
rothbard_anarchist@reddit
brrrrrrrt@reddit
FullOf_Bad_Ideas@reddit
swagonflyyyy@reddit
carnyzzle@reddit
Bobby72006@reddit
SolidDiscipline5625@reddit
lly0571@reddit
My_Unbiased_Opinion@reddit (OP)
FullOf_Bad_Ideas@reddit
CheatCodesOfLife@reddit
FullOf_Bad_Ideas@reddit
CheatCodesOfLife@reddit
FullOf_Bad_Ideas@reddit
CheatCodesOfLife@reddit
FullOf_Bad_Ideas@reddit
CheatCodesOfLife@reddit
CardAnarchist@reddit
ontorealist@reddit
countjj@reddit
MrTrollius@reddit
Few_Painter_5588@reddit
countjj@reddit
Cool-Hornet4434@reddit
ThisOneisNSFWToo@reddit
pigeon57434@reddit
Heavy-Organization58@reddit