Gryphe/Pantheon-Reasoning-27B · Hugging Face

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 18 comments

from Gryphe:

An experiment in bringing reasoning capability to the Pantheon roleplay series in the form of an uncensored dense Qwen 3.6 27B. This specific model can be thought of as a successor to both the Pantheon series and the one-time Codex release since I used such a large variety of data this time around.

Yet another theory being tested this time around: take the data that Pantheon is built on, pair it with full thinking traces, and let the model reason its way through character work — weighing tone, planning narrative beats, considering how a character would actually respond before committing to a line. Whether that meaningfully improves roleplay quality over a non-reasoning model is a question you'll hopefully be able to help me answer.

GGUF quants are available here.

Model details

Base model is llmfan46/Qwen3.6-27B-uncensored-heretic-v2-Native-MTP-Preserved, and from what I can tell this worked out very, very nicely in regards to refusal reduction and writing capabilities.

I considered Gemma 4 31B but that model has been an absolute pain to train. Something something special snowflake architectures. (grumble, grumble)

All training sources include full reasoning traces, with thinking active across every assistant turn:

Pantheon data (\~28%) - the core Pantheon roleplay corpus with reasoning traces back-generated using the method described below
Opus-4.6-Reasoning-24k (\~21%) - a cleaned and deduplicated aggregation of Claude Opus 4.6 reasoning traces covering general instruction-following, STEM, and coding; provides the broad reasoning backbone
WorldSim data (\~16%) - long-form Opus 4.6 narrative roleplay with native reasoning traces, focusing on extended storytelling, character immersion, and emergent world logic, cobbled together through various experiments - mainly third person present tense but has a bit of everything + cliché cleaned, of course!
Text adventure data (\~16%) - high stakes interactive fiction and text adventure content with reasoning back-generated, lending the model a more grounded, prose-forward writing style
General roleplay data (\~16%) - a broad collection of highly varied roleplay transcripts with reasoning back-generated, helping the model generalise well to arbitrary character setups
Tiamat data (\~3%) - character and roleplay dataset originally built for Tiamat-24B-Magistral, featuring a multi-step generation/extension/improvement pipeline with critic-improver rewrites to reduce AI clichés, with reasoning back-generated for each exchange

The model was trained with preserve_thinking: true, so thinking tags remain active across all assistant turns in multi-turn conversations, not just the first.

[-]

SurpriseOk6927@reddit

reasoning traces for roleplay is a smart approach. long sessions break because models dont track character motivation across turns. curious if the thinking overhead adds noticeable latency

damn thats disappointing. the idea was promising but if it breaks after 5k tokens its useless for long rp sessions. appreciate the honest review saved me the download

inddiepack@reddit

I use the non-hereticised quant of 27B by unsloth(MTP), and it does track the reasoning traces, if you have a system prompt for it.

I have tried this fine tune the OP has posted and, unfortunately, it's very bad. It loses the structure and vision of the system prompt very fast (within 5k tokens), and its feature of being able to track reasoning trances, it's actually broken in comparison with the base unsloth model, as I check the thinking tokens and it's not rigorously filtering a character's answer through its own personality matrix and the conversation so far. Once you pass 10-15k tokens, the reasoning starts being shorter and worse as well.

Although this one is unusable for me, massive appreciation for the Gryphe's work and effort.

damn thats disappointing was hoping itd hold context better have you tried the unsloth base model without the fine tune curious if the falloff is inherent or just this version

jacek2023@reddit (OP)

I didn’t know you had an account here 😄

Kahvana@reddit

Looks really nice! You might want post this on r/SillyTavernAI also.

Gryphe/Pantheon-Reasoning-27B · Hugging Face

Model details

SurpriseOk6927@reddit

SurpriseOk6927@reddit

inddiepack@reddit

SurpriseOk6927@reddit

korino11@reddit

TheRealMasonMac@reddit

korino11@reddit

ComplexType568@reddit

pinkyellowneon@reddit

Iwaku_Real@reddit

PunnyPandora@reddit

LLMFan46@reddit

IrisColt@reddit

Opening-Ad6258@reddit

LLMFan46@reddit

Gryphe@reddit

jacek2023@reddit (OP)

Kahvana@reddit