New AI Dungeon Models: Wayfarer 2 12B & Nova 70B

Posted by NottKolby@reddit | LocalLLaMA | View on Reddit | 34 comments

Today AI Dungeon open sourced two new SOTA narrative roleplay models!

Wayfarer 2 12B

Wayfarer 2 further refines the formula that made the original Wayfarer so popular, slowing the pacing, increasing the length and detail of responses and making death a distinct possibility for all characters—not just the user.

Nova 70B

Built on Llama 70B and trained with the same techniques that made Muse good at stories about relationships and character development, Nova brings the greater reasoning abilities of a larger model to understanding the nuance that makes characters feel real and stories come to life. Whether you're roleplaying cloak-and-dagger intrigue, personal drama or an epic quest, Nova is designed to keep characters consistent across extended contexts while delivering the nuanced character work that defines compelling stories.

[-]

Shockbum@reddit

Awesome, thank.
I'm new to this local roleplay thing—what's the basic system prompt to test this model in LM Studio?

[-]

NottKolby@reddit (OP)

A good one for these models is
```
You're a dungeon master and storyteller that provides any kind of game, roleplaying and story content.

Instructions:

- Be specific, literal, concrete, creative, grounded and clear
- Continue the text where it ends without repeating
- Avoid reusing themes, sentences, dialog or descriptions
- Continue unfinished sentences
- > means an action attempt; it is forbidden to output >
- Show realistic consequences
```

[-]

SportEffective7350@reddit

I was looking for a system prompt for this kinda thing, so let me join in in the thanks.

Now I just have to figure out something similar to what AIDungeon had for scifi prompts. Everything I find is too space-opera-ish but AIDungeon had a nice sort of cyberpunk-ish urban setting which I enjoy more.

That aside! I hope you guys regain momentum. I miss when some youtubers would play AIdungeon and share the madness with the audience and I hope things can return to that someday.

[-]

PikachuDash@reddit

Could you maybe share the reason why AI Dungeon sticks with Mistral Nemo? It's quite an old model after all. Is Mistral Small not better as a base for finetuning?

[-]

SportEffective7350@reddit

Nemo is surprisingly capable for its size category, I'm not surprised! You can squeeze a lot of juice from what's basically a 12B 14-months-old (so...ancient in AI years) model.

Was really surprised to see Wayfarer 2 being something I can actually run. Allow me to join you in thanking them.

[-]

NottKolby@reddit (OP)

We're constantly looking for new models, but Nemo continues to crush it for finetunes. Note that there are occasional usage policy details that restrict us from finetuning every model.

[-]

LinkSea8324@reddit

Well AI Dungeon, that was unexpected.

[-]

nnxnnx@reddit

nice work. wen gguf?

[-]

0r3ta@reddit

As the models move forward, can the Developers train the model to work with third person POVs? I feel that my scenarios feel more refined and each have their own character when they're in third person, rather than always being in second. Deepseek excels in that, but I and many others would love if future models also worked well in 3rd person POV. Thanks!

[-]

elite5472@reddit

Instruction adherence is pretty bad. Granted, WF2 is a 12b model, but for something that's meant for roleplaying, 5k tokens in instructions + lore shouldn't be that much of an ask.

I'll try Nova next, but like most of these L3 finetunes, I'm not expecting much.

[-]

NottKolby@reddit (OP)

Hopefully Nova is better, but you are correct. Our 12B finetunes are not the best at instruction following. They are especially optimized for the format and content of AI Dungeon.

[-]

elite5472@reddit

It's frustrating because these and other fine tunes I've tried have excellent writing.

Have you guys thought about switching over to GLM Air or GPT OSS as a base in the future?

[-]

NottKolby@reddit (OP)

Both good suggestions! They are on my list of models to investigate.

[-]

toothpastespiders@reddit

If you're taking suggestions, I haven't had a chance to really do any hard testing of Seed OSS 36B but I feel like it's been one of those things that wound up released at an unfortunate time and wound up forgotten amid larger brand's releases. It's been surprisingly strong for me from just playing around with it a bit and I've heard similar from others.

[-]

Mirrowel@reddit

GPT OSS is terrible, my god.
Refusal galore.

[-]

elite5472@reddit

That's what finetunes are fore.

[-]

someguy@reddit

5k tokens in instructions + lore

500 tokens is what you wanna use with tiny models.

[-]

bralynn2222@reddit

great work!

[-]

Inevitable_Ad3676@reddit

It's been a while since I've been on AI Dungeon, first ever thing that got me to AI chatbots seriously, and I've stopped visiting/using because of that one major controversy that I don't even know what it's about now. Is it pretty good now?

[-]

NottKolby@reddit (OP)

The team has done a great job building a trusting and transparent relationship with players. The community is indeed much healthier.

[-]

guiopen@reddit

Has you guys tried mistral small 3.2 base model? Very good instruction following and long context memory,

[-]

Mirrowel@reddit

It is already available in AI Dungeon. Their finetune is harbinger

[-]

Awwtifishal@reddit

Awesome!

Have you considered fine tuning GLM-4.5-Air (109B)? It's bigger than llama 70B, however it runs at a decent speed in my potato (after taking my whole RAM and VRAM), much faster than even 32B dense models, and with decent quality even at Q2_K_XL

[-]

NottKolby@reddit (OP)

It's in my queue of models to eval. Thanks for the tip!

[-]

eggs-benedryl@reddit

Are these meant to use the Ai dungeon format? See, Do, Say, etc.

[-]

NottKolby@reddit (OP)

Yes they are trained to use the input format "> You...". Also, past user inputs and AI responses are broken up into multiple user messages.

[-]

eggs-benedryl@reddit

Ah that's cool. I like that approach. Since they don't have an option to load your own model (that I'm aware of) releasing their finetunes is nice.

[-]

NottKolby@reddit (OP)

Thanks! I'm actually head of AI at AI Dungeon, although another team member created these finetunes. We hope that open sourcing these models will have a positive impact on the broader LLM community.

[-]

eggs-benedryl@reddit

Neat yea, AID is how I got in to all of this in the first place. I ended up dropping off after the big ui reboot years ago. I had hundreds of scenarios that were played often. It was fun cranking those out and sharing them. Fun creative writing exercise.

I'll have to book mark them. I've tried kobold on and off but no other ui ever scratched the same itch like the OG AID.

[-]

NottKolby@reddit (OP)

We have big plans so stay tuned!

[-]

toothpastespiders@reddit

and making death a distinct possibility for all characters—not just the user

Nice. One of my big dream projects is to make a solid murder mystery game. The ever increasing positivity bias in local models is a worrisome trend. Prompting can only go so far in getting past that.

[-]

NottKolby@reddit (OP)

We're constantly on the lookout for good models that have not optimized out all the fun!

[-]

jacek2023@reddit

That's a great news guys, first Wayfarer was nice, will check new ones, thanks for sharing!

[-]

NottKolby@reddit (OP)

The new version should be much the same with better consistency and reduced cliches do to improved datasets. Let us know what you think!