New Nemo finetune: Impish_Nemo
Posted by Sicarius_The_First@reddit | LocalLLaMA | 52 comments
Hi all,
A new creative model with some sass, trained on a very large dataset; super fun for adventure & creative writing, while also being a strong assistant.
Here's the TL;DR, for details check the model card:
- My best model yet! Lots of sovl!
- Smart, sassy, creative, and unhinged — without the brain damage.
- Bulletproof temperature: it can take much higher temperatures than vanilla Nemo.
- Feels close to old CAI, as the characters are very present and responsive.
- Incredibly powerful roleplay & adventure model for the size.
- Does adventure insanely well for its size!
- Characters have massively upgraded agency!
- Trained on over 1B tokens, carefully preserving intelligence, and even upgrading it in some aspects.
- Based on a lot of the data in Impish_Magic_24B and Impish_LLAMA_4B + some upgrades.
- Excellent assistant — so many new assistant capabilities I won’t even bother listing them here, just try it.
- Less positivity bias; all lessons from the successful Negative_LLAMA_70B style of data learned & integrated, with serious upgrades added, and it shows!
- Trained on an extended 4chan dataset to add humanity.
- Dynamic length response (1–3 paragraphs, usually 1–2). Length is adjustable via 1–3 examples in the dialogue. No more rigid short-bias!
julieroseoff@reddit
Hi there, sorry, I'm just a beginner, but is it possible to run it with vLLM?
Sicarius_The_First@reddit (OP)
Of course. It's a Nemo-based finetune; Nemo got support in vLLM about a year ago.
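Since vLLM supports Nemo out of the box, serving this finetune should look like serving any other HF model. A minimal sketch, assuming a recent vLLM install and enough VRAM for a 12B model; the repo ID is the one on Hugging Face, and 16384 matches the 16k context the author reports testing:

```shell
# Minimal sketch: launch vLLM's OpenAI-compatible server for the model.
# Assumes a recent vLLM and a GPU with enough VRAM for a 12B model.
vllm serve SicariusSicariiStuff/Impish_Nemo_12B \
    --max-model-len 16384 \
    --gpu-memory-utilization 0.90

# Then query it like any OpenAI-compatible endpoint:
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "SicariusSicariiStuff/Impish_Nemo_12B",
         "messages": [{"role": "user", "content": "Hello"}]}'
```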
Skye_sys@reddit
I was waiting for something like this! Great work! Does it support tool calling tho?
Sicarius_The_First@reddit (OP)
Thank you :)
Tool calling was part of the dataset, but wasn't tested, please let us know your results!
(note: the dataset for tool calling was small and untested, it might work, it might not...)
Xamanthas@reddit
Sicarius_The_First@reddit (OP)
Yes, very hard to do. This type of data usually lobotomizes models.
toothpastespiders@reddit
Out of curiosity, how did you handle grammar with the 4chan posts? Did you leave it as is or have a LLM rewrite it?
Sicarius_The_First@reddit (OP)
You can see the 4chan dataset, UBW_Tapestries.
toothpastespiders@reddit
Nice, thanks! I started adding 4chan scrapes to my dataset a while back but hadn't included it in the training yet. I figured that with proper attribution, it wouldn't taint the grammar unless that was the desired intent, but having evidence is great. My 4chan data's set up in a pretty analogous way to tapestries.
Sicarius_The_First@reddit (OP)
feel free to use tapestries :)
toothpastespiders@reddit
I'm so used to people not sharing datasets that I'm getting a bad habit of not even looking. I just looked at yours, and thanks in turn for making so much of your work there available!
HilLiedTroopsDied@reddit
Training on reddit is just as cringe.
Spectrum1523@reddit
no I mean
I know I'm a redditor but 4chan is definitely worse lol
Xamanthas@reddit
I made no such suggestions that it should be trained on reddit either.
Sicarius_The_First@reddit (OP)
It wasn't trained on reddit, and yeah, I agree.
While it has 4chan data, it's a tiny part of the total dataset.
Zestyclose_Yak_3174@reddit
Thanks for all the work you do. I especially love your bigger model experiments, but I am looking forward to trying this one.
Sicarius_The_First@reddit (OP)
Thank you so much, I appreciate your kind words :)
70B with this amount of data is unlikely, but 24B–32B is something I'm thinking about. Something worked really well in this tune, and I'm indeed curious to see whether the same data would elevate a larger model in a similar manner.
Paradigmind@reddit
I would be very interested in the 24-32B range. Maybe a 27B with vision capabilities?
Sicarius_The_First@reddit (OP)
vision is a huge pain. there's a reason why there are only two models with uncensored vision ever made. one of them is mine :P
Sicarius_The_First@reddit (OP)
other than vision, which is a pain, 24b-32b is a very good size in terms of smartness & accessibility. i'm thinking about it.
Paradigmind@reddit
Okay thanks for considering it! (Which of your models has vision?)
Sicarius_The_First@reddit (OP)
https://huggingface.co/SicariusSicariiStuff/X-Ray_Alpha
Paradigmind@reddit
Thank you.
LicensedTerrapin@reddit
I know MoE is a pain but have you thought about giving them a try?
Sicarius_The_First@reddit (OP)
some GGUF quants are already up (Q6 & Q8), the rest being uploaded :)
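For anyone grabbing those quants, running one with llama.cpp is a one-liner. A minimal sketch; the exact .gguf filename is a guess based on the usual naming convention, so check the repo's file list for the real name:

```shell
# Minimal sketch: interactive chat with the Q6_K quant via llama.cpp.
# The .gguf filename below is an assumption; verify it against the repo.
./llama-cli \
    -m Impish_Nemo_12B.Q6_K.gguf \
    -c 16384 \
    --temp 1.0 \
    -cnv
```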
Paradigmind@reddit
Do you have a link please? I couldn't find them.
Kindly-Annual-5504@reddit
Thank you for your work! It’s great to see that someone is still investing time in fine-tuning Nemo. Unfortunately, it has to be said that, to this day, there is no worthy successor in this area. I started with Nemo back then and was immediately in love. Nemo is something special… Not perfect, but still unique in its own way to this day.
I’ve now tested Impish Nemo and so far, I’m really impressed. I immediately feel “at home” again. Over the past year, I’ve tested countless fine-tunes of Nemo. Very few have managed to convince me in the long run. I’m very curious to see how yours performs here.
Thank you again for taking the time and giving Nemo (and also the people with 12GB or less VRAM) some love!
Sicarius_The_First@reddit (OP)
Thank you, I'm glad to hear it, and I feel the same :)
Nemo is indeed a very pleasant model in general, I fully agree (a stark contrast with the latest OpenAI model and its harsh moralizing, etc.).
Indeed, I feel that everyone should have (free) access to AI, locally, because one shouldn't be dependent on the 'good graces' of closed model providers that can change policy at a moment's notice, or on current AI laws and where they're headed.
jekle@reddit
This model is insanely good!!
Sicarius_The_First@reddit (OP)
Thank you so much :)
What did you like about it? Any specific things it does especially well?
jekle@reddit
I only tested it for creative writing and was quite impressed!
Sicarius_The_First@reddit (OP)
Glad to hear! It was trained on a huge books & light novels corpus, happy to hear it made a difference :)
Cool-Chemical-5629@reddit
Thanks for the link to the koboldai. I lost it a long time ago. I mean... the link... 😂
Sicarius_The_First@reddit (OP)
hehe bookmark it :P
Dumbledore_Bot@reddit
How much context does this model support?
Sicarius_The_First@reddit (OP)
tested up to 16k, can probably do around 20k.
Echo9Zulu-@reddit
I have enjoyed your models very much! Thanks for your work. Will be cooking up some OpenVINO quants this afternoon
Sicarius_The_First@reddit (OP)
Awesome, glad to hear it, and thank you :)
HRudy94@reddit
Nice, this sounds interesting for sure, how would you describe the context length overall?
And how would you say it compares to say Starcannon Unleashed or Dolphin 2.9.3 Mistral Nemo?
Sicarius_The_First@reddit (OP)
Context length is good to at least 16k which I've tested, haven't tested beyond that.
jacek2023@reddit
I would love to see your "negative llama" finetune of some new model
Sicarius_The_First@reddit (OP)
This model inherited a lot from it :)
RedditDiedLongAgo@reddit
I'm confused. How would you say this works for characters not baked in?
Feels kinda like training on the benchmarks if you are training on known characters. Even though they do seem fun. xD
Sicarius_The_First@reddit (OP)
The model generalizes :)
Mickenfox@reddit
It's disappointing how much is left on the table in regards to LLMs. The only "personalities" we get are cheerful assistant and programmer. There's so little serious effort to produce a creative model.
Sicarius_The_First@reddit (OP)
Agreed, I try to help fill the gap; we definitely do not need more math assistants :)
AppearanceHeavy6724@reddit
Example output compared to stock Nemo? It's not like everyone has terabit broadband and can download, test, and delete. I, for example, live in a developing country; my actual speed is 10 Mbit/s, although they promise much more.
Sicarius_The_First@reddit (OP)
There you go:
https://huggingface.co/SicariusSicariiStuff/Impish_Nemo_12B#roleplay-examples-calanthe-is-available-here-and-alexis-is-available-here
wooden-guy@reddit
Definitely gonna give it a spin, when mradmer does their thing first.
RedditDiedLongAgo@reddit
"Smart, sassy, creative, and unhinged"
sooo, more of the punchy quips and same jokes we get from every LLM?
datbackup@reddit
I am a fan of your work and it’s exciting to see such a long list of improvements. I will definitely take this one out for a spin. Thank you!
Sicarius_The_First@reddit (OP)
thank you so much for the kind words :)
i appreciate it!
if i had to summarize all the improvements in 1 phrase, i'd say this: it's super fun and spicy. doesn't feel like a typical RP tune.