Can 4chan data REALLY improve a model? TURNS OUT IT CAN!

Posted by Sicarius_The_First@reddit | LocalLLaMA

Hear me out, no one (really) knows how these things work.

A few days ago, I released [Assistant\_Pepe\_8B](https://huggingface.co/SicariusSicariiStuff/Assistant_Pepe_8B), you can read the discussion in [this thread](https://www.reddit.com/r/LocalLLaMA/comments/1qppjo4/assistant_pepe_8b_1m_context_zero_slop/).

I trained it on an extended **4chan dataset**, on an abliterated base, but what I didn't expect was to get this:

https://preview.redd.it/lrqwx8ca1ugg1.png?width=2333&format=png&auto=webp&s=4dcfcfb9c107fa3d417e5ff623c4952e5e2ab457

https://preview.redd.it/a3bby1yd1ugg1.png?width=2980&format=png&auto=webp&s=8f050bbd512a12a359626af79ccebcd2d2445877

Somehow, **against all common sense**, the model **outperformed** NVIDIA's Nemotron, the base it was trained on. This is usually the other way around. You take a smart base, tune a model on it, and accept the sacrifice of some intelligence to give it flavor.

At first I thought "OK nice, a coincidence, who cares?"

But then I looked more closely at the scores:

1) The abliterated base **scored higher** than the original base.
2) The finetune scored even **higher than both**.
3) The finetune was literally trained on an extremely noisy 4chan dataset, it should have eaten glue.

And then I remembered something: the original, gpt4chan (by Yannic Kilcher), scored especially high in truthfulness (that was b4 benchmaxxing).

So I took a closer look at recent models I released; the abliterated Impish\_LLAMA\_4B not only outperformed the base tune (the unabliterated one), it also changed its political alignment (you can check the UGI stats for yourself, I feel like I've spammed enough images).

People were initially joking about the "alignment tax", but I think there's non-trivial substance in all of this. It seems to me to be more than just margin of error or statistical noise.

Oh, and the KL divergence for Impish\_LLAMA\_4B was: <0.01
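
For anyone unfamiliar with the KL divergence figure: it measures how far one model's next-token distribution drifts from another's, so a value under 0.01 means the two models predict nearly the same tokens. Below is a minimal sketch of the calculation on a toy 4-token vocabulary. The logits are made up for illustration, the `KL(base || tuned)` direction is an assumption (the post does not say which direction was used), and a real measurement would average this over many token positions from actual model outputs.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

# Hypothetical next-token logits at a single position (not real model outputs).
base_logits = [2.0, 1.0, 0.5, -1.0]
tuned_logits = [2.1, 0.9, 0.5, -1.2]

p_base = softmax(base_logits)
p_tuned = softmax(tuned_logits)

kl = kl_divergence(p_base, p_tuned)
print(f"KL(base || tuned) = {kl:.5f}")
```

A result near zero means the modified model's predictions barely moved, which is what makes a benchmark shift alongside a tiny KL value interesting.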