4Chan data can almost certainly improve model capabilities.
Posted by Sicarius_The_First@reddit | LocalLLaMA | 104 comments
The previous post was probably removed by automod or something, so I'll give you the TL;DR and point you to search for the model card yourself. Tbh, it's sad that bot posts and AI-written posts get promoted while a human-made one gets banned.
I trained an 8B model on 4chan data and it outperformed the base model. I did the same for a 70B model, and it also outperformed the base model. This is quite rare.
You can read about it in the linked threads (there are also links to the Reddit posts in the model cards).
https://preview.redd.it/6u0vsqmccltg1.png?width=3790&format=png&auto=webp&s=324f71031e00d99af4e9d3884ee9b8a8855a44af