Anyone tried this yet? LLM with knowledge date in the 1930s
Posted by The_frozen_one@reddit | LocalLLaMA | View on Reddit | 59 comments
SpaceTraveler2084@reddit
i want to run this either on RTX 5070Ti or M4 16GB, any tips?
using ollama
The_frozen_one@reddit (OP)
It seems a bit undercooked at the moment, I couldn't get it to quantize to a GGUF. I was hoping someone would figure it out in the comments. Hopefully once it's ready they'll get it into a GGUF-able state so the rest of us can try it.
n1776@reddit
I'm working on it, stay tuned!
qwen_next_gguf_when@reddit
GGUF ready ? If so I will give it a spin.
mkmarek@reddit
I gave it a try and implemented the model in llama.cpp, so if you want you can try it, but you'll have to compile it from my fork, then convert the model to GGUF and run it with this compiled version.
https://github.com/mkmarek/llama.cpp/tree/talkie-model
I've never written anything for llama.cpp before, so it probably has bugs, but so far I'm getting a pretty similar experience to their website with Q8 and even lower quants. Sometimes it behaves very weirdly, but I'm not sure if it's the model or my implementation. Tested it with CUDA (a lot) and Vulkan (a little), not sure if the rest is working OK.
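Roughly, the flow he describes would look like this (the fork URL and branch are from his link; the CMake flags, the convert script's arguments, and the output filenames are assumptions — check the instructions at the top of the script):

```shell
# Build mkmarek's llama.cpp fork with the talkie-model branch
git clone --branch talkie-model https://github.com/mkmarek/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON   # or -DGGML_VULKAN=ON
cmake --build build --config Release -j

# Convert the HF weights to an f16 GGUF (exact arguments are an assumption;
# see the instructions at the top of the script)
python convert_talkie_to_gguf.py /path/to/talkie-1930-13b-it

# Optionally quantize the f16 output further, then run it
./build/bin/llama-quantize talkie-1930-13b-it-f16.gguf talkie-1930-13b-it-Q8_0.gguf Q8_0
./build/bin/llama-cli -m talkie-1930-13b-it-Q8_0.gguf -p "Good evening."
```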
There is a convert_talkie_to_gguf.py file that you can use to convert the files from their HF. It will produce an f16 GGUF, which you can quantize further if you want. Instructions are at the top of the file.
pmttyji@reddit
Not yet
https://huggingface.co/talkie-lm/talkie-1930-13b-it
SpaceTraveler2084@reddit
I cannot run this on ollama? I only found a 26GB version and MLX in the link.
pmttyji@reddit
ollama is one of the wrappers around llama.cpp, so llama.cpp support is a must. Only after that will it work.
knlgeth@reddit
You can actually ingest resources imo, I'm using https://github.com/atomicmemory/llm-wiki-compiler, seems inspired by karpathy.
Distinct-Shoulder592@reddit
I have been using it for a while and it works great.
grim-432@reddit
I’d want to see if I could get it to invent something from the 1940s.
Would be a way to back test llm ability to innovate and invent.
portmanteaudition@reddit
Backtesting on data used in training is mostly pointless. Also consider the data used to train these models: the vast majority is much more recent, and it presents a very different picture of what text and images pertaining to 1940 look like than a model trained only on a 1940 corpus would.
Waste-Ship2563@reddit
They are currently trying to teach it python!
MatlowAI@reddit
This is a fun project. RLVR with world knowledge trained in chronologically, with the verifiable rewards being known inventions. See if you can keep the model coherent. The data would be a giant pain to curate, but it's funny, we were talking about this the other day at work.
PigSlam@reddit
“Gemini, invent an automatic transmission for an automocar”
nabagaca@reddit
"I will look at the best materials available to me. It will use asbestos for insulation, lead for its housing, and will be lubricated with radium water"
philmarcracken@reddit
You're absolutely right. That was a hallucination, yours this time.
skinnyjoints@reddit
I’ve been thinking about this a lot. With a big enough dataset, I think LLMs would be capable of learning the patterns of invention and discovery.
Scared_Bedroom_8367@reddit
Well, it just hallucinated a diagram label when there was no diagram
Due-Memory-6957@reddit
It really wouldn't, it still has all the knowledge.
imp_12189@reddit
"As Demis Hassabis has asked, could a model trained up to 1911 independently discover General Relativity, as Einstein did in 1915?"
That would be.. Wow
34574rd@reddit
someone did try it https://michaelhla.com/blog/machina-mirabilis.html
ORANGE_J_SIMPSON@reddit
Wow this is actually a really cool experiment
imp_12189@reddit
Well, that's really fun to read, thank you
DarthFluttershy_@reddit
With guidance, probably... though I'd still be suspicious that the data set wasn't properly curated. That can be hard, because sometimes stuff has the wrong metadata.
Mickenfox@reddit
It would probably be better to try to replicate discoveries from the last year with Claude 4.
redditscraperbot2@reddit
This is great news for my hitler bot.
Sooperooser@reddit
No competition for Grok on the Hitler-benchmark.
RottenPingu1@reddit
What does Grok have to do with this?
reallmconnoisseur@reddit
redditscraperbot2@reddit
It's a funny model. Ask it about the feasibility of anything that happened during world war two and it will imply (in the nicest way it can for the time) that it's completely preposterous.
Vaguswarrior@reddit
Jesus Christ I snorted out loud
buttplugs4life4me@reddit
This may not be ethically good, but it would actually be interesting to train models solely on one historical person. There's ample writing from Hitler and about Hitler, and if you're not too strict and throw in Himmler, Göbbels and so on as well, you'd probably have enough data.
redditscraperbot2@reddit
Throw both of their work into a dataset and call it Gobbler
redragtop99@reddit
Mein Kampf sounds good right about now.
FoxiPanda@reddit
Excuse me your what?
(This made me laugh more than it should have tbh. Well played.)
slippery@reddit
Seems like it would be very hard to filter training data that precisely. I'm skeptical.
SnooPaintings8639@reddit
All newspapers, books and any other available materials with proven dates. Of course, there is a question of quality, but if you translated sources from all the languages, from ancient times to e.g. 1930, it might be enough.
I have still to figure out how to tokenize prehistoric cave paintings tho.
slippery@reddit
You tokenize images of the cave paintings.
No_Afternoon_4260@reddit
The question is, are you training from scratch or using a pre-trained one? A 13B from scratch only on (books/newspapers?) from the '30s? I'd like to see that.
TheRealMasonMac@reddit
Hopefully they'll release the dataset later on. Everything should be public domain and I see no reason as to why they shouldn't.
Fluffy-Brain-Straw@reddit
This is amazing
-LaughingMan-0D@reddit
This is so sick. I love the way it talks.
Dany0@reddit
It's so very adamant that women should not have the right to vote - I am convinced this is because of a terrible imbalance in the training data. I hope they re-OCR everything and balance the training set
I would go at it from two directions. First, remove or heavily downweight obscure literature that is basically a plagiarism/copy of other, more popular publications. For example, it was common for newspapers to reprint slightly reworded news from distributors (the AP/Reuters heyday). I'd also focus on reducing the "yellow journalism" kind of stuff.
At the same time I would give more weight to more niche literature. Before the fax and the internet, literature drew on a larger vocabulary, but many words were used infrequently. Without sufficient examples I'm afraid their experiment likely forgot the more niche things entirely, especially given the relatively small param size.
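The first direction — catching lightly reworded reprints of the same wire story — is a classic near-duplicate detection problem. A minimal sketch using Jaccard similarity over word shingles (the shingle size and threshold below are illustrative choices, not anything from the project; real pipelines usually scale this up with MinHash/LSH):

```python
# Flag near-duplicate articles (e.g. the same wire story reprinted with
# light rewording) via Jaccard similarity over word 5-gram shingles.

def shingles(text, n=5):
    """Return the set of word n-grams in the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))}

def jaccard(a, b):
    """Jaccard similarity between the shingle sets of two texts."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def is_near_duplicate(a, b, threshold=0.6):
    # Threshold is an illustrative assumption; tune on real data.
    return jaccard(a, b) >= threshold

art1 = "the premier announced a new trade accord with the dominion government today"
art2 = "the premier announced a new trade accord with the dominion government yesterday"
print(is_near_duplicate(art1, art2))  # reworded reprint: high overlap, prints True
```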
GizmoR13@reddit
Novel-Injury3030@reddit
where are u trying it?
datbackup@reddit
did they seriously choose to call this “vintage language models”
that is such a stupid name
cool idea
but that name is sheer idiocy
HairyAd9854@reddit
The LLM 1930s cutoff had already been tried a few times, but they always insisted on invading Poland.
Voxandr@reddit
lol
woadwarrior@reddit
Evidently, there are some temporal data leaks in its training corpus.
SubstanceNo2290@reddit
How on earth are you gonna ensure it isn’t contaminated or the material was edited after 1940
VeganBigMac@reddit
They talk a bit about it in the article.
Yorn2@reddit
Oh man, hook something like this up with a TTS designed to speak in a "Mid-Atlantic" (think old time newscasters reporting on WWII in American English) accent and you have a great new news app idea to wake up to.
vox-deorum@reddit
The instruction tuning part is prone to contamination. So even if someone gets this model to invent something from the 1960s, it can't prove much.
PhlarnogularMaqulezi@reddit
This sounds fascinating af. Keeping my eyes peeled for a GGUF to pop up.
A Q6 quant would probably fit my laptop's 16GB of VRAM, based on models similar in size.
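As a rough sanity check on that estimate: weight size is about params × bits-per-weight / 8. The bits-per-weight figures below are approximate averages for common llama.cpp quant formats, and the calculation ignores KV cache and compute buffers, which also need VRAM:

```python
# Back-of-envelope GGUF weight size: params * bits-per-weight / 8.
# bpw values are approximate averages for llama.cpp quant formats.

def gguf_size_gb(n_params_b, bits_per_weight):
    """Approximate weight file size in GB for n_params_b billion parameters."""
    return n_params_b * bits_per_weight / 8

for name, bpw in [("Q8_0", 8.5), ("Q6_K", 6.56), ("Q4_K_M", 4.85)]:
    print(f"13B @ {name}: ~{gguf_size_gb(13, bpw):.1f} GB weights")
```

A 13B Q6_K comes out around 10.7 GB of weights, which leaves headroom for context in 16 GB, consistent with the comment above.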
toothpastespiders@reddit
Seems like one of the more interesting projects that I've seen in a while. I love that they're providing the base model too. And their future plans also seem solid. The model seems shockingly competent given the low amount of training data. Tossed a few questions at it regarding choices that would be ideal for specific historical times and places and I felt the results were passable if not ideal. That's really praise rather than criticism though. I'm mostly just impressed that it's working as well as it has from the couple quick off the top of my head tests. Sadly, machine readable data from publications of the time isn't as common as a lot of people would assume.
vasimv@reddit
Can it make program for the Babbage analytical engine?
LoveMind_AI@reddit
These guys seem pretty ambitious - very impressive.
tinny66666@reddit
Their chat server may have melted.