Has anyone here already built a "doomsday" or "off-grid" knowledge base? (powered by an LLM, of course)
Posted by Altruistic_Heat_9531@reddit | LocalLLaMA | 13 comments
Basically, I’m really into the idea of a fully offline setup.
(Another way to say it: I’m a data hoarder.)
For LLMs, I’m using uncensored models from both Western (Gemma, GPT-OSS) and Eastern (GLM 4.7 Flash, Qwen 35B) sources. For daily use, I stick to models in the 20–35B range, and when I need stronger reasoning, I switch to Qwen 3.5 120B.
Anyway:
- After looking around, Wikipedia (text-only, no media) is about 24 GB in English. I’m planning to include Indonesian (my country), Chinese, Russian, and Arabic as well, mainly to reduce bias. That would probably bring it to around 120 GB, I guess, for text-only data. For images, Google estimates around 4 TB (and I don't know if that covers all wikis or just English). I’m not planning to store videos. 4 TB is manageable using LTO for archival and HDD for day-to-day access. (See the dump-streaming sketch after this list.)
- Planet.osm: this is basically a map of the entire Earth. For my setup, I only need major roads outside Indonesia, but full detail within Indonesia. Has anyone here tried unpacking the planet file without full detail? When I processed just my home island (Java), processing edges and vertices blew the size up to around 30 GB, from about 1.2 GB if I remember correctly. (See the road-filter sketch after this list.)
- Any other suggestions for datasets or storage/setup optimizations? Especially from people who’ve already built similar offline systems?
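On the Wikipedia side, the text-only dumps can be streamed without ever unpacking them to disk, which helps when you're sizing things up. A minimal sketch using only the Python standard library (the dump filename follows the standard dumps.wikimedia.org naming; swap the language prefix for each wiki you mirror):

```python
import bz2
import xml.etree.ElementTree as ET

# Standard dump name from dumps.wikimedia.org; swap "en" for "id", "zh", "ru", "ar".
DUMP = "enwiki-latest-pages-articles.xml.bz2"

# MediaWiki export namespace; the exact version string can differ between
# dumps, so in practice read it off the root element first.
NS = "{http://www.mediawiki.org/xml/export-0.11/}"

def iter_articles(path):
    """Stream (title, wikitext) pairs without decompressing the dump to disk."""
    with bz2.open(path, "rb") as f:
        for _, elem in ET.iterparse(f):
            if elem.tag == NS + "page":
                title = elem.findtext(NS + "title")
                text = elem.findtext(f"{NS}revision/{NS}text") or ""
                yield title, text
                elem.clear()  # free memory; essential on a 24 GB dump

for title, text in iter_articles(DUMP):
    print(title, len(text))
    break
```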
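And for the Planet.osm "major roads only outside my country" pass, here is a rough sketch with pyosmium. I haven't benchmarked this against a full planet file, and whether the writer accepts the read-only objects directly can depend on your pyosmium version, so treat it as a starting point rather than the finished pipeline:

```python
import osmium

# Tag values that roughly correspond to "major roads"; tune to taste.
MAJOR = {"motorway", "trunk", "primary",
         "motorway_link", "trunk_link", "primary_link"}

class MajorRoadFilter(osmium.SimpleHandler):
    def __init__(self, writer):
        super().__init__()
        self.writer = writer

    def way(self, w):
        # Keep only ways tagged as major highways.
        if w.tags.get("highway") in MAJOR:
            self.writer.add_way(w)

writer = osmium.SimpleWriter("major_roads.osm.pbf")
MajorRoadFilter(writer).apply_file("planet.osm.pbf")
writer.close()
# Note: this drops the nodes the ways reference; you need a second pass for
# those (or use `osmium tags-filter`, which resolves references for you)
# before the output is renderable.
```

You'd then merge this with a full-detail extract of Indonesia (e.g. from an `osmium extract` with a country polygon) to get the mixed-resolution planet.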
riddlemewhat2@reddit
Yeah people are definitely building setups like this. Wikipedia + maps + docs is a solid base for offline knowledge.
The main thing to watch is not just storing the data but making it usable. Raw dumps are hard to navigate unless you add a good retrieval or structure layer. Most people start with RAG, but it gets messy at that scale.
A lot of setups are moving toward compiling that data into a structured wiki so it is actually queryable and maintainable over time. If you want a reference for that kind of approach, this is worth checking:
https://github.com/atomicmemory/llm-wiki-compiler
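For the simpler end of that spectrum, the retrieval layer itself is not much code. A minimal sketch with sentence-transformers and FAISS (the model choice and chunks are placeholder assumptions; at full-Wikipedia scale you would want an on-disk index rather than a flat in-memory one):

```python
import faiss
from sentence_transformers import SentenceTransformer

# Small multilingual model as a placeholder; pick one covering your languages.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

chunks = [
    "A Stirling engine is a heat engine that operates by cyclic compression...",
    "A solar panel converts sunlight into electricity...",
]  # in reality: article paragraphs streamed out of your dump

# Normalized embeddings + inner product index == cosine similarity search.
embeddings = model.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

query = model.encode(["how do I generate power from heat"],
                     normalize_embeddings=True)
scores, ids = index.search(query, 2)
for i in ids[0]:
    print(chunks[i])  # feed the top chunks to the local LLM as context
```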
bonobomaster@reddit
In a doomsday scenario, you very likely won't have the power for that much compute.
On top of that, portability for that much compute sucks.
Forget LLMs for this use case.
I have an Android based e-reader with Kiwix installed and an offline version of Wikipedia.
It's slow but extremely portable and can be powered by a small solar panel.
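For what it's worth, the same ZIM files Kiwix serves can also be read programmatically, so a scavenged laptop works as well as the e-reader. A rough sketch with the python-libzim bindings (the filename is whatever ZIM you downloaded from the Kiwix library, and the search API has shifted between versions, so check the docs for the one you install):

```python
from libzim.reader import Archive
from libzim.search import Query, Searcher

# Placeholder filename; any Kiwix ZIM works the same way.
zim = Archive("wikipedia_en_all_nopic.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")

# Full-text search works if the ZIM ships a fulltext index
# (the official Wikipedia ZIMs do).
searcher = Searcher(zim)
search = searcher.search(Query().set_query("solar panel"))
for path in search.getResults(0, 5):
    entry = zim.get_entry_by_path(path)
    html = bytes(entry.get_item().content).decode("utf-8")
    print(entry.title, f"({len(html)} bytes of HTML)")
```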
derekp7@reddit
Doomsday can come in many forms. A political or war situation can cut your country's internet off without warning. An algorithm at your ISP can decide you are no longer fit to be a customer, which may take a while to get straightened out. A snowstorm can knock you offline for a week.
bonobomaster@reddit
Doomsday is pretty well defined, and nothing you said changes my previous answer.
My e-reader is prepared and doesn't need any further internet to serve me most of humanity's knowledge.
https://www.merriam-webster.com/dictionary/doomsday
while-1-fork@reddit
Scavenging a solar panel, making ghetto generators from car alternators or motors, or building your own from anything with copper coils and magnets are all possible. Even with no prior knowledge, with Wikipedia and a small LLM on your phone for questions far exceeding your own knowledge, you could likely figure out how to build something like a wood-powered Stirling engine, hook up some car alternators, and step the voltage up to run computers and other equipment (or, for computers, maybe just regulate the +12 V from the car alternator and generate the +5 V and +3.3 V directly from the +12 V).
Salt-Willingness-513@reddit
True that. I also just used half the storage on my Palma 2 for German Wikipedia.
Likeatr3b@reddit
Are you using a Mac?
while-1-fork@reddit
I have offline text-only Wikipedia with Kiwix on my smartphone, as well as Qwen 3.5 4B, which, while not the greatest, is already super slow on a mid-range phone on CPU only (around 2 T/s generation). But I can see it being useful with patience, if nothing else is available.
I am also thinking about keeping the 27B IQ4 weights and a copy of the llama.cpp source on there, plus a Docker image that can run and cross-compile it, as well as Docker itself and, just in case, a Linux DVD image too. Not meant to run on the smartphone: chances are that in an eventual apocalypse I would have the smartphone in my pocket, but I may or may not have my computer or pen drives. So I am using the phone as storage for the day I scavenge some computers and build my post-apocalyptic assistant.
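Once a computer is scavenged, those archived GGUF weights run through llama.cpp; with the llama-cpp-python bindings, a minimal assistant loop looks roughly like this (the model filename and sampling settings are placeholders):

```python
from llama_cpp import Llama

# Placeholder filename; any GGUF quant you archived works the same way.
llm = Llama(
    model_path="qwen-4b-iq4.gguf",
    n_ctx=2048,      # keep the context modest on weak hardware
    n_threads=4,     # match the scavenged CPU's core count
    verbose=False,
)

out = llm.create_completion(
    "Q: How do I regulate 12 V from a car alternator down to 5 V?\nA:",
    max_tokens=200,
    temperature=0.7,
    stop=["Q:"],
)
print(out["choices"][0]["text"])
```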
While I hope this won't be needed, I think it is a great idea to be prepared. It costs very little, and the upsides if it is ever needed are massive.
Salt-Willingness-513@reddit
No, I'm fine having Gemma 4 26B locally, and German Wikipedia offline on my e-book reader should be fine too.
miklosp@reddit
https://www.projectnomad.us/
Altruistic_Heat_9531@reddit (OP)
Holy... nice, this is exactly what I'm looking for.
sagiroth@reddit
I don't think this will be your primary concern if doomsday comes.
Real_Ebb_7417@reddit
The new survival-freak kind 😂
Btw, I'd get documentation for the most important programming languages and store their most important libraries locally. I'd also store some very in-depth technical knowledge (e.g. car engines, academic physics, etc.).
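On the library point: pip can build an offline mirror of wheels plus all their dependencies ahead of time, and install from it later with no network. A small sketch (the package list is just an example):

```python
import subprocess

# Packages worth having offline; extend to taste.
PACKAGES = ["numpy", "scipy", "requests", "flask"]

# While you still have internet: download wheels + all dependencies
# into a local directory.
subprocess.run(
    ["pip", "download", "--dest", "./pip-mirror", *PACKAGES],
    check=True,
)

# Later, fully offline:
#   pip install --no-index --find-links=./pip-mirror numpy
```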