I open sourced a local-first LLM wiki for research and durable memory
Posted by Sad_Light_1354@reddit | LocalLLaMA | 4 comments
I’ve been building a small tool called oamc around a workflow I wanted for personal research and long-running project memory.
The basic idea is: instead of repeatedly querying raw notes/documents, sources get ingested into a maintained markdown wiki. The wiki becomes the working knowledge layer, and future questions are asked against that layer instead of against raw text every time.
The pipeline is:
- drop or clip sources into an inbox
- ingest them into source, concept, entity, and synthesis pages
- ask questions against the wiki
- save useful answers back as new synthesis pages
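The loop above can be sketched in a few lines of Python. This is a minimal illustration of the ingest step only, not oamc's actual code; the `summarize` stub and the `source-*.md` naming are hypothetical stand-ins for whatever the real tool does with a local model:

```python
from pathlib import Path

def summarize(text: str) -> str:
    """Stand-in for an LLM call that distills a source into a page body."""
    return text[:200]  # placeholder: a real version would call a local model

def ingest(inbox: Path, wiki: Path) -> list[Path]:
    """Turn each file in the inbox into a markdown source page in the wiki."""
    pages = []
    wiki.mkdir(parents=True, exist_ok=True)
    for src in sorted(inbox.glob("*")):
        if not src.is_file():
            continue
        page = wiki / f"source-{src.stem}.md"
        page.write_text(f"# {src.stem}\n\n{summarize(src.read_text())}\n")
        src.unlink()  # clear the inbox once the source is captured
        pages.append(page)
    return pages
```

The point is just that every step leaves an inspectable markdown file behind, so the "memory" is always readable and greppable.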
A few things I cared about:
- local-first workflow
- markdown as the actual knowledge layer
- inspectable files instead of hidden memory
- lighter than standing up a full RAG stack
- works well with Obsidian, but doesn’t depend on it conceptually
There’s also a small local dashboard and a macOS menubar app so it can keep running in the background.
This was inspired by Andrej Karpathy’s “LLM Wiki” idea. I was basically trying to turn that pattern into something I’d genuinely use day to day.
Repo:
https://github.com/michiosw/oamc
I’d especially love feedback from people here on:
- wiki-first vs RAG-first for personal knowledge
- where this approach starts breaking down at scale
- whether markdown artifacts are actually a better interface for long-term LLM memory than embeddings + retrieval alone
Cosmicdev_058@reddit
This is a cool project. Past a certain scale, though, the wiki-first method may hit limits compared to a dedicated RAG pipeline built on vector stores like Chroma or Pinecone. Your setup is lighter, but full RAG stacks, often built with frameworks like LangChain or managed by platforms like Orq AI, offer different trade-offs for large, frequently changing knowledge bases.
crantob@reddit
Sounds logical and goes into the ever-expanding TODO list, ty.
Legal_Nectarine_4798@reddit
We built a terminal compiler for this exact pattern. Raw sources in, interlinked wiki out. Curious what you think:
github.com/atomicmemory/llm-wiki-compiler
Estanho@reddit
Hey OP, love the initiative. Left a star on your repo!
I'd like to try answering some of the things you asked.
On where it breaks at scale, the failure modes I keep hitting with this pattern:
- Synthesis pages calcify. New sources that contradict an existing synthesis tend to get softened in rather than triggering a retraction of the older claim. The wiki ages toward consensus mush unless you build in an explicit "this supersedes that" mechanic.
- Provenance decay. A synthesis from six months ago cites three sources, two of which have since been updated or proven wrong upstream. Nothing in the wiki knows. Re-ingesting the source rarely cascades cleanly to the synthesis pages that depended on it.
- Concept drift. The same idea re-emerges under a slightly different page name and the LLM doesn't always catch the dupe. Backlinks help. Embeddings on page titles help more.
- Markdown is a great log and a lossy graph. Wikilinks are unidirectional by default, so unless you maintain backlinks the structure quietly degrades.
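To make the backlink point concrete, here's the kind of pass I mean (illustrative only, not tied to oamc or Obsidian internals): scan every page for `[[wikilinks]]` and build the reverse index that markdown itself doesn't give you.

```python
import re
from pathlib import Path

# Captures the target of [[Target]], [[Target|alias]], or [[Target#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def backlinks(wiki: Path) -> dict[str, set[str]]:
    """Map each link target to the set of page stems that link to it."""
    index: dict[str, set[str]] = {}
    for page in sorted(wiki.glob("*.md")):
        for target in WIKILINK.findall(page.read_text()):
            index.setdefault(target.strip(), set()).add(page.stem)
    return index
```

Run something like this on every ingest and write the result back into each page (or a sidecar index), and the graph stops quietly degrading.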
On wiki-first vs RAG-first, and whether markdown beats embeddings: there's one axis I think Karpathy's framing flattens. His pattern points the LLM at external sources and lets it author the synthesis. The inverse points it at notes you write yourself and lets it only maintain them. Same loop, different source of truth. oamc sits firmly in the first camp, which is a coherent place to sit, just worth naming because the "second brain" framing usually means the second one. Wrote it up here if useful: https://scribelet.app/blog/karpathy-llm-wiki-reaction
I skimmed `ops/ingest.py` and it looks like you sidestep half the invalidation problem by making syntheses lazy (regenerated per `query --template synthesis`), but the entity and concept pages produced at ingest time don't seem to have an explicit supersession step, just `existing_pages` handed to the LLM as context. Curious whether you've thought about pushing supersession into a separate pass rather than relying on the ingest prompt to notice.
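As a toy version of what that separate pass could enable (purely hypothetical convention, not oamc's schema): mark a superseded page with a `superseded_by:` key in its YAML frontmatter, then filter those pages out of query context so stale claims never reach the model.

```python
from pathlib import Path

def active_pages(wiki: Path) -> list[Path]:
    """Return pages not marked superseded in their frontmatter.

    Assumes a convention where a superseded page carries a
    'superseded_by: <page>' line in its YAML frontmatter block.
    """
    keep = []
    for page in sorted(wiki.glob("*.md")):
        parts = page.read_text().split("---", 2)
        front = parts[1] if len(parts) >= 3 and parts[0] == "" else ""
        if "superseded_by:" not in front:
            keep.append(page)
    return keep
```

The nice property is that supersession becomes a visible edit to a file rather than a prompt-time judgment call, so you can audit (and revert) it like any other change.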