Obsidian Second Brain Model??
Posted by 220nyx@reddit | LocalLLaMA | View on Reddit | 29 comments
I got a MacBook Pro M4 Pro 24GB Unified RAM
I was wondering if anybody here uses local LLM models as their second brain director for Obsidian.
- Summarise notes
- Link notes
- Tag notes
- Going deeper into the notes
- etc
I’ve only recently begun testing which specific model would be good for this with my specs. Any suggestions?
Agreeable_Degree5860@reddit
with those specs you can definitely run some solid local models for rag. i've been messing with this exact setup lately.
the 24gb ram is your key constraint, so you'll wanna look at 7b or maybe some quantized 13b parameter models. i've found they handle summarization and basic linking tasks pretty well on similar hardware.
for your main goal of a vault rag pipeline, focus on models with strong instruction following. the quantized versions of mistral or llama 3 variants are a good starting point to test.
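a vault rag pipeline starts with splitting notes into chunks before embedding. here's a minimal sketch of just that chunking step (pure python, no model call; the embedding/retrieval parts are left out, and the heading-based split is one simple choice among many):

```python
def chunk_note(text: str) -> list[str]:
    """Split a markdown note into chunks, one per heading-led section.

    This is the first stage of a vault RAG pipeline: each chunk would
    then be embedded and indexed by whatever local model you run.
    """
    chunks, current = [], []
    for line in text.splitlines():
        # start a new chunk whenever we hit a heading (unless we're at the top)
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

note = "# Intro\nsome text\n# Details\nmore text"
# chunk_note(note) -> ["# Intro\nsome text", "# Details\nmore text"]
```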
Little-Tour7453@reddit
Start with quantized 8-9B models. Bigger models might work, but 24GB might be a bit of a stretch. I would say experiment between Gemma and Qwen 3.5.
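a rough rule of thumb for why ~8-9B quantized models are the sweet spot on 24GB: weight memory is roughly params × bits-per-weight / 8, and the KV cache, OS, and apps eat the rest. this back-of-envelope helper is just an illustration, not an exact figure (real GGUF files carry scale overhead, so effective bits per weight are a bit higher than the nominal quant):

```python
def approx_weight_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough GGUF weight size in GB: billions of params * bits / 8.

    Ignores KV cache and runtime overhead, so treat the result as a floor.
    """
    return params_b * bits_per_weight / 8

# A 9B model at ~Q8 (~8.5 effective bits/weight) is ~9.6 GB of weights;
# at ~Q4 (~4.8 effective bits) it's ~5.4 GB, leaving headroom for
# context and the OS on a 24 GB machine.
```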
220nyx@reddit (OP)
I’ve been using Qwen3.5:9B Q8 and it is amazing for chatting and asking logical, in-depth questions, but it can overthink and hallucinate, despite having 262K context.
I haven’t tried any Gemma ones so I will be sure to look into them later today.
Thank you.
Little-Tour7453@reddit
3.5 has a different no-think gate than 3, something like `thinking: no`. That stops the thinking.
220nyx@reddit (OP)
What do you mean by that exactly?
Little-Tour7453@reddit
You can make Qwen stop thinking by playing with system-level prompts. Each model has its own documentation; it's easy to find how.
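for Qwen3 the documented soft switch is appending `/no_think` to the user turn (with `enable_thinking=False` in `tokenizer.apply_chat_template` as the hard switch); whether 3.5 uses the exact same syntax is an assumption to verify against its model card. a sketch of the prompt-building side, kept model-free:

```python
def build_messages(user_msg: str, thinking: bool = False) -> list[dict]:
    """Build a chat turn, appending Qwen3's /no_think soft switch when
    thinking is disabled.

    Note: /no_think is the Qwen3 convention; newer versions may gate
    thinking differently, so check the model card for your exact model.
    """
    suffix = "" if thinking else " /no_think"
    return [{"role": "user", "content": user_msg + suffix}]

build_messages("Summarise this note")
# -> [{"role": "user", "content": "Summarise this note /no_think"}]
```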
220nyx@reddit (OP)
I see, so I can personally tweak the local LLM. Will try this out with Claude Code.
Little-Tour7453@reddit
Yep. Almost every local LLM is adjustable. For smaller models, anything under 27B, no-think is usually the best way to get the most out of them. With 8 billion params, reasoning is just noise anyway.
220nyx@reddit (OP)
Okay, so use no-think. I tended to use think mode when I needed the model to reason through something, but I will test no-think for those cases too.
Little-Tour7453@reddit
Sure, and trust me, small models have nothing to think about. They repeat the same sentences forever. You can stream the output in Python and watch what ‘they think’.
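watching what a model 'thinks' while streaming usually means separating the `<think>…</think>` block from the final answer. a minimal sketch that works on any iterable of text chunks; the actual model call (e.g. to a local server) is left out, and the tag names assume the common Qwen-style thinking markup:

```python
def split_thinking(stream) -> tuple[str, str]:
    """Consume a token stream and separate <think>...</think> content
    from the final answer.

    Accepts any iterable of text chunks, so you can feed it the deltas
    from whatever local inference API you use.
    """
    text = "".join(stream)
    if "<think>" in text and "</think>" in text:
        start = text.index("<think>") + len("<think>")
        end = text.index("</think>")
        return text[start:end].strip(), text[end + len("</think>"):].strip()
    # model produced no thinking block: everything is the answer
    return "", text.strip()

tokens = ["<think>", "hmm, ", "hmm, ", "hmm...", "</think>", "The answer is 4."]
# split_thinking(tokens) -> ("hmm, hmm, hmm...", "The answer is 4.")
```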
Little-Tour7453@reddit
One exception to that is Smol3M. It’s so tiny and often gets ignored but reasons like a boss. Only 1.9GB, runs even on iPhone. I use it for some of my weird iOS apps.
220nyx@reddit (OP)
Do you use local LLMs or frontier models to vibe code, or do you code yourself?
Little-Tour7453@reddit
I can code myself but why would I? I build the architecture and the system, Claude handles the rest.
220nyx@reddit (OP)
By architecture, do you mean the narrative for the app you are coding, and do you use Claude to improve it?
What do you mean by the system?
And Claude Code?
Little-Tour7453@reddit
System architecture is where you define your app's pipeline: how it functions, what it does, where it lacks and where it rocks.
AI can do that too, but it will misinterpret your context, assume things, leave things behind, and you will end up with design and code debt that you need to clean up. Not its fault though; this is what it's capable of currently, or how it's been designed: basically to burn more tokens.
So architecture can be as simple as an md file, or dozens of pages of content, test scenarios, feature roadmaps, etc. Anything that can hold hands with the AI becomes its bible.
Yeah, I named Claude, but if you do what I just said above, honestly any flagship model will give the same result. I prefer Claude because of its personality. My second preference is Qwen Coder because Alibaba actually does great stuff, and my last choice is GPT. They lobotomized that model.
220nyx@reddit (OP)
Okay now, I hear where you're coming from, thank you for this man, I learnt a lot about LLMs from you.
I don't have any problems with Claude; to be honest, I think Claude is one of the best AI models going. The only thing I really hate about Claude is the usage limits, and currently I am broke as fk so I can't afford the higher paid tiers; if I could, I would've done it right away.
- As of now, I'm only using the £20 tier.
Do you mind sharing the specs of the machine you run local LLM models on, and what's the biggest model you can run with it?
Little-Tour7453@reddit
Then you should give a try to Qwen Code CLI. It’s free-ish.
220nyx@reddit (OP)
Yeah, I was looking at the thoughts it was having and it just kept going and going. It was so useless, a waste of time.
jacek2023@reddit
Is Obsidian fully local/offline? I had the impression in the past that the data is in the cloud, or that the code was not open source, but maybe I am wrong. I was experimenting with Zettlr + LLMs, but I still need to work on a full workflow.
220nyx@reddit (OP)
It is not open source; you can go with Emacs if you want open source, though the learning curve is steep.
Data is stored locally, but there's an option to store it in the cloud for device sync. I believe there are workarounds for this.
What LLM model are you currently using - is it a local model or a frontier one?
jacek2023@reddit
unfortunately I use vim
220nyx@reddit (OP)
If it is unfortunate why not make the switch?
I’ve been procrastinating on Emacs for almost a year now, I am scared to learn it and also cba lmao.
valeeraslittlesharky@reddit
I am using logseq with https://github.com/ergut/mcp-logseq. It's pretty close to obsidian workflow while being fully local.
jacek2023@reddit
but is it for any model or just claude?
needlzor@reddit
It is, although they do have a sync service. Are you thinking of Notion, by any chance?
jacek2023@reddit
I use Notion, but that is a cloud solution. I can read it on the train on my phone and edit it on my desktop. That's for technical stuff, etc. For more private stuff I need a solution like Zettlr and local LLMs.
xeeff@reddit
I recently tested it out briefly using Gemma 4 and Qwen3.5 (MoE variants) with the YOLO plugin and it ran okay. Definitely the best plugin though.
220nyx@reddit (OP)
I'm gonna try Gemma 4 and benchmark it later with my specs.
Qwen3.5:9B Q8 is amazing for deciphering notes, writing and explaining things; huge difference from Q4.
xeeff@reddit
the only reason I went with this is that I've had Note Companion starred on GitHub for the longest time, but the self-hosting process was too complex and I'm too busy atm, and YOLO made it extremely easy to set stuff up. I've only been using it since yesterday, so not much experience, but I can say it works.