what do you use your local llm for?
Posted by FormalAd7367@reddit | LocalLLaMA | 30 comments
what do you use your local llm for?
for me, i run everything on linux and it all ends up exposed as an api i can plug into other stuff.
on my laptop (and for personal projects), i mostly use it for coding help. then i've got an ai agent (not openai) that monitors stock prices and my home price. it also helps manage my notes by running obsidian tasks for me. i've moved almost everything off closed models, from web search (SearXNG/Perplexica) to coding; the only thing i still use is gmail.
at work, we get to work with cursor and other frontier stuff
is there anything else i should consider to improve my life?
FakeFrik@reddit
I have an n8n workflow set up that reads emails, manages my calendar, checks my todo list for work. Works with local TTS / STT and i chat to it through telegram. All running locally on my 4090. I'm still running gpt-oss:20b since i find it to have a good sweet spot for size vs speed vs capability. Since i'm running TTS and STT models too, i don't have enough vram to run gemma / qwen.
FormalAd7367@reddit (OP)
how do you bypass your company security so your workflow can read your work emails and calendar? i tried to do that but was asked to get approval from an admin
FakeFrik@reddit
i have my work calendar shared with my personal gmail, and emails forwarded.
I work at a startup and there are fewer rules regarding this.
beyourownmvster@reddit
I joined this subreddit to learn from people who clearly know a lot here, because I want to use a local AI for research. I'm tired of using ChatGPT, Claude and Perplexity, which are all limited. I could pay for them but I'd rather learn how to run something locally on my laptop. Everyone seems really helpful and generous with advice, so I'm waiting to get enough karma to post my questions and hopefully get some guidance.
NNN_Throwaway2@reddit
Web search.
Qwen 3.5 397b legitimately gives better results than the paid providers just using a web search tool for duckduckgo.
ForsookComparison@reddit
More people need to revisit 397B. The benchmarks did it dirty.
Wix86@reddit
So does it have the ability to make tool calls to the internet (APIs), or do you use only Qwen for complete privacy? Maybe some RAG?
For the last few days I've been thinking I could download Grokipedia, index it and talk to it through a small model. Or maybe something like that already exists?
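A rough sketch of the "download, index, talk to it through a small model" idea, in case it helps picture it. Everything here is an assumption for illustration: the articles are taken to be already loaded into a plain dict of title to text, `TinyIndex` and `build_prompt` are made-up names, and the scoring is basic TF-IDF rather than anything a real RAG stack would ship with. The prompt it builds would then be sent to whatever local model you run.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical inverted index with TF-IDF scoring over {title: text}."""

    def __init__(self, docs):
        self.doc_count = len(docs)
        self.postings = defaultdict(dict)  # term -> {title: term frequency}
        for title, text in docs.items():
            for term, freq in Counter(tokenize(text)).items():
                self.postings[term][title] = freq

    def search(self, query, top_k=3):
        """Return up to top_k titles, best match first."""
        scores = Counter()
        for term in set(tokenize(query)):
            hits = self.postings.get(term, {})
            if not hits:
                continue
            # Terms that appear in fewer articles count for more.
            idf = math.log(1 + self.doc_count / len(hits))
            for title, freq in hits.items():
                scores[title] += freq * idf
        return [title for title, _ in scores.most_common(top_k)]


def build_prompt(question, index, docs, top_k=2):
    """Paste the best-matching articles above the question for a small model."""
    context = "\n\n".join(
        f"## {title}\n{docs[title]}" for title in index.search(question, top_k)
    )
    return f"Answer using only this context.\n\n{context}\n\nQuestion: {question}"
```

Only the retrieved articles go into the prompt, which is what lets a small model with a short context window answer questions over a large dump.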
NNN_Throwaway2@reddit
I scrape duckduckgo. Not completely private, but more private than using google logged in, and could be made more private with proxying or other measures. Given the way internet search is going, it probably won't be a viable method in coming years, but for now it works.
If you want something downloadable, Wikipedia offers a full download.
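For anyone wondering what a scraping-based search tool like the one described above might look like, here is a minimal sketch, not the commenter's actual code. It assumes DuckDuckGo's HTML-only endpoint at `html.duckduckgo.com/html/` and that result links carry the `result__a` class; both are assumptions that can change without notice. `web_search` and `parse_results` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumptions, not taken from the thread: endpoint URL and link class name.
SEARCH_URL = "https://html.duckduckgo.com/html/"
RESULT_CLASS = "result__a"


class ResultParser(HTMLParser):
    """Collects (title, href) pairs from anchors carrying RESULT_CLASS."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._href = None  # href of the result anchor being read, if any
        self._text = []    # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and RESULT_CLASS in classes:
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.results.append(("".join(self._text).strip(), self._href))
            self._href = None


def parse_results(html, limit=5):
    """Extract up to `limit` (title, url) pairs from a result page."""
    parser = ResultParser()
    parser.feed(html)
    return parser.results[:limit]


def web_search(query, limit=5):
    """Tool entry point: fetch one result page and parse it."""
    url = SEARCH_URL + "?" + urlencode({"q": query})
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    return parse_results(html, limit)
```

The model would call `web_search("some query")` as a tool and get back a short list of titles and links to read or summarize. Keeping the parsing separate from the fetch makes it easy to fix when the page markup changes.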
Jipok_@reddit
Better than exa?
NNN_Throwaway2@reddit
Who?
Jipok_@reddit
Exa.ai
Web Search API, AI Search Engine, & Website Crawler
Dry_Yam_4597@reddit
Search has become so bad that even the smaller models are better.
merica420_69@reddit
I'm building an automated multimedia pipeline for short form content. All local generation, scripting, prompting, image, video, narration, captions with timing, assembly. I do use codex to help with the coding.
FormalAd7367@reddit (OP)
i've built (for a friend) something that does the opposite side: an automated pipeline that tracks what's trendy on different social media platforms.
cibernox@reddit
I could use advice from you both on how to automatically generate good quality educational videos to promote my gardening SaaS on Instagram.
TinFoilHat_69@reddit
I use qwen 3.5 27b opus distilled reasoning to sanity check tools that opus, sonnet, haiku have no problems using. This enables opus to play god and qwen then operates in my computerized dimensional space it interacts in.
FormalAd7367@reddit (OP)
amazing. do they communicate with each other in an environment by using an orchestrator script? or an agent framework (like OpenClaw, LangGraph etc)?
TinFoilHat_69@reddit
They rarely speak directly, but opus can inject requests into Qwen's environment because it's tmux.
Qwen runs inside a tmux pane, so I just prompt opus or whoever to read the screen buffer to analyze Qwen's interactions within the tooling environment.
“Qwen is having issues opening Firefox, check his screen buffer”
Sometimes I'll ask Qwen to give me some input on the tools or what it would like, based on using the tools. I then also drop the tool opus crafts into Qwen's tooling environment.
At this point opus not only sees Qwen's thoughts and actions but also has deep awareness of ideas and creativity from the trenches. It keeps my code sharp while designing architectural components or overcoming certain engineering challenges with an effective yet intuitive approach to design.
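For readers who haven't scripted tmux before, the "read the screen buffer" and "inject requests" parts can be sketched roughly like this. It is a guess at the mechanics, not the commenter's setup: the pane target `qwen:0.0` and all function names are hypothetical, and it only relies on the standard `tmux capture-pane` and `tmux send-keys` commands.

```python
import subprocess


def capture_cmd(target, history_lines=200):
    """argv that prints a pane's visible text plus some scrollback to stdout."""
    return ["tmux", "capture-pane", "-p", "-t", target, "-S", f"-{history_lines}"]


def send_cmd(target, text):
    """argv that types `text` into a pane literally (-l disables key-name lookup)."""
    return ["tmux", "send-keys", "-t", target, "-l", text]


def enter_cmd(target):
    """argv that presses Enter in the pane."""
    return ["tmux", "send-keys", "-t", target, "Enter"]


def read_screen(target):
    """Return the pane's screen buffer as a string (requires a running tmux)."""
    result = subprocess.run(
        capture_cmd(target), capture_output=True, text=True, check=True
    )
    return result.stdout


def inject(target, text):
    """Type a request into the pane and submit it (requires a running tmux)."""
    subprocess.run(send_cmd(target, text), check=True)
    subprocess.run(enter_cmd(target), check=True)
```

With something like this exposed as tools, the supervising model could call `read_screen("qwen:0.0")` to see what the local model is doing and `inject("qwen:0.0", "try opening Firefox again")` to nudge it.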
FormalAd7367@reddit (OP)
nice set-up. that's how i code every day at work
Hour_Bit_5183@reddit
None. I tried so many and can't find a use for them and I doubt I will find an insane difference between the insane and sane sized ones. I tried 50. I'm pretty much done.
TimmyIT@reddit
Excuse for telling myself that I need this upgrade...
Bharat01123@reddit
Mostly for translation. I find Gemma 4 models extremely good at translating, even the smaller 2B model.
I even create small scripts related to it. For example, I created a script that shows the original and translated texts side by side, and hovering over a sentence highlights both it and its translation in color, similar to Google Translate.
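The script itself isn't posted, so this is only one possible way to get that side-by-side view. It assumes the translation step has already happened and you have a list of (original, translated) sentence pairs from the model; `side_by_side` is a made-up name, and the hover highlight is plain CSS on table rows.

```python
from html import escape

# Doubled braces are literal CSS braces; {rows} is the only placeholder.
PAGE = """<!doctype html>
<meta charset="utf-8">
<style>
  table {{ border-collapse: collapse; width: 100%; }}
  td {{ width: 50%; padding: 4px 8px; vertical-align: top; }}
  tr:hover td {{ background: #fff3b0; }}
</style>
<table>
{rows}
</table>
"""


def side_by_side(pairs):
    """Render (original, translated) sentence pairs as a two-column HTML page.

    Hovering a row colors both cells, so a sentence and its translation
    light up together.
    """
    rows = "\n".join(
        f"<tr><td>{escape(src)}</td><td>{escape(dst)}</td></tr>"
        for src, dst in pairs
    )
    return PAGE.format(rows=rows)
```

Writing the returned string to a `.html` file and opening it in a browser is enough to try it out.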
FormalAd7367@reddit (OP)
i can't get Gemma to work properly... which local model are you using?
Bharat01123@reddit
Funny, just after replying here, I opened YouTube and found this video on the home page:
Pick the Wrong Gemma 4 and You'll Think It's Broken | FOUR Models Compared!
https://www.youtube.com/watch?v=SLOqlEmuy5U
Bharat01123@reddit
I am running the small gemma-4-E4B-it quantized model. I don't find much difference between Q4 and Q6, so I'm using Q4 for faster output.
theUmo@reddit
Troubleshooting assistant
SmallRice@reddit
Mostly web research/summaries when I'm too lazy to do it myself. But also to prep myself for the inevitable future when local models are good enough to be the main model.
10F1@reddit
Using the new qwen3.6 models for writing code documentation and tests for work and personal projects.
I hate doing both of those things.
ForsookComparison@reddit
Justify hardware purchase
jikilan_@reddit
Just as a playground and for learning purposes. Can't get too serious with it at work. Of course, here i mean big-scope work for those <120b models.