what do you use your local llm for?

Posted by FormalAd7367@reddit | LocalLLaMA | 30 comments

what do you use your local llm for?

for me, i run everything on linux and it ends up generating an api i can plug into other stuff.

on my laptop (and for personal projects), i mostly use it for coding help. then i’ve got an ai agent (not openai) that monitors stock prices and my home price. it also helps manage my notes by running obsidian tasks for me. i avoid closed models for almost everything, from web search (SearXNG/Perplexica) to coding; the only hosted service i use is gmail.

at work, we get to work with cursor and other frontier stuff

is there anything else i should consider to improve my life?

FakeFrik@reddit

I have an n8n workflow set up that reads emails, manages my calendar, checks my todo list for work. Works with local TTS / STT and i chat to it through telegram. All running locally on my 4090. I'm still running gpt-oss:20b since i find it to have a good sweet spot for size vs speed vs capability. Since i'm running TTS and STT models too, i don't have enough vram to run gemma / qwen.

FormalAd7367@reddit (OP)

how do you bypass your company security to have your workflow read your work emails and calendar? i tried to do that but was asked to get approval from an admin

FakeFrik@reddit

i have my work calendar shared with my personal gmail, and emails forwarded.
I work at a startup and there are fewer rules regarding this.

beyourownmvster@reddit

I joined this subreddit to learn from people who clearly know a lot here, because I want to use a local AI for research. I'm tired of using ChatGPT, Claude and Perplexity, which are all limited. I could pay for them but I'd rather learn how to run something locally on my laptop. Everyone seems really helpful and generous with advice, so I'm waiting to get enough karma to post my questions and hopefully get some guidance.

NNN_Throwaway2@reddit

Web search.

Qwen 3.5 397b legitimately gives better results than the paid providers just using a web search tool for duckduckgo.

ForsookComparison@reddit

More people need to revisit 397B. The benchmarks did it dirty.

Wix86@reddit

So does it have the ability to make tool calls to the internet (APIs), or do you use only Qwen for complete privacy? Maybe some RAG?

For the last few days I've been thinking I could download Grokipedia, index it and talk to it through a small model. Or maybe something like that already exists?

NNN_Throwaway2@reddit

I scrape duckduckgo. Not completely private, but more private than using google logged in, and could be made more private with proxying or other measures. Given the way internet search is going, it probably won't be a viable method in coming years, but for now it works.

If you want something downloadable, Wikipedia offers a full download.

Jipok_@reddit

Better than exa?

NNN_Throwaway2@reddit

Who?

Jipok_@reddit

Exa.ai

Web Search API, AI Search Engine, & Website Crawler

Dry_Yam_4597@reddit

Search has become so bad that even the smaller models are better.

merica420_69@reddit

I'm building an automated multimedia pipeline for short form content. All local generation, scripting, prompting, image, video, narration, captions with timing, assembly. I do use codex to help with the coding.

FormalAd7367@reddit (OP)

i’ve built something (for a friend) that works on the opposite side: an automated pipeline that tracks what’s trendy on different social media platforms.

cibernox@reddit

I could use advice from you both on how to automatically generate good-quality educational videos to promote my gardening SaaS on Instagram.

TinFoilHat_69@reddit

I use qwen 3.5 27b opus distilled reasoning to sanity check tools that opus, sonnet, haiku have no problems using. This enables opus to play god and qwen then operates in my computerized dimensional space it interacts in.

FormalAd7367@reddit (OP)

amazing. do they communicate with each other in an environment by using an orchestrator script? or an agent framework (like OpenClaw, LangGraph etc)?

TinFoilHat_69@reddit

They rarely speak directly, but opus can inject requests into Qwen’s environment because it’s tmux.

Qwen runs inside a tmux pane, so I just prompt opus or whomever to read the screen buffer to analyze Qwen’s interactions within the tooling environment.

“Qwen is having issues opening Firefox, check his screen buffer”

Sometimes I’ll ask Qwen to give me some input on the tools or what it would like, based on using the tools. I then also drop the tool opus crafts into Qwen’s tooling environment.

At this point opus not only sees Qwen’s thoughts and actions but also has deep awareness of ideas and creativity from the trenches. It keeps my code sharp when designing architectural components or overcoming certain engineering challenges with an effective yet intuitive approach to design.
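
For anyone wanting to try the tmux pattern described above, here is a minimal sketch, not the commenter's actual setup. It assumes tmux is installed and that the local model runs in a pane addressed by a target such as `qwen:0.0` (a hypothetical session name). The helper names are made up for illustration; the only real commands used are `tmux capture-pane` and `tmux send-keys`.

```python
import subprocess


def capture_pane_cmd(target: str, history_lines: int = 200) -> list[str]:
    """Build the tmux command that prints a pane's recent screen buffer."""
    # -p prints the capture to stdout, -t selects the pane, and -S sets the
    # first line to capture (a negative value reaches back into scrollback).
    return ["tmux", "capture-pane", "-p", "-t", target, "-S", f"-{history_lines}"]


def send_text_cmds(target: str, text: str) -> list[list[str]]:
    """Build the tmux commands that type text into a pane, then press Enter."""
    return [
        ["tmux", "send-keys", "-t", target, "-l", text],  # -l sends the text literally
        ["tmux", "send-keys", "-t", target, "Enter"],
    ]


def read_pane(target: str, history_lines: int = 200) -> str:
    """Return the pane's screen buffer so another model can inspect it."""
    result = subprocess.run(
        capture_pane_cmd(target, history_lines),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def inject(target: str, text: str) -> None:
    """Type a request into the pane where the local model is running."""
    for cmd in send_text_cmds(target, text):
        subprocess.run(cmd, check=True)
```

With a pane running, `read_pane("qwen:0.0")` returns the text you would paste into the supervising model's prompt, and `inject("qwen:0.0", "try opening Firefox again")` types a request into the pane.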

FormalAd7367@reddit (OP)

nice set-up. that’s how i code everyday at work

Hour_Bit_5183@reddit

None. I tried so many and can't find a use for them and I doubt I will find an insane difference between the insane and sane sized ones. I tried 50. I'm pretty much done.

TimmyIT@reddit

Excuse for telling myself that I need this upgrade...

Bharat01123@reddit

Mostly for translation. I find Gemma 4 models extremely good at translating, even the smaller 2B model.
I even create small scripts related to it. For example, I created a script that shows the original and translated texts side by side, and hovering over a sentence highlights it in both texts, similar to Google Translate.
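
A rough sketch of what a side-by-side viewer like that could look like, assuming the sentences have already been translated one-to-one by the local model (the translation call itself is left out). The function names and the highlight color are illustrative choices, not the commenter's script.

```python
import html
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def side_by_side_html(original: list[str], translated: list[str]) -> str:
    """Render sentence pairs as a two-column table; hovering a row colors both cells."""
    if len(original) != len(translated):
        raise ValueError("need exactly one translated sentence per original sentence")
    rows = "\n".join(
        f"<tr><td>{html.escape(src)}</td><td>{html.escape(dst)}</td></tr>"
        for src, dst in zip(original, translated)
    )
    # Each pair shares one table row, so a single :hover rule highlights both sides.
    return (
        "<style>tr:hover td { background: #ffe9a8; }</style>\n"
        f"<table>\n{rows}\n</table>"
    )
```

Write the returned string to an `.html` file and open it in a browser. Real text needs a better sentence splitter than this regex, and the model may merge or split sentences, which breaks the one-to-one pairing assumed here.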

FormalAd7367@reddit (OP)

i can’t get Gemma to work properly... which local model are you using?

ForsookComparison@reddit

Justify hardware purchase

jikilan_@reddit

Just as a playground and for learning purposes. Can't get too serious with it at work. Of course, here I mean large-scope work with those <120b models.