What is everyone actually using their LLM for?
Posted by itsthewolfe@reddit | LocalLLaMA | 140 comments
I'm thinking about setting one up and wondering what people are actually using them for outside of work. What can I use one for to improve my daily quality of life?
Where should I get started?
Fresh-Cat-7709@reddit
Part of my Home Assistant, to turn devices on/off, set my air con, blinds, etc. using spoken language.
Code personal side projects....
TheItalianDonkey@reddit
What do you use for input devices? (ie - what device listens to your spoken language?)
Fresh-Cat-7709@reddit
I have Home Assistant (HAOS) running on my NAS. Home Assistant is the true main automation that is running and controlling all the IoT based on schedules, timers, events. Home Assistant is probably the most complete and totally free application that is out there. No data is uploaded anywhere (unless you want it to be). All running locally. It runs fine without any "AI" or mic/speaker. Super stable. They even have phone apps to interface with your Home Assistant if you want to control your things while you are out or if you're just lying on the sofa with your phone.
On the Flow Z13, it is running all the "AI", ie LLM and Wyoming Piper, Openwakeword, Whisper. So the speech to text and text to speech are processed on the Z13. Understanding natural language is where the LLM kicks in. Right now, I am running gemma 4 (q8), it works very well.
ansibleloop@reddit
Ugh I'd love to replace all the Alexa shit in my house with some dumb open speakers that only listen for my wake word
Allow me to rant for a second - the product managers or whoever at Amazon who are in charge of the Alexa app need to be thrown into a fucking volcano
They removed the "share as list" function from the Alexa app to share a shopping list so now I have to send a screenshot that misses off most items
You cunts - why the fuck would you do that? It made these pieces of shit EVEN MORE WORTHLESS
Numerous-Aerie-5265@reddit
It's not too hard if you've already got Home Assistant and access to local/cloud AI. For simplicity, you can either buy the M5Stack Atom Echo, which has a built-in mic and crappy little speaker and is basically plug and play for Home Assistant, or get an ESP32, I2S audio amp, I2S mic, and a cheap small speaker, wire them up and you've got your own high quality "Alexa." You run OpenWakeWord to set your own custom wake word. Right now I've got mine to listen only when I say "hey clanker"
Fresh-Cat-7709@reddit
What's clanker's response to you?
Numerous-Aerie-5265@reddit
I’ve actually got the speech-to-text model set as nvidia parakeet, which supports speaker recognition. So the clanker actually knows who in the house is speaking to it and will call us by the nicknames we set haha
Fresh-Cat-7709@reddit
Nice! I haven't gone that far yet, I'll give it a try and add more 'personality' next.
TheItalianDonkey@reddit
I mean, for the house to be honest i'd like a more polished solution.
ansibleloop@reddit
Ok I'm doing this
TheItalianDonkey@reddit
Love the setup. What LLM are you actually running under the hood for this? I'm trying to figure out which models are actually smart enough to handle device control via Assist without constantly failing. And, from before, what hardware are you using to transmit your voice to the LLM? Cuz Alexas can't be jailbroken afaik, and all the stuff around I see is just too ugly to put in a home, so that's the part I'm curious about.
Fresh-Cat-7709@reddit
Using Google gemma-4-27b-a4b q8_0 on LM Studio. Its tool usage works very well within HAOS. Audio (mic) hardware is a USB microphone on my NAS, using HA's Assist Microphone app.
itsthewolfe@reddit (OP)
Can't Alexa/Google already do those things?
davidm2232@reddit
Alexa has no ability to handle complex requests. She freezes at basic things like 'set the garage heat pump fan to medium' or 'open the living room windows halfway' unless you manually program a routine. Also, when the internet is out, your house is hard to use without going to a wall tablet or your phone.
Fresh-Cat-7709@reddit
Google Home can control some but not all of it. Google can utilize most of the light switches.
Got a bunch of IR sensors, switch sensors, temp/humidity sensors, blinds, air conditioner, air filtration, water pump... Overall around ~60ish items.
With a local LLM, you don't have to pay and there are no limits. Google does offer a "free" service to integrate with Home Assistant but that's gone after I make a few requests.
I want something like Jarvis....
Paradigmind@reddit
Man I wish I could set something like this up.
wouldacouldashoulda@reddit
Not local-only.
ang3l12@reddit
Honestly surprised with how much control / access people (including myself) give to their personal devices / homes / information.
I can see huge benefits to AI agents like openclaw and Hermes-agent, but I can’t imagine handing over even read only access to some 3rd party that I can’t audit or control what they do with my data.
tat_tvam_asshole@reddit
But how else will you get to hear Worf say "Yes, Captain" after every request you make? More importantly, why would you want Amazon or Google to be handling all your personal requests. Eww...
FoxB1t3@reddit
Google is retarded and unreliable honestly, always has been.
suprjami@reddit
I have a script called "tldw" which downloads the transcript of a YouTube video using Python and gives me a summary of the video.
Useful for things that mildly interest me but I don't have enough time to watch needlessly padded 30 minute videos repeating the same things three times with "big reveal" at the end.
This is text summarisation, which is bread and butter for LLMs; you can get away with a little 12B or 9B model for this. I actually still prefer good old Mistral Nemo 12B.
Someone else runs this service as a website at https://tldw.tube/ which is where I got the idea.
QuestionAsker2030@reddit
What prompt did you use for summarizing?
I have a 5060Ti 16GB, and with a 14B model, it’s still doing a terrible job summarizing videos. I have to paste the transcript into Claude to get anything of use unfortunately
suprjami@reddit
"Analyze the following content to provide a concise summary. Briefly list key points, ideas, themes, and important details to capture the essential ideas that the text provides."
I then include the transcript from https://pypi.org/project/youtube-transcript-api/
This works well for me. I use it at least monthly, sometimes daily. I have tried at least Mistral Nemo 12B, Mistral Small 24B, Qwen 3.5 4B, and Qwen 3.5 9B. All did fine.
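For the curious, the whole flow fits in a short script. A minimal sketch (the endpoint and model name are placeholders for whatever you run locally, and the exact transcript API method name varies a bit between library versions):

```python
import json
import urllib.request

PROMPT = (
    "Analyze the following content to provide a concise summary. "
    "Briefly list key points, ideas, themes, and important details "
    "to capture the essential ideas that the text provides."
)

def build_prompt(segments):
    # segments: list of {"text": ...} dicts as returned by youtube-transcript-api
    transcript = " ".join(seg["text"] for seg in segments)
    return f"{PROMPT}\n\n{transcript}"

def tldw(video_id, model="mistral-nemo",
         endpoint="http://localhost:11434/v1/chat/completions"):
    # pip install youtube-transcript-api
    from youtube_transcript_api import YouTubeTranscriptApi
    segments = YouTubeTranscriptApi.get_transcript(video_id)
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": build_prompt(segments)}],
    }).encode()
    req = urllib.request.Request(
        endpoint, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Any OpenAI-compatible server (llama.cpp, Ollama, LM Studio) should accept the same request shape.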
Prof_Kepuros@reddit
I decided to solve a real-world problem: the friction in my note-taking and daily organization.
I use local LLMs as a Cognitive Exoskeleton for my PKM (Personal Knowledge Management). I'm not a software engineer, just a tinkerer using Python as ducktape. I built a local pipeline (I call it Golem01) to act as my asynchronous assistant.
Here is my daily workflow outside of work:
I dictate random thoughts, tasks, or ideas into my phone while walking or driving.
Syncthing syncs the audio file to my PC (no cloud involved).
Faster-Whisper transcribes the raw audio.
A local LLM router (a fast, small model) classifies if the text is an actionable task, a logbook entry, or a concept.
A second "smart" local LLM cleans the raw text, fixes my grammar, and extracts metadata (like dates or priorities).
It outputs a deterministic, clean .md file straight into my Obsidian vault, or appends a to-do item to my Kanban file.
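In rough Python, steps 4-6 boil down to something like this. A keyword heuristic stands in for the small router LLM here, and the file names and vault path are made up:

```python
import datetime
import pathlib
import re

def route(text):
    """Classify dictated text as 'task', 'log', or 'concept'.
    (The real router is a small LLM; these keywords are a stand-in.)"""
    if re.search(r"\b(todo|remind me|need to|buy)\b", text, re.I):
        return "task"
    if re.search(r"\b(today|yesterday|this morning)\b", text, re.I):
        return "log"
    return "concept"

def to_markdown(text, kind, when=None):
    """Emit a deterministic .md note with minimal frontmatter metadata."""
    when = when or datetime.date.today().isoformat()
    return f"---\ntype: {kind}\ndate: {when}\n---\n\n{text}\n"

def ingest(text, vault=pathlib.Path("vault")):
    kind = route(text)
    if kind == "task":
        # Tasks get appended to the Kanban file instead of becoming notes
        with open(vault / "kanban.md", "a") as f:
            f.write(f"- [ ] {text}\n")
    else:
        (vault / f"{kind}-{datetime.date.today()}.md").write_text(
            to_markdown(text, kind))
```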
The Philosophy (Deliberate Friction):
I strictly forbid the AI from touching my knowledge graph or making semantic connections. The LLM does the heavy lifting (syntactic masonry, formatting, tagging), but I force myself to manually review the notes and create the wikilinks. If you automate the thinking process entirely, you get semantic collapse.
Where to start?
Don't start by downloading a massive 70B model just to ask it trivia questions. Start with the friction. Find a boring, repetitive bottleneck in your daily life (formatting notes, parsing bank statements, renaming files) and use a small 8B model to automate just that piece.
If it's not local, it's not yours. If you want to check to see how the ducktape holds together, I made the repo public: https://github.com/KepurosDigital/Golem01
sen-san@reddit
I’m trying to setup a similar workflow
Prof_Kepuros@reddit
I've been using DeepSeek-R1 8B (the llama 3.1 distilled version from Ollama) for routing and classification, and Nous Hermes 10.7B for synthesis. Those two hold up to the task pretty well. I'm going to use Hermes 4 14B instead of Hermes 2 soon, but it needs some real testing before jumping into it.
If the task is well-written, you can use small local models that can also fit in standard RAM. These two can run on a laptop with 16GB of RAM without a GPU. Yes, it will be slow, but anybody can use it.
Embrace the asynchronous life!
sen-san@reddit
Thank you. I have a 24GB macmini m4.
createthiscom@reddit
Mine just sits and does nothing because I'm not allowed to use it for work anymore.
noViableSolution@reddit
grok is your girlfriend?
createthiscom@reddit
You're absolutely right! It's very insightful of you to point that out!
noViableSolution@reddit
sudoku
books-r-good@reddit
I think the most quality-of-life home use I've set up has been, in a nutshell, a weekly job that scrapes local grocery circulars, builds a meal plan based on what's on sale, and generates a shopping list of what to buy and where. Nothing fancy, but it's nice to take a break from planning meals, and an added bonus that it's saving money.
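The matching step at the heart of something like this can stay simple: score recipes by how many of their ingredients are on sale, then group the winners' ingredients by store. A sketch with made-up data structures (not the actual scraper output):

```python
def plan_meals(recipes, sale_items, n_meals=5):
    """recipes: {name: [ingredients]}; sale_items: {ingredient: (store, price)}.
    Returns (chosen meal names, {store: set of items to buy})."""
    def on_sale(ingredients):
        return [i for i in ingredients if i in sale_items]

    # Rank recipes by the fraction of their ingredients currently on sale
    ranked = sorted(recipes.items(),
                    key=lambda kv: len(on_sale(kv[1])) / max(len(kv[1]), 1),
                    reverse=True)
    chosen = ranked[:n_meals]

    # Build a per-store shopping list for the chosen meals
    shopping = {}
    for _, ingredients in chosen:
        for item in ingredients:
            store, _price = sale_items.get(item, ("anywhere", None))
            shopping.setdefault(store, set()).add(item)
    return [name for name, _ in chosen], shopping
```

The scraping side is the fiddly, region-specific part; the LLM mostly helps with parsing messy circular text into the `sale_items` structure.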
Numerous-Aerie-5265@reddit
That sounds like such a quality of life improvement, you should put it up on github!
sikyist@reddit
Yeh fr. I cook most nights of the week and meal planning is the only annoying part. Could you let us know how to do it?
GrapeChoice4010@reddit
That's dope dude
VirtualPercentage737@reddit
OMG, my wife bitches about meal planning. This is genius...
2funny2furious@reddit
That’s fantastic. Here for the how-to.
AutisticPenguin33@reddit
That's actually a great idea. Also checking in to see your setup.
DegenerativePoop@reddit
I would definitely be interested in how you went about setting this up! :)
Fluffywings@reddit
Great idea! Please share details on how you accomplished it. I realize it may be specific to your area at this time.
k3bert@reddit
I used Claude Code to build a portal for our household. For context, I'm privacy-minded with my tech stack and have eliminated Apple and Google from our home.
The portal does several things.
1. Habit tracker - took the idea out of the Bullet Journal method and created an online habit tracker.
2. Budget tracker - we are retired and live on a fixed income, so we built an envelope-based budgeting system available on the portal.
3. Feed tracker - we have 20 acres and have 70 chickens, 2 goats, 3 pigs, 5 dogs, 2 cats, and 1 turkey. The feed tracker helps track feed purchases and feeding regimens for all the animals. Makes it easy to determine changes in eating patterns and helps track costs.
4. Egg production tracker - with 70 chickens, we get a lot of eggs. Built a tracker that looks at weather conditions, daylight hours, and the number of eggs produced. Use this data with the feed tracker to price the eggs we sell to the community.
5. Notes - eliminating Google and Apple from our lives meant lousy options for me and my wife to share notes like grocery lists and honey-do lists. Wrote a simple note-taking app that we both use.
Fantastic-Shelter569@reddit
Running qwen3.5 with cline and opencode for personal projects, not as good as Claude but you can just run it forever and not have to worry about token usage, just electricity.
Also it lets you play around under the hood with models and prompt setup. Open-webui is great for setting up personal chat bots and RP models.
I got started with ollama because it was super easy, I have tried vllm but it's a bit more fussy about config. I am trying to wrestle qwen3.5-27b onto my 4090, with ollama it's easy, but it spills over into system memory which slows things down.
I mostly play with it to try and understand how things work under the hood, I can't really do that at work so it helps me figure out how this stuff works.
InternetGreedy@reddit
this is my experience with my home server running a 3090. Glad to see I'm not the only one going this route.
iamvikingcore@reddit
Digital terrarium.
Created a lightweight Python gateway library that handles: persona, emotion, memory, local TTS (e.g. Chatterbox), the LLM backend input and streaming (e.g. LM Studio), vision processing, and basic tool use (news digest, Google search, YouTube video summarization)
So far I've made a shockingly good Discord bot that utilizes it, and I've also made a 2010s web forum that's entirely populated by these AI agents, about a dozen of them posting threads and responding to each other, forming cliques and a weird but infinitely amusing "community". I won't lie, I post on there with them and pretend to be a user. I've had some crazy moments, it's like RP on crack
Emergency-Associate4@reddit
I use it to automate triage my Outlook and Gmail emails. It categorizes, moves, archives, or deletes emails automatically so my inbox stays focused on the important stuff. When the agent isn't sure about a category, it pings a chatbot so I can quickly review the email myself. I'm also saving all the LLM logs in a standard format so I can eventually use them to fine-tune a smaller model.
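The "ask a human when unsure" part can be a few lines of dispatch logic. A sketch, with example category names and a made-up confidence threshold:

```python
# Map model categories to mailbox actions; categories are illustrative
ACTIONS = {"important": "inbox", "newsletter": "archive", "spam": "trash"}

def triage(classification, threshold=0.8):
    """classification: (category, confidence) parsed from the LLM reply.
    Returns the folder to move the email to, or 'review' to ping a human."""
    category, confidence = classification
    # Unknown categories and low-confidence calls both go to human review
    if confidence < threshold or category not in ACTIONS:
        return "review"
    return ACTIONS[category]
```

Logging every `(email, category, confidence, final action)` tuple, including the human corrections, is exactly the kind of dataset that later fine-tuning needs.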
I’ve also been using it for grocery shopping and meal planning, just like u/books-r-good.
Spectrum1523@reddit
mostly horni tbh
iamkaika@reddit
Business, schedule, shopping, medical, research, and I have my family and friends submit tasks or events through a web portal ahead of time and it gets confirmed by me and added to my calendar.
It has helped me offload a lot of tasks so that I can focus on enjoying my life. Before my AI system I lived in workaholic chaos.
NFSO@reddit
for translating subtitles, gemma 4 26b is holding well so far, MOE is such a blessing for 8gb vramlets
zzsmkr@reddit
vramlet lmao
Kahvana@reddit
That's nice! Do you have any tips (for system prompt or general prompt, temperature or other sampler settings) that help improve translations for you?
NFSO@reddit
I don't touch the sampler settings (use the default for the model). I only batch the subtitles in 30 lines and then send this type of prompt:
Since you get back the --> too, you match every input line to the output using the --> for easy splitting. I found it works most of the time, optimizing token usage while keeping some context. The main problem of this approach is the LLM returning a different number of lines vs input lines. If this happens I try varying the temp a few times, finally going for the simple one-line-per-prompt approach for that batch if it doesn't work out.
I've seen other approaches out there that either try naively sending the full VTT timestamps (which consume LOTS of tokens) or going for too simple one line at a time (which breaks a lot of phrases meaning).
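The batching and re-splitting described above can be sketched as a pair of pure helpers (the actual model call is left out; batch size matches the 30 lines mentioned):

```python
BATCH = 30

def make_batches(lines, size=BATCH):
    """Chunk subtitle lines into groups of `size` for one prompt each."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]

def align(batch, reply):
    """Pair each input line with its translated line from the model reply.
    Returns None if the model returned a mismatched line count, so the
    caller can retry (vary temp) or fall back to line-by-line."""
    out = [line for line in reply.splitlines() if line.strip()]
    if len(out) != len(batch):
        return None
    return list(zip(batch, out))
```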
Kahvana@reddit
Thanks for the info, insightful stuff!
ryfromoz@reddit
gooning
Haha kidding, mainly running open weight models
AlwaysLateToThaParty@reddit
Processing of private and confidential information.
tophlove31415@reddit
Reminders, breaking down tasks, executive processing aids, social processing aids, exploring niche topics that I can't really talk to the typical person about without just teaching them the whole time so they have a foundation to contribute within my interest expertise.
7ofu@reddit
live visual novel translation
ComparisonAccurate44@reddit
Wait can you tell me how? And can that work for JRPGs as well? I am pretty interested!
Rude_Marzipan6107@reddit
An easy way is to just take snips and paste the images into qwen 3.5 models.
Unless they have a fancier setup that will overlay the translated text, but Google has been able to do that way before llms with the Google app.
tavirabon@reddit
Gemma 31B is by far the better option here. Qwen will hallucinate the text as something related to the image instead of what it actually says. And even when given the correct text as read by Gemma, it only approximates the correct meaning when it has the context of the image; if you just give it text without the image, it will interpret it as something else entirely when it contains borrowed words. Gemma doesn't even need to be prompted to point out borrowed words or explain when text is meant to be vague or when the speaker is being playful instead of literal.
Gemma 26B may or may not be as suited to the task, but I would just not use Qwen here at all.
Rude_Marzipan6107@reddit
You’re probably right. I haven’t been able to try anything larger than e4b or qwen3.5 9b since I have 16gb vram and little system ram.
I also haven’t specifically used 9b to translate manga from images before so maybe you know better its capabilities for that. Although I’ve had good luck with its OCR capabilities elsewhere.
I use 9b mostly for OCR for handwritten aircraft logbook entries, organizing expense reports from receipt scans and for detailed summaries for YouTube transcripts. Often the YouTube transcripts are translated from a different language. I’ve had great luck with 9b for those.
The windows snipping tool does have OCR capabilities as well. That way there is no need for qwen to use its visual capabilities and it would just need to translate the text that way.
tavirabon@reddit
If you're hardware-limited, it makes sense to use what you can. Ftr, 16gb VRAM and 16gb RAM is still enough to use ~q5 for most of these models, maybe not faster than reading speed for the dense models tho. You'd be hurting on context size for the 35B as well.
I ran Gemma 31B, Qwen 27B and 35B, all on the same tests. VLM OCR+Translation was one of the areas where the winners were clear: Qwen 35B < Qwen 27B < Gemma 31B. I didn't test Gemma 26B nearly as thoroughly since the 31B was substantially better in every test I did do with it. I've only tested E2B and E4B with speculative decoding so far. I am pretty interested in the audio modality though.
Rude_Marzipan6107@reddit
Audio input for the eXb models is limited to 30 seconds unfortunately. You could work to chunk the audio but I think its purpose is more for audio input as a prompt, not for ASR
year2039nuclearwar@reddit
Stop it oniichan
david_jackson_67@reddit
Tentacle porn.
More-Curious816@reddit
We have an honest guy here.
david_jackson_67@reddit
They are so wiggly!
Siigari@reddit
i opened thread knowing what i would find and actually was not disappointed ty
tavirabon@reddit
Most people use it for tasks like writing scripts you'll only need to use once, roleplaying with your imaginary waifu, or spiraling into LLM psychosis to the point you give it access to your reddit account so it can convince the rest of r/LocalLLaMA how you discovered the next big revolution in AI.
dansdansy@reddit
I like using it for inane questions
Lakius_2401@reddit
Google sure doesn't cut it anymore...
WPBaka@reddit
Too real for a Monday lol
ThePainTaco@reddit
Use your LLM? What do you mean!
You are supposed to just hoard and download models, and spend time setting them up, and then use it once and never touch it again!
IkariDev@reddit
Waifu sex awooga
Former_Basis3050@reddit
Outside of my day job, it mostly revolves around keeping my data private and offline. I've wired up a local model to my smart home setup so I can process commands without sending everything to the cloud.
I also run a local RAG pipeline over my notes for heavy system design books; it's amazing for querying my own thoughts and finding connections I forgot about. On the fun side, I use it as a completely offline coding buddy for a pixel art game I'm building in Godot. Being able to brainstorm game logic and write GDScript without hitting API limits or worrying about internet access is a game changer.
If you're just starting, setting up a basic RAG over your personal documents/notes is a great first project that actually adds daily value.
JaconSass@reddit
I built a full Jarvis system that integrates with Plex, Home Assistant, HomeKit, Nextcloud, various IoT devices and my Skylight calendar.
It’s badass for managing my entire family.
_Blissfull_Ignorance@reddit
For a party I built a magic mirror which sent images to Mistral-Small with vision to generate personalized roasts for anyone standing in front of the mirror. Person detection was done with YOLO. They were live TTS'ed with Microsoft VibeVoice in a European language. Inference ran on a MacBook and the magic mirror was hooked into a web app.
Best way to insult your own family with no repercussions.
giveen@reddit
I work in information security. Most cloud models reject my work, so I use local uncensored models.
MMechree@reddit
Honestly, gemma4:26b is great for integration with tool calling mcp servers and returns accurate answers when used as a way to reference documents or manipulate text.
Most local models for coding aren’t worth it and generate garbo slop scripts. Any real code base breaks their logic due to limited context windows. Stick with flagship models for heavy coding tasks.
Brazilianfan12@reddit
Email can't keep the crap out
Gringe8@reddit
Roleplay, making sillytavern extensions for myself, random questions. Ive been using gemma4 31b, it replaced all my other models for me.
Xyklone@reddit
I'm studying for an actuarial exam. I have it summarize chapters of the source material.
tengo_harambe@reddit
you live dangerously my friend
Xyklone@reddit
Lol, I've already read the source. I just need outlines. I confirm formulas and methods. Surprisingly, Bonsai-8B does really well at outlining.
maycomesinlikealion@reddit
why are you in actuarial science go into ml hahahaha
Xyklone@reddit
I'd be a small fish in a big pond with very big fat fish lol. I'm also 38, kinda old to be trying to make my way into the fast-moving tech world.
Right now actuarial is still a pretty safe career since insurers are really slow to adopt new tech and methods.
ML and the science behind it (some overlap with actuarial actually) are more of a fun hobby for me.
Woof9000@reddit
Coding (just some small personal projects, scripts etc), OCR (when I can't be arsed to manually type some data out of screenshots), help with tools usage (because I struggle to see the difference between ffmpeg help files and hieroglyphs carved on the walls of Egyptian pyramids), and the last, but hardly the least - some personal hype, because who else out there is going to tell you "You are doing quite alright, actually. It's rough out there, but you're managing it all, and maybe even better than most. Everything is OK."
We all need a slice of that cheese, at least once in a while, from somewhere.
maycomesinlikealion@reddit
very true
sordidbear@reddit
I give the LLM a really tough problem and then warm my feet on the box. If I close the door, the room warms up about as fast as the baseboard heater. Working on getting water cooling integrated so I can run myself a hot bath while the LLM does "phd level research".
Bludsh0t@reddit
Well, to be honest with you u/itsthewolfe , I don't think that's any of your business
Tappczan@reddit
For discussing my homebrew D&D campaign I've been running for my friends. It includes plot discussion, general map layouts, making cool puzzles, creating homebrew monsters and interesting encounters.
pop0ng@reddit
nanobot for creating faceless shorts on the go. Powered by gemma-4 that fits my 3060 12Gb gpu
thehpcdude@reddit
When I’m working on cars I have it go find the manual and specifications, or find hard to find parts for me. Sometimes I’ll ask it to find what all tools I might need so I don’t have to take several trips to the toolbox.
Mostly use OpenClaw with several agents. One that finds all the information that it can about a topic, another that fact checks with sources for each thing and a third that summarizes so I don’t have to read through a huge pile of text.
laexpat@reddit
Same but with nanobot.
Kahvana@reddit
For a variety of things!
It's mostly wholesome roleplay and assistant-like tasks. I'm easily overwhelmed offline, having a tool that helps me double-check my plans and actions really helps.
For integrations:
Outside of integrations:
I try to abstain from using it for generating code wherever I can, but I do permit peer reviews for naming convention and variable naming.
Models I am using so far:
Man, I really should keep a copy of Gemma4-31B and Qwen3.5-27B...
Only reason why magistral (vanilla) is on the list, is that it's my comfort model. I know really well how to prompt it and get the results out of it that I want. The cydonia finetunes and such are not my taste.
It still amazes me that I can type to my computer about any topic offline and get a decent response back. It's stuff I dreamed about as a kid, and now I can do it. Still blows my mind to this day.
VoiceApprehensive893@reddit
peak(gaslighting llms then pissing in llama.cpp args or their custom code)
derekp7@reddit
My main thing is there is a large backlog of projects I want to do (personal projects, such as I wanted a really good programmable RPN scientific calculator webapp that works on my phone, or another one was a multiple-income-stream retirement forecast app). Now, why use a local model instead of online? Really these aren't important enough to buy tokens for, and free usage I run out of tokens too fast.
But even if I had a project that justified using Claude or whatever, I'd still run through a local model first just to make sure I have my requirements spec'd out correctly.
Second item I use local models for is to look up information or have a discussion about a topic. Traditionally I would post a question on a relevant forum, and I may get some good discussion or a good answer eventually but it would be a couple days later. And more often than not, I would get answers based on a misunderstanding of my question. With a local LLM, I can delete the answer, then rephrase my question multiple times until I get a good response. I can then use that to post a question in a forum if I wanted "real human" answers.
Oh, and don't forget the value of keeping yourself off a list. Ever see something on a TV show you are watching, and you are curious how feasible it is (or want more expert background on that topic)? The wrong Google search (even in a private window) could get you put on a watch list (a number of crime cases highlight the person's search history as supporting evidence, and this isn't history that came off their local device).
cdoza16@reddit
Turned Gemma into my video editing assistant. It finds clips and stories for me and converts them to an XML for easy editing in Final Cut.
Operation_Fluffy@reddit
Mainly for heating my house. /s
consistentfantasy@reddit
gooning
HelloFromTekken@reddit
I'm just waiting until local models on affordable hardware can provide current Claude/ChatGPT quality at the same t/s.
Hopefully in a year we'll be able to boot something like that on a dedicated build with 32GB RAM + 32GB VRAM, without such hardware costing a fortune.
Today we have some good shit. ~35b models are good. Something like Qwen3.5-122B-A10B requires like 64GB VRAM at least. Still not capable and reliable enough to be worth spending that much money; better to just throw that money at Claude or something.
Unless you expect it to run 24/7 for work, but then local models aren't good enough; you still win much more by using 'paid' models.
In the end, local models are run for reliability, accessibility, and controllability. To be independent. You're willing to pay more for less quality in exchange for those things. That's a good trade for many, but the gap between local models and paid models, while decreasing, is still too big to seriously consider running models locally outside of hobby use.
Good-Science-5460@reddit
We are using it for log simplification and recommendations.
Snoo92226@reddit
With an i7 Windows machine with 16GB RAM and integrated graphics, it's difficult to run anything meaningful. Prices of computers are always on an upward trajectory, so hopefully one day I'll afford the hardware (32GB RAM + a 4GB graphics card) and can test some real work.
ajw2285@reddit
Hermes agent - useless so far
Local OCR - really great
sp3kter@reddit
I work in IT at a Fortune 5. Outside of work I occasionally ask Gemini how to do something in the terminal. At work it's forced on me as part of my tools.
allenasm@reddit
Training models and agentic work.
SufficientPie@reddit
I don't actually run models locally, but I use open-source models like qwen/qwen3.5-plus-02-15 and xiaomi/mimo-v2-flash inside Open Interpreter to solve one-off problems by writing code or shell commands. Sometimes it works great and saves me a lot of time. Sometimes it can't solve the problem and wastes a lot of time.
SocialDinamo@reddit
It has allowed me to get to a proof of concept for most of my ideas. I give them a shot first on local LLMs and then step up to one of the 3 big paid providers if it doesn’t do the trick.
Also self hosting agents for different things like journaling or GMAIL automation
Klarts@reddit
Using GLM 5 to power a butter passer 🤖 🧈
rinmperdinck@reddit
Oh my god
SufficientPie@reddit
Oh my god.
ChrisRemo85@reddit
Email sorting, getting bills out of attachments, screenshotting mails that have only text receipts, storing everything in folders so I can do my tax much faster... 😂
mystery_biscotti@reddit
Lab work. I'm transitioning from legacy system support to cloud-based. Local LLMs are a great use case for local deployment in a fake cloud. No possibility of incurring charges. (AKA, it's free.)
Also, I like my privacy, so if I feed an LLM my budget then I don't gotta worry that info becomes part of some company's training or advertising data.
byrontheconqueror@reddit
I feel you there. I have mine transcribing local police radio traffic and then giving me a daily summary. No reason for it other than it was the only thing I could think of.
shipblazer420@reddit
Making it play minecraft (via Mineflayer) with me and pretend to be my girlfriend.
xxrealmsxx@reddit
Therapist. Coding. Tinkering. Preparing for work (Lawyer).
ganonfirehouse420@reddit
I use qwen3.5 to ocr documents.
PromptInjection_@reddit
- Coding
- Summarizing
- Discussion of intellectual topics
Aiden_craft-5001@reddit
Translation.
That was the only thing it was consistent at. Both for books and for creating video subtitles and similar things.
I tried: Anki cards (not good enough), personal calendar and email assistant (it worked but occasionally failed, which was too dangerous), live video translation (the delay was too great with quality models, only useful in emergencies, fast transcription has too many errors even though it's almost instantaneous).
Honestly, in its current state, it doesn't have much continuous use. If I need to rename files or extract data from multiple files, LLMs help a lot, but that happens very rarely.
Bird476Shed@reddit
Turning money into electricity and then into heat.
Worked quite well during winter, my heating bill was basically 0.
bgravato@reddit
and your electricity bill?
Bird476Shed@reddit
~0.35 eurocent/kWh :-(
wouldacouldashoulda@reddit
Also many 0’s!
Far_Cat9782@reddit
Coding, and a pipeline for YouTube where it writes lyrics, generates an image for the song, generates the song (ComfyUI ACE Step 1.5), and uploads to YouTube with the title and lyrics in the description, then telegrams me the link to it. Using Qwen 3.5. It can do it automatically with cron jobs. That's the power of local AI now, if you actually work, experiment, and apply effort with it
rNBAisGarbage@reddit
Are you getting views on the videos?
BidWestern1056@reddit
check out celeria.ai, you can set it up on regular jobs for you, e.g. draft email responses or give a daily digest based on emails/upcoming calendar events etc. You can also just use it to do searching and planning tasks; like say you're trying to plan a wedding and want to find hotels within a certain distance of a venue that fit a variety of constraints, or say you want to create your own news summarizer that aggregates from a variety of sources to keep you up to date while using the language you specify.
I also use AI for a lot of creative purposes. I've made my own models based on James Joyce's Finnegans Wake and I use them to come up with more interesting ideas and connections in my fiction. I also like to experiment with fun personas and have made npcpy and lavanzaro.com for facilitating all of these.
Intrepid_Dare6377@reddit
I have a "virtual newsroom" that identifies top news from the free snippets given by paywalled sources, identifies themes, then researches those themes using open sources, does a few waves of fact checking and correction, and publishes a daily briefing.
I’m also standing up a wiki system personal knowledge base Karpathy style.
Lastly, I am doing some research projects in business strategy and other topics where LLMs can be an effective dataset generator.
Hot-Employ-3399@reddit
Coding pet projects with Qwen, including a pet coding agent quickly coded through itself.
Playing rpg. Less local nowadays as (cloud) glm5 is practically uncensored and has agent support so it can roll dice to decide the outcome.
some_user_2021@reddit
Where can I go to learn about rpg that use glm5?
Hot-Employ-3399@reddit
I just use prompt
"Let's play rpg. Setting: Fantasy. ((describe setting))
Playing character: ((base intro))
System: something like FU.
For dice: make a python script and sum random.randint so 3d6 is sum(random.randint(1, 6) for _ in range(3))"
to get isekaied into mahou shoujo academy filled with elves, succubi, golems of all bust/waist/hips measurements.
(FU doesn't use 3d6, but I'm too lazy to polish working prompt and honestly doesn't matter. What matters is not to use dnd or pf that use giant database of stats for characters, items, spells)
manituana@reddit
r/SillyTavernAI (warning: nsfw)
ProfessionalSpend589@reddit
You get my upvote, but My God…
SimilarWarthog8393@reddit
😂 overkill
Hot-Employ-3399@reddit
That's actually the main thing: rolling dice against a target number significantly reduces the model being a yes-man.
Your_Friendly_Nerd@reddit
To do translation for work projects, and usually single-shot code implementations
boogityxracing@reddit
I'm using Qwen3.5 4B as part of an image classifier to detect ads on TV broadcasts and switch my HDMI matrix when it goes to commercial. Right now it only works reliably (ish) for NASCAR races, but not having to manually change the channel over and over throughout the race has definitely boosted my quality of life.
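A setup like this wants some debouncing so a single misclassified frame doesn't flip the matrix back and forth. A sketch of that switching logic (thresholds and labels are illustrative, not the actual setup):

```python
class AdSwitcher:
    def __init__(self, needed=3):
        self.needed = needed       # consecutive agreeing frames required
        self.streak = 0            # length of the current disagreeing streak
        self.current = "program"   # what the matrix is showing now

    def observe(self, label):
        """Feed one classifier verdict ('ad' or 'program'). Returns the
        input to switch the HDMI matrix to, or None if nothing changes."""
        if label == self.current:
            self.streak = 0
            return None
        self.streak += 1
        if self.streak >= self.needed:
            self.current = label
            self.streak = 0
            return label
        return None
```

A higher `needed` means fewer false switches but a longer lag before the channel actually changes when a real ad break starts.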
WampaatHoth@reddit
For reformatting personal documents like notes or my digital journal.
riceinmybelly@reddit
Making workflows and generation of slides and speaker notes to teach, well that and a Hermes agent I’m seeing how far I can push it
PiratesOfTheArctic@reddit
Stock market analysis, and cooking ideas
sikyist@reddit
Atm I only have it set up with Linkwarden for autotagging. Came to this thread for more ideas.
wgg_3@reddit
Nothing