What frontend do you guys use?

[-]

Napster3301@reddit

the "asks for frontend, gets coding agents" replies are the real answer. local llm community has collapsed to coding agents because thats the only space where local measurably wins vs frontier apis.

genuine question: anyone here actually daily-driving a local model for non-coding chat work? not "tried it once" but replaced chatgpt for general use?

[-]

Kapper_Bear@reddit

That's basically all I do with LLMs and I use LM Studio for that. It's simple enough for my simple uses. I have enough complexity running ComfyUI for image stuff. :)

[-]

meth_priest@reddit

figured id pitch in with a project ive been working on for past 4 months.

been building a complex local RAG tool focused on code and document understanding. Basically as user you point it at any folder containing source code, docs, mixed repos — and it indexes and LLM retrieves across it. you pick whatever model you have running locally and it adapts around it (auto-detect based on model+hardware+kb).

hybrid search with a reranker on top, indexing rates each file based on your hardware so it doesnt just choke on larger codebases. supports 20+ programming languages and most common document formats so far. mostly built it because existing tools made too many assumptions about what kind of content you were throwing at them or what hardware you were running. this one tries not to.

still finishing it up — will post on this sub once the website and beta launch

[-]

Ill_Barber8709@reddit

I use Claude Code with Paul Hudson's agent and skills with Qwen3.6 27b because it works very well for me. I get the "MCP are useless, model knows how to do that stuff already" idea behind Pi, but I prefer good ol' deterministic code. And I don't think bothering on a few thousands of agentic tokens is worth the hassle if it means you have to give all of your trust to the model's capabilities.

[-]

Borkato@reddit (OP)

Link to the agent? 👀 I tried googling but didn’t see much

[-]

Ill_Barber8709@reddit

https://github.com/twostraws/Swift-Agent-Skills

https://github.com/twostraws/SwiftUI-Agent-Skill

[-]

TechnoByte_@reddit

Llama-server is more than enough tbh.

Fast, lightweight, built-in to llama.cpp.

I used Open WebUI before but I find it too bloated

[-]

a_beautiful_rhind@reddit

asks for frontend

People spam all coding agents.

[-]

javasux@reddit

And what do you think agents are if not frontends?

[-]

relmny@reddit

that's, I guess, because many ppl here relate anything about LLMs to code.

[-]

LORD_CMDR_INTERNET@reddit

And so your contribution is?

Those tools are multimodal and can do any task, coding or otherwise

If you're thinking more like Claude Cowork then there's Eigent but what else?

[-]

ab2377@reddit

look at those down votes man! proves for most people in this sub most words are just jagons with no understanding hah! i will upvote this.

[-]

LORD_CMDR_INTERNET@reddit

lol who knows with reddit sometimes, I didn't even say anything controversial. guess i'd rather be right and downvoted than wrong and upvoted

[-]

a_beautiful_rhind@reddit

To me things like openwebui, sillytavern, mikupad are front ends. Agents are agents.

[-]

Awkward-Customer@reddit

OP is talking about using vim as their front end. Coding harnesses like Claude code definitely count in that context

[-]

a_beautiful_rhind@reddit

Sounds most like mikupad, tbh. Or they are doing FiM?

[-]

Borkato@reddit (OP)

Mikupad is actually the closest ngl, but when I wrote this post I was asking about actual frontends like sillytavern lol

[-]

LORD_CMDR_INTERNET@reddit

"front end" != "gui"

[-]

Stunning_Inside5182@reddit

I use Pi for my coding and then use BoltAI on Mac since that is my computer I use most of the time and the app is really polished and stable. I ran openwebui for around 6 months but personally I had issues with it being quite unstable and chats dissapearing and things. Still need to find another front end UI that can connect and be used with my iPhone.

[-]

daniel_nguyenx@reddit

Thanks for using BoltAI. FYI you can use the BoltAI mobile app for free while it's in beta, I'm going to release a really big update soon (same feature parity with Mac app including remote MCP servers). Try it here https://boltai.com/docs/boltai-mobile/getting-started

Thanks

[-]

Stunning_Inside5182@reddit

Oh thanks I didn't even realize this I will give it a go and will get back to you. For the MacOS app are there any plans to add support for searxng for local web search so I can keep everything local? Thanks alot for your work

[-]

daniel_nguyenx@reddit

Ah good idea. I think I had this feature request before but I forgot about it. I will try to add official support for it. But in the meantime, you can try to find a good MCP server for it.

[-]

Stunning_Inside5182@reddit

Thanks will do. Is there any chance you could look into adding an option to be able to choose an agent as the default for a particular project?

[-]

LORD_CMDR_INTERNET@reddit

pi

[-]

Objective-Error1223@reddit

Pi all the way, if you don't want the bare bones version just grab Little-Coder (it's designed for small models but I use it with 27B Q8 and it honestly works great).

[-]

ab2377@reddit

there never was any problems small models (or big) with pi or opencode, dont know what made up problem little coder solved.

[-]

Objective-Error1223@reddit

Dunno, I didn’t make the harness it just works. 🤷‍♂️

[-]

ab2377@reddit

and you never used pi or opencode that didnt work and you had to use little-coder?

[-]

Objective-Error1223@reddit

That’s like asking “and you’ve never used Gemma or Mistral that didn’t work and you had to use Qwen?”.

We’re in the age of evolving harnesses, models, IDEs, etc. Hell we even have different ways of running the various thousands of different models from llama.cpp, omlx, vmlx, vllm, lm studio to kobold.

What works for one person may not work for another due to their software/hardware limitations, OS uses, purpose of use, scope of the work, and finally their personal preferences.

[-]

film_man_84@reddit

LM Studio, sometimes Open Web UI, sometimes llamacpp-server web UI, sometimes Koboldcpp web ui and sometimes even Silly Tavern.

Most used ones are LM Studio and Open Web UI.

[-]

WishfulAgenda@reddit

VS Code with Continue Dev, OpenCode, LibreChat/LangFuse.

[-]

PeteInBrissie@reddit

I'm really liking LibreChat

[-]

WishfulAgenda@reddit

Danny’s done a great job with it and when paired with clickhouse it’s even better :-)

[-]

chris_0611@reddit

Frontend?!? What's that?

Raw OpenAI api calls straight from the commandline using curl

[-]

Borkato@reddit (OP)

Do you have an alias set up or anything or do you type out “{messages: [role: user, content…” every time?

[-]

ProfessionalSpend589@reddit

There are a few examples in the llama.cpp docs which you can copy from and just edit the prompt.

[-]

Alternative_Web7202@reddit

He just forgot to put /s in the end

[-]

ea_man@reddit

Come on man don't be lazy, telnet there!

[-]

ab2377@reddit

everyone's life should be this simple!

[-]

Ill_Barber8709@reddit

You do not like bloated software like Pi I see. A man of commitment.

[-]

Southern_Sun_2106@reddit

Telegram, so that I can talk with my agent from anywhere. Plus, it can generate music for me, send my any file I want (and it has access to), do long-horizon agent work (I don't need to keep telegram alive, it will just notify me). Of course, you have to be careful what you give it access to, just like with any other setup. In use Qwen3.6 35B A3B q4km gguf at the moment running locally on a dedicated machine.

[-]

the-username-is-here@reddit

Claude Code.

Tried pi and opencode - they are nice and light, but are really bare bones and need a lot of work to start using properly.

[-]

spammmmmmmmy@reddit

I made a html chat page to manage ollama api requests. Cut/paste from code blocks, copy files etc.

[-]

Tormeister@reddit

OpenWebUI for big LLM queries

OpenCode for local LLM coding

[-]

temperature_5@reddit

I'm dog fooding my own chat/agent UI, along with llama-server. So simple to make a chat client, so impossible to stop adding features... :-D

[-]

Barafu@reddit

I use Cherry Studio. When I needed an application, it was the only decent app available on FlatHub. Since that, I did not have a reason to change. Chatbox was nice too.

But I also use a lot the webpage chat of DeepSeek. Its ability to do internet search is majestic. It has almost replaced DuckDuckGo in my daily use. I wish I could set up an LLM search that good with a local LLM.

[-]

Risen_from_ash@reddit

My agent’s web search game got better by orders of magnitude when I started using Firecrawl instead of my locally hosted SearXNG. Qwen 3.6 35b a3b with Firecrawl feels like GPT 5.5. Blows my mind.

[-]

meth_priest@reddit

figured id make an account to pitch in

been building a local RAG tool focused on code and document understanding. point it at any folder like source code, docs, mixed repos — and it indexes and retrieves across it. you pick whatever model you have locally and it adapts around it. hybrid search with a reranker on top, indexing rates each file based on your hardware so it doesnt just choke on larger codebases. supports 20+ programming languages and most common document formats so far.

mostly built it because existing tools made too many assumptions about what kind of content you were throwing at them or what hardware you were running. this one tries not to. also works 100% offline - so usable for sensitive files

still finishing it up — will keep hear from me once the website and beta launch

[-]

ansmo@reddit

Also made my own from scratch. It's obviously not as polished as the big boys but it has features tailored to me and my usecases (reading, writing, tts, research, granular experimentation and pattern tracking, daily scouting reports, artifacts, image gen and editing, etc) and I know exactly how all of it works. I use it for basically everything except for coding. For coding agents, I'm on opencode and a fork of openclaude.

[-]

skryking@reddit

My own custom tui that I spec'd out for qwen to code.

[-]

bgravato@reddit

Long time vim user here...

I have now been experimenting with VSCodium + continue add-on for AI integration (+ vim add-on, to get vim shortcuts on VSC). I'm still getting used to it, but the vim add-on to replicate vim shortcuts/modes really helps in the transition.

[-]

suprjami@reddit

There is https://github.com/madox2/vim-ai if you prefer the VSCode style of chat/rewrite/fill-in.

OpenCode obeys your EDITOR environment variable if you want to drop in and out of agent mode.

[-]

o0genesis0o@reddit

I have two "frontend"

One is a custom made productivity system + workflow + agent web app that I built for myself and my wife to use. It's accessible on all of my devices via VPN and good enough for tasks that we usually use chatgpt or gemini or whatever web frontend.

The other "frontend" is Pi agent + Obsidian or neovim. This one is for local coding or knowledge base management. I wish to migrate more responsibility to the web app frontend, but it's not that convenient vs opening a terminal.

[-]

TheDapperYank@reddit

I used Claude to build my own front end with tool calling, memory, compaction, etc and use llama.cpp for the backend. Nothing against anything else already on the market, but I kind of viewed this as more of an academic exercise and wanted to spec out my own with my own feature set.

[-]

PigSlam@reddit

I’m doing this with Codex right now.

[-]

Fit_Squash6874@reddit

Lmstudio

[-]

elijahebanks@reddit

Your mum's

[-]

florinandrei@reddit

telnet localhost 8080

[-]

Pleasant-Shallot-707@reddit

Pi and Hermes

[-]

drunnells@reddit

Open WebUI for conversation, Hermes Agent CLI or dashboard for getting work done, trying to get into Open Code. For non-local I like the direction the Codex desktop app is going.

[-]

ayylmaonade@reddit

I've been using Open-WebUI in docker since like, I wanna say jan-feb 2025? That's about the time I got into local LLMs (and AI in general). It's definitely not perfect and has its issues, but I find it to be the best all arounder. It's like a customisable ChatGPT.com, sort of.

Just the ability to add my own custom tools and have a debian sandbox, memory, crons, great RAG, and the ability to add any openAI style endpoint (openrouter, etc) makes it better than everything else out there for my use cases.

I do use Hermes Agent alongside it though. I have open-webui setup in firefox as the sidebar AI chatbot, so it's just convenient.

[-]

fallingdowndizzyvr@reddit

Llama-cli.

[-]

Miriel_z@reddit

I use my own gui made in Python. Not sure if it is the best way, but it gives me the control.

[-]

Some-Cauliflower4902@reddit

Same here. Build the tools as I need it. It’s more lean and efficient.

[-]

Borkato@reddit (OP)

I did the same!

[-]

Miriel_z@reddit

Well then, our idea is not so stupid it looks like🤣

[-]

philmarcracken@reddit

Late cli has done more work with less headaches for me. little coder is faster and likes it rough.

I'm considering my own using Pi and copying aspects of little coder(model swapping) and late(subagent splitting, plan first). Both have annoyances

Smallcoder just shits the bed for me

[-]

Hot-Employ-3399@reddit

From agents I just switch between anything now. After a month it feels all of them can be very bad.

Front-end chat, I use rarely and then it's usually cloud models. Usually several at the same time(Glm, Gemini, chatgpt).

When it's not cloud, it's alt tab to current agent and question is asked to it.

[-]

kevin_1994@reddit

Open webui still good. I have no reason to change it.

[-]

Alan_Silva_TI@reddit

I mostly use CLI harnesses like PI or Hermes agent.

I also have some self-made tools that connect to the server with their own purposes like brainstorming or text correction.

[-]

rmhubbert@reddit

For coding, I use Neovim with Opencode as the backend via https://github.com/sudo-tee/opencode.nvim for agentic & interactive coding, and https://github.com/cursortab/cursortab.nvim for code completion.

For general chat, Open WebUI.

[-]

LeMochileiro@reddit

Since the models run on a different machine with a dedicated GPU (a node in a K8s/K3s cluster), I need more flexibility to run different models with different configurations and parameters, without having to access the machine via SSH or create containers/pods all the time.

LocalAI is what has been serving me quite well lately. It deploys different backends (like llama.cpp or Vulkan), download different Hugging Face LLMs, change parameters, check usage, and easily integrate with tools like OpenCode. Everything is done directly through the frontend.

I'm open to exploring alternatives, but it has to be in Docker/a container for me to be able to run it on Kubernetes.

[-]