Is there any all-in-one app like LM Studio, but with the option of hosting a Web UI server?
Posted by HRudy94@reddit | LocalLLaMA | View on Reddit | 57 comments
Everything's in the title.
Essentially, I do like LM Studio's ease of use, as it silently handles the backend server as well as the desktop app, but I'd like it to also host a web UI server that I could use on my local network from other devices.
Nothing too fancy really; this will only be for home use and whatnot. I can't afford to set up a 24/7 hosting infrastructure when I could just load the LLMs when I need them on my main PC (Linux).
Alternatively, an all-in-one web UI would work too, I just don't want to launch a thousand scripts just to use my LLM.
Bonus points if it is open-source and/or has web search and other features.
versking@reddit
Anything can be anything with docker. Make a docker compose once that launches all the things. Enjoy forever.
Swift8186@reddit
Running vLLM in Docker seems to be a pain... What I'm looking for is some kind of "LM Studio-like" orchestrator running with vLLM as the backend, with a web GUI where I can easily download, delete, configure models, etc. I think I might have to write one, I don't know... can't find it anywhere.
versking@reddit
I remember reading something about openwebui adding direct Ollama support. So maybe write a compose for openwebui and Ollama?
Most things with Docker are a pain the first time around, but once you figure it out, it's figured out forever.
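Something like this minimal sketch should do it (image tags, ports, and the OLLAMA_BASE_URL wiring are my assumptions about the defaults; adjust as needed):

services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama:/root/.ollama
    gpus: "all"
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"  # web UI on http://localhost:3000
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434  # reach the ollama service by its compose name
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama:
  open-webui: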
Swift8186@reddit
Nope, got it! The final solution for home, enterprise, etc. for me (right now) is this one:
versking@reddit
My only note there is to see if you can integrate health checks if you haven't already. Docker Compose is self-healing once you get health checks involved.
Swift8186@reddit
Yes, you can add health checks, but that wasn't my point... I updated the compose as well, so now it should work. It's pretty awesome.
versking@reddit
Nice! Glad you got what you were looking for.
Rabo_McDongleberry@reddit
Don't know enough about Docker. Point me in the right direction, kind person.
Swift8186@reddit
If on Windows, just install Docker Desktop (Windows | Docker Docs), then save this as a docker-compose.yml file and run it...
services:
  gpustack:
    image: gpustack/gpustack:latest-cuda12.8  # for 5090; keep latest for other cards
    restart: unless-stopped
    ports: ["8080:80"]
    volumes:
      - gpustack-data:/var/lib/gpustack
      - ./models:/models
    environment:
      - GPUSTACK_CACHE_DIR=/models/gpustack
      - HF_HOME=/models/hf
      - HUGGINGFACE_HUB_CACHE=/models/hf/hub
      - TRANSFORMERS_CACHE=/models/hf/transformers
      - XDG_CACHE_HOME=/models/.cache
    gpus: "all"

volumes:
  gpustack-data:
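Then docker compose up -d from the same folder, and the GPUStack web UI should come up on http://localhost:8080 (given the 8080:80 port mapping above).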
TripAndFly@reddit
Cole Medin on YouTube, his 2+ hour video released 2 days ago, an AI masterclass or something. He sets it all up in Docker and it's a great setup.
Rabo_McDongleberry@reddit
Awesome. Thank you. I'll check it out!
aseichter2007@reddit
Kobold.cpp is the best one. Idk how no-one said it before. It does it all.
Swift8186@reddit
"Other formats such as safetensors and pytorch.bin models are not natively supported, and must be converted to GGUF/GGML! (see below)" ...sooo, no, "it does not it all"
aseichter2007@reddit
He said for home use, and it's a single command to convert these days. I don't think kobold can convert them, though.
You're right. It satisfies the use case in this thread though.
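For reference, the conversion is roughly this with llama.cpp's converter script (a sketch; the model directory, output name, and quant type are placeholders):

# from a llama.cpp checkout, convert a Hugging Face model directory to GGUF
python convert_hf_to_gguf.py /path/to/hf-model-dir --outfile model.gguf --outtype q8_0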
Swift8186@reddit
Hi, just found the best solution for home, enterprise, etc... testing right now... so far it's great.
tiffanytrashcan@reddit
Single exe, image gen, TTS, supports the vast majority of models, web search, world info / context... The list goes on.
fish312@reddit
kobo is best!!!
LA_rent_Aficionado@reddit
Ease of use would definitely be llama-server with its very simple/basic web UI.
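Something like this (a sketch; the model path and port are placeholders), then browse to http://<your-pc-ip>:8080 from any device on the LAN:

# serve a local GGUF model with the built-in web UI, reachable from the network
llama-server -m ./models/your-model.gguf --host 0.0.0.0 --port 8080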
daaain@reddit
LM Studio + OpenWebUI work quite well together and you can share it via a VPN like Tailscale if you want to access it from anywhere.
mitchins-au@reddit
Is it easy enough to do? Open WebUI is married to Ollama, of all things.
daaain@reddit
Yes, OpenWebUI can even list the models you have loaded, since LM Studio has OpenAI-compatible endpoints.
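For reference, a sketch of the wiring, assuming LM Studio's server is on its default port 1234 on the host (Open WebUI's OPENAI_API_BASE_URL variable points it at any OpenAI-compatible endpoint; the --add-host line is needed on Linux):

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OPENAI_API_BASE_URL=http://host.docker.internal:1234/v1 \
  -v open-webui:/app/backend/data \
  --name open-webui ghcr.io/open-webui/open-webui:main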
_hephaestus@reddit
For whatever reason, while my Open WebUI instance can validate its connection to LM Studio, no models appear visible in Open WebUI. Were there any configuration steps you had to do?
daaain@reddit
No, but OpenWebUI seems to cache, and sometimes it takes a bit of faffing to get it to re-request the list.
HRudy94@reddit (OP)
Yeah, but I'd have to launch both at once. Though yeah, I should start using Tailscale.
kironlau@reddit
You can always set them to auto-launch when your OS starts. Open WebUI is very lightweight, and LM Studio is a resting API (if no model is loaded, it's just a background service).
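On Linux, a minimal sketch would be a systemd user unit, assuming a pip-installed open-webui (the ExecStart path is a guess; save as ~/.config/systemd/user/open-webui.service and enable with systemctl --user enable --now open-webui.service):

[Unit]
Description=Open WebUI
After=network-online.target

[Service]
# assumes open-webui was pip-installed into ~/.local
ExecStart=%h/.local/bin/open-webui serve --port 3000
Restart=on-failure

[Install]
WantedBy=default.target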
kris32@reddit
I made a Windows script where I launch it all at the same time.
No_Conversation9561@reddit
could you please share it?
HRudy94@reddit (OP)
Yeah, but you can't close them both at the same time. I think I'll make my own launcher/manager app that launches them at once and closes them when it's closed.
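Something like this bash sketch is probably most of it, assuming the lms CLI that ships with LM Studio and a pip-installed open-webui (names and flags are my guesses at the defaults):

#!/usr/bin/env bash
# rough sketch: start both parts together, stop both when the script exits
lms server start                # LM Studio's headless API server
open-webui serve --port 3000 &  # web UI in the background
WEBUI_PID=$!
trap 'kill "$WEBUI_PID"; lms server stop' EXIT
wait "$WEBUI_PID"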
tcarambat@reddit
https://github.com/Mintplex-Labs/anything-llm
HRudy94@reddit (OP)
This could work but it would be 2 apps then :/
fatihmtlm@reddit
You don't have to, it can handle models itself too. Don't know about the web UI feature, but it's a great program.
HRudy94@reddit (OP)
AnythingLLM's llama.cpp is not enabled on Linux yet for some reason, so I'd indeed have to also launch LM Studio, which makes it 2 apps.
fatihmtlm@reddit
You can let Ollama run in the background. It unloads the model after like 5 minutes and shouldn't use considerable power at idle.
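If that timeout bothers you, Ollama reads an OLLAMA_KEEP_ALIVE environment variable; a sketch for a shell-launched server:

# keep models loaded for 30 minutes instead of the ~5-minute default
OLLAMA_KEEP_ALIVE=30m ollama serve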
National_Meeting_749@reddit
Talk about being in the advertising trenches lol.
I'm really enjoying AnythingLLM, especially for agents. Just wondering though, I'm trying to have it edit spreadsheets for me and SQL databases are much more than I need, any idea if/when that might be implemented?
tcarambat@reddit
For editing specific sheets or SQL databases you could accomplish this with a custom skill, but that being said, what specific task did you want to complete?
National_Meeting_749@reddit
So, for a creative project I'm trying to set up a template for character sheets filled with both text traits and numerical stats, both for my own reference and with linked cells whose formulas manipulate the data. I need to be able to fill in and edit those templates, and have it reference the info in the sheets.
The base-level Excel functionality, basically, but I do need the ability to format them in visually appealing ways.
I'm not a coder, like at all. I could vibe code it but.... That feels like handing a genius 6 year old power tools and having him teach me how to build a shed.
It seems like it's possible. But I've been searching and haven't found anything that works.
I've seen computer-use agents that might be able to do what I want, but I'm so close to what I need with AnythingLLM and I'd love to be able to have everything I need in one place.
roguefunction@reddit
Msty (https://msty.app/) is really good, it's free, but not fully open source. Another one is AnythingLLM. Both have an Ollama backend option and a decent interface. I prefer Msty.
BumbleSlob@reddit
Sounds like you should just use Docker Compose with Open WebUI and Ollama defined as services. Open WebUI provides a mechanism to do this in their docs.
BumbleSlob@reddit
https://docs.openwebui.com/getting-started/quick-start
SM8085@reddit
llama.cpp's llama-server hosts a very basic web UI by default. It's hosted at the server root, not under an API endpoint.
I have a DNS entry for my LLM rig, so I go to that address with the right llama-server port and it pops up.
ttkciar@reddit
Yep, I came here to suggest this, too.
No "thousand scripts" needed, just a single command line command, and ta-da, web UI inference.
Like you said, though, it's pretty light on the features.
10F1@reddit
Open webui + lm-studio
celsowm@reddit
How many users?
mike3run@reddit
openwebui + ollama
overand@reddit
It really sounds like your best solution might be to use e.g. Ollama and Open WebUI, and just make sure they're both set up to launch automatically. I think Ollama doesn't keep models loaded in memory past a certain timeout, so it shouldn't use much at idle.
HRudy94@reddit (OP)
Yeah, I'm thinking about making my own wrapper app to seamlessly launch both parts at once, akin to LM Studio, and also close them at once.
Does Open WebUI or the others let you unload or switch models without having to restart the backend?
Asleep-Ratio7535@reddit
Jan.ai? The GUI is nice, though. I think the functionality is similar to LM Studio.
HRudy94@reddit (OP)
Does it also host a web UI? If so, how can I access it?
I know it can host an API, but I don't know if it has a web UI.
Asleep-Ratio7535@reddit
What do you mean by web UI? It already has a full GUI...
HRudy94@reddit (OP)
Yeah, I know, but can it also host its GUI as a web UI so I can access my chats and stuff on other devices?
Asleep-Ratio7535@reddit
Oh, I see what you mean by web UI now. You can, if you have another lightweight app installed. They are servers.
blurredphotos@reddit
https://msty.app/
HRudy94@reddit (OP)
Looking at it again, we're close, but unfortunately they only expose the API and not a web UI, and there's no Android app to use the Msty remote feature.
blurredphotos@reddit
I have used https://chatboxai.app/en on Android to connect.
https://msty.studio/ If you want to use web.
There are paid options as well.
Nomski88@reddit
LM Studio has a built-in server with an OpenAI-compatible API.
HRudy94@reddit (OP)
Yeah, but that's only an API server, not a web UI.
opi098514@reddit
Oobabooga