TheaterFire

Is LLM Studio good?

Posted by Top_Sonic@reddit | LocalLLaMA | View on Reddit | 91 comments

Is there any alternative software to run llms in windows

Reply to Post

91 Comments

_underlines_@reddit

A lot of alternatives: https://github.com/underlines/awesome-ml/blob/master/llm-tools.md#native-guis
View on Reddit #38184615

macheteBlade@reddit

this has a great compilation of tools, but hasn't been updated it recently, do you know other sources like this that are more up-to-date?
View on Reddit #77046427

pablogabrieldias@reddit

Lm Studio is very good. I have used it since its inception, and although I do not really agree with the new changes that were made to its interface, it is still an excellent option. What I really don't understand is how someone can use Ollama. Look, I have analyzed it from all points of view and I don't understand why someone would use a program like that.
View on Reddit #38185203

GimmePanties@reddit

Because when ollama is set up it will run in the background and launch on startup and you can just assume it’s there when your client applications need to call it.
View on Reddit #38186302

Flimsy-Tonight-6050@reddit

can't I make lm studio launch from start aswell?
View on Reddit #38190131

Impossible-Value5126@reddit

Run msconfig in search and add it to startup.
View on Reddit #71387986

GimmePanties@reddit

Probably? Look, don’t get me wrong, I love LMStudio and it is my main drive for downloading and trying new models and sometimes I’ll start it server as a quick endpoint for an app. But for stuff I want available all the time, I get set up on Ollama. It seems really smart about being able to load and unload models on demand, so with LM you’re loading the model yourself when you need it, and ejecting it when you’re done to manage memory. And if your app was connected to it, and expecting a different model, it would fail until you go into LM and manually load it.
View on Reddit #38190768

ImaginaryRea1ity@reddit

Can you run ollama on your PC or a mac and then access it on your phone on a local network?
View on Reddit #38205156

GimmePanties@reddit

Yes
View on Reddit #38219971

ramplank@reddit

Depends on your use case, I use it to spin up models and then I connect to the models with other tools and in that scenario I dont need all the fluff of lmstudio. sure I can do that with lmstudio but why have all the extra fluff when I'm up and running with 1 terminal command
View on Reddit #43854812

muxxington@reddit

The answer is simple: People don't want all-in-one software. Especially companies, teams, groups etc. don't want all-in-one software. They want to host one more models and expose them via an OpenAI compatible API and then connect clients of all kind to it instead of having a full blown gaming PC under each desk.
View on Reddit #38218550

pablogabrieldias@reddit

But Lm Studio is free. It is not open-source, but it is totally free
View on Reddit #38219181

muxxington@reddit

No. It is not even close to free. It is *for* free. At least by now and only for private non-commercial use.
View on Reddit #38225952

Eugr@reddit

Well, there are several reasons: 1. Ollama runs as a server and you can connect different GUIs and tools, such as excellent Open-WebUI. 2. Open-source vs closed source. 3. Licensing. Ollama has MIT license, LM Studio requires you to contact them if you want to use it for work purposes.
View on Reddit #38221332

first2wood@reddit

Yes, I guess it's because that's the most famous open source one because it created early. Even a lot of developers integrate ollama into their projects first. I used ollama first because it's popular. Then I tried others and settled down with LMS. 
View on Reddit #38206118

constPxl@reddit

(not proven personally and i dont have the data but) could it be that the "interface-less" console base ollama uses less system resources than lm studio? dont get me wrong, i love lm studio showing me all the settings and configs running the model guess what im saying is if i know what im doing, i'd use the least resources required app to run the model so those resources can be allocated elsewhere (tts, stt, rag etc)
View on Reddit #38190948

Ill-Total9416@reddit

That’s quite a lot, for example, LM Studio, Ollama, GPUStack, all based on llama.cpp.
View on Reddit #38181198

cantgetthistowork@reddit

Wish LMS would work properly with multiple GPUs
View on Reddit #38218875

Atari-Katana@reddit

I just wish I could afford multiple GPUs these days.
View on Reddit #65559903

visionsmemories@reddit

yes, it is yes, there is
View on Reddit #38180986

bearbarebere@reddit

I recommend trying LM studio before all others because it runs GGUFs faster than any other program I've seen (tied with ollama but ollama is annoying to set up if you're not techy). so you'll know how fast your computer can run them when it's set up right. I started with Oobabooga and just accepted the speeds of GGUFs were 4-9t/s. Nope. LM studio gets them up to **40t/s**.
View on Reddit #38184979

road-runn3r@reddit

>it runs GGUFs faster than any other program I've seen Not in my experience. Kobolt can be sometimes up to 15÷ faster. (on a 3060ti)
View on Reddit #38191729

bearbarebere@reddit

I tried kobold and I vaguely remember it being lackluster. So strange.
View on Reddit #38191832

entangledloops@reddit

Comparing LLM studio and Kobold is a fool's errand. You don't know what code is being used. One may be 15% faster, but still 80% slower than possible. You need to load the model yourself with barebones code to get the optimal estimate. Not claiming that's an easy task if you aren't an ML engineer, but it's the only correct way at this time.
View on Reddit #55965693

road-runn3r@reddit

Quite the opposite but to be honest there are a few things that you need to set up or check with Kobold. First, set GPU layers to -1 so it shows max layers then set it manually. (according to your VRAM) Go to tokens and enable flash attention. Did you get the cuda12 exe version? (if you have a newer Nvidia GPU you should) ps. don't mind the downvotes, people thought it was an opinion when you are just trying to learn
View on Reddit #38227908

CoyRogers@reddit

They might mean CPU only, in my CPU only laptop it runs way way faster then ooogabooogs does, a ton faster. Loads models instantly rather then taking forty seconds and responds so much faster
View on Reddit #38205948

road-runn3r@reddit

I don't think he's getting "40t/s" on CPU.
View on Reddit #38228236

mamba436@reddit

I agree, LM studio was faster than other like msty imo. I run on very high end hardware and my best performance was up to 110 token/s for a model on gpu (but was halved on others). Now I know these programs are frontends product so I am not saying that there are not capable of same performance. But out of the box, LM studio allowed me a faster result.
View on Reddit #49932806

d3ftcat@reddit

Used to be a pain. If you haven't seen it, you can now add most hugging face models straight from their page to Ollama. Still prefer LM Studio over Ollama, but Msty above it.
View on Reddit #38201878

bearbarebere@reddit

I have a bunch of models downloaded already and making a model file is freaking annoying compared to every other app’s “tell me what folder to use and then click a button to load the model”
View on Reddit #38219347

d3ftcat@reddit

Msty is this one https://msty.app/ not affiliated in any way.
View on Reddit #38220860

SpareIcy8308@reddit

the only downside i´ve found of msty is it´s knowledge stacks...doesn´t work right most of the times.
View on Reddit #38226835

Deluded-1b-gguf@reddit

How’s that the case? Did you forget to offload layers to the gpu?
View on Reddit #38187384

bearbarebere@reddit

No, definitely not. I offloaded all 33 layers of the model I was trying in oobabooga, and all 33 in Lm studio, and used the exact same model file - not even a copy, I literally used the same file. I believe that oobabooga installs llama.cpp differently than with all the optimizations that Lm studio does. It’s the only thing that makes sense.
View on Reddit #38187469

Deluded-1b-gguf@reddit

Hmm interesting I need to do a test myself I’m curious now
View on Reddit #38187607

bearbarebere@reddit

I would love to see your results, perhaps my setup is just weird!
View on Reddit #38187754

Zangwuz@reddit

Or maybe you are comparing two different values. When i was using ooba before, it was giving t/s based on the total time which include prompt processing while lm studio give the t/s based on the generation time only which doesn't include the prompt processing. That said, i noticed that [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) was a bit slower than normal llama.cpp but clearly not 4x less.
View on Reddit #38196879

bearbarebere@reddit

No. I’m talking about both cases where there’s no prompt to process other than “write a giant paragraph” AND cases where there’s a long context. Both do the same.
View on Reddit #38218649

Zangwuz@reddit

I think you don't get what i'm saying. Ooba display on the last line something like "Output generated in 3.63 seconds (23.99 tokens/s, 87 tokens, context 1050, seed 593086777)" if you compare this value to the one displayed on LM studio UI it's wrong. You need to check eval time on the console for both something like "eval time = 1371.41 ms / 87 runs ( 15.76 ms per token, 63.44 tokens per second)" And i just tested to confirm if it was still the case and yes it does still display the total. Btw just compared by curiosity and the performance between both is actually close, even closer than i thought it would be.
View on Reddit #38220571

bearbarebere@reddit

I didn’t want to do this, but I’m going to get up right now and show you because clearly you think I’m stupid. Brb.
View on Reddit #38220859

Zangwuz@reddit

You don't need too but this is not a normal behavior either, LM studio is not 4x faster than ooba so i was just trying to find a rational explanation.
View on Reddit #38221270

bearbarebere@reddit

You are actually correct!! Holy shit, I've never been more blatantly wrong in my life. I should check my condescension because that was absolutely not warranted. I greatly appreciate you. After running about 5 tests each, the difference is about 1.25x, so a 1 minute response in Ooba would only take about 48 seconds in LM studio. That is faster, but nowhere near as fast as the token values implied. I was wrong about everything, including being able to truly tell the difference between token speeds without actually measuring it. I gave it a long prompt, turned the temp to 0, and then told it to just repeat the prompt, and it did, and then I timed that. Really neat, again I was COMPLETELY wrong, I WAS stupid, and you were 100% right and I love you. Forgive me? :')
View on Reddit #38224157

Zangwuz@reddit

Yeah no problem at all, it's rare to see someone saying i was wrong in reddit and you were open enough to try even though you were skeptical :)
View on Reddit #38225507

bearbarebere@reddit

Bro you don’t even know. I was lowkey so mad. I was like “HOW DARE THEY THINK THAT THE HOURS I SPENT ON THIS WERE WRONG”, I just didn’t want to accept it LMAO But I try (keyword try lol) to be a good scientist. And I’m actually really happy that you corrected me. I was so mean I need to work on that 💀 I took it like a personal attack because I spent hours and hours trying to figure it out and never did. seriously thanks lol
View on Reddit #38225945

noneabove1182@reddit

I think ooba installs the llamacpp python wrapper (I could be wrong) and also often lags behind on releases, so that could also play a part
View on Reddit #38197588

a_beautiful_rhind@reddit

I would use kobold CPP in windows if I didn't want to mess with anything.
View on Reddit #38196948

Elgamer_795@reddit

memory error with amd rx 6xxx series.
View on Reddit #54802545

Pro-editor-1105@reddit

Jan.
View on Reddit #38204861

Pro-editor-1105@reddit

am i being downvoted by lmstudio devs?
View on Reddit #38241226

mamba436@reddit

I did not 'downvote' your comment, but I will take the liberty of doing so for this one since it implies a distasteful conspiracy theory and is a comment one might expect to hear at a café counter. :)
View on Reddit #49933086

Awario_time@reddit

maybe you are downvoted by lmstudio, ollama, kobold and others 😅 Sorry!
View on Reddit #38253249

TaggM@reddit

Are there any particular features that would be ideal for you? LM Studio is a gentle entry into fetching and running LLMs locally with your own docs for more recent and specialized information. It's fine for casual use. But it loses its lustre when more intensively with agentic calls and multiple sources from a large library of proprietary documents. You may want to try GPT4ALL, AnyLLM, or OpenWeb UI, but each program has its own strengths and quirks.
View on Reddit #48343503

Orlando_orchids@reddit

Surprised that nobody mentioned [https://msty.app/](https://msty.app/) yet. I started with LM Studio, also tried [jan.ai](http://jan.ai) and settled on Msty.
View on Reddit #38195237

BeYourBestVersion@reddit

I've been using msty for a few months and have generally very good experience with it. Question though: For running local models on laptop without GPU, would any other packages allow for better optimization of these models than msty could?
View on Reddit #47331952

Longjumping_Ad5434@reddit

Funny, I went jan then msty, and have landed on LM Studio, especially after the mlx support.
View on Reddit #38236621

first2wood@reddit

Tried. It's running ollama for local LLM. I hope they can run llama.cpp instead of ollama. Ollama changes model file extension to none. It's too annoying. I don't know why they do it.
View on Reddit #38205847

circlesqrd@reddit

Seconding Msty. I run one on my desktop, while connecting another Msty from my laptope. Shared workspaces are nice too, including RAG, and the detailed analytics.
View on Reddit #38202617

Pristine_Income9554@reddit

tabbyAPI, koboldcpp beckend, ST fronted
View on Reddit #38181443

maxidev0x@reddit

What's ST frontend?
View on Reddit #38184306

Historical_Scholar35@reddit

Silly tavern
View on Reddit #38184787

Possible-Basis-6623@reddit

Their site looks so......hahas I prefer streamlit but it needs all that accounts paid plan which is annoying, I hate everything nowadays trying to build extra ecosystems, can everything just plug-in-play? you want A, they need you to install B, C, D, E, F....
View on Reddit #46858313

Historical_Scholar35@reddit

You can install it via Pinokio https://pinokio.computer/
View on Reddit #46859494

maxidev0x@reddit

Thanks 👍🏼
View on Reddit #38190179

noneabove1182@reddit

While these are good, they fill an extremely different niche Where lm studio shines is the single install executable and you're up and running like a normal Windows/Mac application, no mess no fuss no barrier of entry Is it the best? No probably not, but the fact that it's extremely good while also being extremely easy makes it very valuable
View on Reddit #38197482

Pristine_Income9554@reddit

Read question of the author.
View on Reddit #38198762

noneabove1182@reddit

Yeah sure I'm just pointing it out for others, and if they started with LM Studio and they're on windows, they're *likely* not the kind of user to figure out these more difficult tools, as great as they are they're just too different and too difficult for the average user No offense to anyone who uses lm studio of course or who is an "average user"
View on Reddit #38200624

Salt-Bread4114@reddit

Yeah I think it’s pretty good I added a tutorial to run LM Studio local models on any chatbot https://interworky.gitbook.io/interworky/getting-started/lm-studio-self-hosted-llm#running-locally-note-for-locally-hosted-servers
View on Reddit #46376011

Desmack1@reddit

its great.. i use it all the time.
View on Reddit #38243722

ianwill93@reddit

[Anything LLM](https://anythingllm.com/) is pretty good for simple plug n play (it's an Ollama wrapper). I switched from Jan (too much jank) and Anything LLM (the RAG was pretty bad) over to [Nvidia's ChatRTX](https://www.nvidia.com/en-us/ai-on-rtx/chatrtx/). If you have an Nvidia card, it's the most performant option on the platform. It's also probably the only one of the many listed on this page that's actually a Windows-focused app and not an afterthought port.
View on Reddit #38237106

No-Detective-5352@reddit

Are there alternatives that also have a Python code interpreter built in, that can execute code and include it in the chat? It would be great to have an offline alternative to the ChatGPT data analyst.
View on Reddit #38228985

ies7@reddit

I use open webui and continue.dev for ollama.   Continue is a vscode extension so it's  directly in your IDE.   Open webui has artifact (like claude artifact or chatgpt canvas) for html+js. For python you can install a function (right now it's top no 2 function)
View on Reddit #38237082

MoreFoxBeans@reddit

I use Msty, it's the best I could find.
View on Reddit #38233778

Future_Might_8194@reddit

https://preview.redd.it/obmi6knnolvd1.jpeg?width=1920&format=pjpg&auto=webp&s=5b088c18ad8b63dae509fb9afaea9e6a4fcf54b8 Mine's better.
View on Reddit #38233293

vaksninus@reddit

Depends on what you want to do. If you want to run a local server your services can use, I much prefer Ollama. LM studio is okay (especially how you download new models, I quite like and changing settings is pretty easy), but feels quite slow as a server and often I just want to close and my server if it crashes and it's so much faster in ollama than going to the server option in LM studio.
View on Reddit #38231554

unlikely_ending@reddit

It's extremely good
View on Reddit #38221206

smshrimant@reddit

Yes, you can also try GTP4All
View on Reddit #38216135

NextTo11@reddit

It's really good, but there are concerns about data security and privacy due to unaudited closed source code. It's probably okay, but who knows. Don't feed it anything unless you don't mind 3rd parties reading and exploiting your data.
View on Reddit #38214944

arthurtully@reddit

I use msty, it has api for groq and all local llms work fine
View on Reddit #38212409

Different-Effect-724@reddit

Try this: [https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file](https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file)
View on Reddit #38207749

AccidentAnnual@reddit

[Pinokio](https://pinokio.computer/) has many AI apps including LLMs, installs take one click. [Open WebUI](https://pinokio.computer/item?uri=https://github.com/pinokiofactory/open-webui) in Pinokio is a pretty amazing Ollama frontend. It has many features and options, plus an online [repository](https://openwebui.com/#open-webui-community) with plugins and documentation and such. https://preview.redd.it/dfhrfz4efjvd1.png?width=2935&format=png&auto=webp&s=f6fdbe9359bea05638ef4a135b445cef64e4210a
View on Reddit #38205500

shyam667@reddit

Backyard.ai might be a good choice if u want backend as well as better frontend all in one.
View on Reddit #38201984

AyraWinla@reddit

I used to in my "exploration phase". I ended up with Kobold.cpp since for me that's what ran the fastest (using an AMD Laptop without a GPU), and that it has no installation or anything required: just a single file. Unless development gets dropped at some point, I doubt I'll ever switch to anything else.
View on Reddit #38200581

RandumbRedditor1000@reddit

Yes it is good. And yes there are alternatives. I used to use LM studio before I switched to ollama+open webui
View on Reddit #38197494

RealBiggly@reddit

Consider Backyard for role-play. ChatGPT4all was good last time I looked. LM Studio uses to run quite slow but it's improved. It can run as a backend for other front-ends if you want to experiment and can't get your head around Ollama.
View on Reddit #38196238

Revolutionary_Put475@reddit

[https://jan.ai/](https://jan.ai/) I find Jan better, faster than all the rest
View on Reddit #38194198

Arkonias@reddit

It's great. It just works, super easy to get up and started. Has the nicest-looking UI out of all the competitors.
View on Reddit #38188269

Weary_Long3409@reddit

Yes. It's a good start.
View on Reddit #38187662

Some_Endian_FP17@reddit

Learn to run llama.cpp first before you try these other packages.
View on Reddit #38187104

-Hello2World@reddit

LM Studio is a good one...
View on Reddit #38186113

utf80@reddit

Yes. Yes. https://www.nomic.ai/gpt4all Or https://jan.ai/ Both are up to date and provide a good service to run the model of your choice.
View on Reddit #38183692

SAPPHIR3ROS3@reddit

Yes it’s good, however if you want something open source your best bet is ollama and if want a good interface my advice is t o pair it with open webui
View on Reddit #38181202