Is LLM Studio good? | TheaterFire

[-]

_underlines_@reddit

A lot of alternatives: https://github.com/underlines/awesome-ml/blob/master/llm-tools.md#native-guis

Reply

[-]

macheteBlade@reddit

this has a great compilation of tools, but hasn't been updated it recently, do you know other sources like this that are more up-to-date?

Reply

[-]

Lm Studio is very good. I have used it since its inception, and although I do not really agree with the new changes that were made to its interface, it is still an excellent option. What I really don't understand is how someone can use Ollama. Look, I have analyzed it from all points of view and I don't understand why someone would use a program like that.

Reply

[-]

GimmePanties@reddit

Because when ollama is set up it will run in the background and launch on startup and you can just assume it’s there when your client applications need to call it.

Reply

[-]

Flimsy-Tonight-6050@reddit

can't I make lm studio launch from start aswell?

Reply

[-]

Impossible-Value5126@reddit

Run msconfig in search and add it to startup.

Reply

[-]

GimmePanties@reddit

Probably? Look, don’t get me wrong, I love LMStudio and it is my main drive for downloading and trying new models and sometimes I’ll start it server as a quick endpoint for an app. But for stuff I want available all the time, I get set up on Ollama. It seems really smart about being able to load and unload models on demand, so with LM you’re loading the model yourself when you need it, and ejecting it when you’re done to manage memory. And if your app was connected to it, and expecting a different model, it would fail until you go into LM and manually load it.

Reply

[-]

ImaginaryRea1ity@reddit

Can you run ollama on your PC or a mac and then access it on your phone on a local network?

Reply

[-]

GimmePanties@reddit

Yes

Reply

[-]

ramplank@reddit

Depends on your use case, I use it to spin up models and then I connect to the models with other tools and in that scenario I dont need all the fluff of lmstudio. sure I can do that with lmstudio but why have all the extra fluff when I'm up and running with 1 terminal command

Reply

[-]

muxxington@reddit

The answer is simple: People don't want all-in-one software. Especially companies, teams, groups etc. don't want all-in-one software. They want to host one more models and expose them via an OpenAI compatible API and then connect clients of all kind to it instead of having a full blown gaming PC under each desk.

Reply

[-]

pablogabrieldias@reddit

But Lm Studio is free. It is not open-source, but it is totally free

Reply

[-]

muxxington@reddit

No. It is not even close to free. It is *for* free. At least by now and only for private non-commercial use.

Reply

[-]

Eugr@reddit

Well, there are several reasons: 1. Ollama runs as a server and you can connect different GUIs and tools, such as excellent Open-WebUI. 2. Open-source vs closed source. 3. Licensing. Ollama has MIT license, LM Studio requires you to contact them if you want to use it for work purposes.

Reply

[-]

first2wood@reddit

Yes, I guess it's because that's the most famous open source one because it created early. Even a lot of developers integrate ollama into their projects first. I used ollama first because it's popular. Then I tried others and settled down with LMS.

Reply

[-]

constPxl@reddit

(not proven personally and i dont have the data but) could it be that the "interface-less" console base ollama uses less system resources than lm studio? dont get me wrong, i love lm studio showing me all the settings and configs running the model guess what im saying is if i know what im doing, i'd use the least resources required app to run the model so those resources can be allocated elsewhere (tts, stt, rag etc)

Reply

[-]

Ill-Total9416@reddit

That’s quite a lot, for example, LM Studio, Ollama, GPUStack, all based on llama.cpp.

Reply

[-]

cantgetthistowork@reddit

Wish LMS would work properly with multiple GPUs

Reply

[-]

Atari-Katana@reddit

I just wish I could afford multiple GPUs these days.

Reply

[-]

visionsmemories@reddit

yes, it is yes, there is

Reply

[-]

bearbarebere@reddit

I recommend trying LM studio before all others because it runs GGUFs faster than any other program I've seen (tied with ollama but ollama is annoying to set up if you're not techy). so you'll know how fast your computer can run them when it's set up right. I started with Oobabooga and just accepted the speeds of GGUFs were 4-9t/s. Nope. LM studio gets them up to **40t/s**.

Reply

[-]

road-runn3r@reddit

>it runs GGUFs faster than any other program I've seen Not in my experience. Kobolt can be sometimes up to 15÷ faster. (on a 3060ti)

Reply

[-]

bearbarebere@reddit

I tried kobold and I vaguely remember it being lackluster. So strange.

Reply

[-]

entangledloops@reddit

Comparing LLM studio and Kobold is a fool's errand. You don't know what code is being used. One may be 15% faster, but still 80% slower than possible. You need to load the model yourself with barebones code to get the optimal estimate. Not claiming that's an easy task if you aren't an ML engineer, but it's the only correct way at this time.

Reply

[-]

road-runn3r@reddit

Quite the opposite but to be honest there are a few things that you need to set up or check with Kobold. First, set GPU layers to -1 so it shows max layers then set it manually. (according to your VRAM) Go to tokens and enable flash attention. Did you get the cuda12 exe version? (if you have a newer Nvidia GPU you should) ps. don't mind the downvotes, people thought it was an opinion when you are just trying to learn

Reply

[-]

CoyRogers@reddit

They might mean CPU only, in my CPU only laptop it runs way way faster then ooogabooogs does, a ton faster. Loads models instantly rather then taking forty seconds and responds so much faster

Reply

[-]

road-runn3r@reddit

I don't think he's getting "40t/s" on CPU.

Reply

[-]

mamba436@reddit

I agree, LM studio was faster than other like msty imo. I run on very high end hardware and my best performance was up to 110 token/s for a model on gpu (but was halved on others). Now I know these programs are frontends product so I am not saying that there are not capable of same performance. But out of the box, LM studio allowed me a faster result.

Reply

[-]

d3ftcat@reddit

Used to be a pain. If you haven't seen it, you can now add most hugging face models straight from their page to Ollama. Still prefer LM Studio over Ollama, but Msty above it.

Reply

[-]

bearbarebere@reddit

I have a bunch of models downloaded already and making a model file is freaking annoying compared to every other app’s “tell me what folder to use and then click a button to load the model”

Reply

[-]

d3ftcat@reddit

Msty is this one https://msty.app/ not affiliated in any way.

Reply

[-]

SpareIcy8308@reddit

the only downside i´ve found of msty is it´s knowledge stacks...doesn´t work right most of the times.

Reply

[-]

Deluded-1b-gguf@reddit

How’s that the case? Did you forget to offload layers to the gpu?

Reply

[-]

bearbarebere@reddit

No, definitely not. I offloaded all 33 layers of the model I was trying in oobabooga, and all 33 in Lm studio, and used the exact same model file - not even a copy, I literally used the same file. I believe that oobabooga installs llama.cpp differently than with all the optimizations that Lm studio does. It’s the only thing that makes sense.

Reply

[-]

Deluded-1b-gguf@reddit

Hmm interesting I need to do a test myself I’m curious now

Reply

[-]

bearbarebere@reddit

I would love to see your results, perhaps my setup is just weird!

Reply

[-]

Zangwuz@reddit

Or maybe you are comparing two different values. When i was using ooba before, it was giving t/s based on the total time which include prompt processing while lm studio give the t/s based on the generation time only which doesn't include the prompt processing. That said, i noticed that [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) was a bit slower than normal llama.cpp but clearly not 4x less.

Reply

[-]

bearbarebere@reddit

No. I’m talking about both cases where there’s no prompt to process other than “write a giant paragraph” AND cases where there’s a long context. Both do the same.

Reply

[-]

Zangwuz@reddit

I think you don't get what i'm saying. Ooba display on the last line something like "Output generated in 3.63 seconds (23.99 tokens/s, 87 tokens, context 1050, seed 593086777)" if you compare this value to the one displayed on LM studio UI it's wrong. You need to check eval time on the console for both something like "eval time = 1371.41 ms / 87 runs ( 15.76 ms per token, 63.44 tokens per second)" And i just tested to confirm if it was still the case and yes it does still display the total. Btw just compared by curiosity and the performance between both is actually close, even closer than i thought it would be.

Reply

[-]

bearbarebere@reddit

I didn’t want to do this, but I’m going to get up right now and show you because clearly you think I’m stupid. Brb.

Reply

[-]

Zangwuz@reddit

You don't need too but this is not a normal behavior either, LM studio is not 4x faster than ooba so i was just trying to find a rational explanation.

Reply

[-]

bearbarebere@reddit

You are actually correct!! Holy shit, I've never been more blatantly wrong in my life. I should check my condescension because that was absolutely not warranted. I greatly appreciate you. After running about 5 tests each, the difference is about 1.25x, so a 1 minute response in Ooba would only take about 48 seconds in LM studio. That is faster, but nowhere near as fast as the token values implied. I was wrong about everything, including being able to truly tell the difference between token speeds without actually measuring it. I gave it a long prompt, turned the temp to 0, and then told it to just repeat the prompt, and it did, and then I timed that. Really neat, again I was COMPLETELY wrong, I WAS stupid, and you were 100% right and I love you. Forgive me? :')

Reply

[-]

Zangwuz@reddit

Yeah no problem at all, it's rare to see someone saying i was wrong in reddit and you were open enough to try even though you were skeptical :)

Reply

[-]

bearbarebere@reddit

Bro you don’t even know. I was lowkey so mad. I was like “HOW DARE THEY THINK THAT THE HOURS I SPENT ON THIS WERE WRONG”, I just didn’t want to accept it LMAO But I try (keyword try lol) to be a good scientist. And I’m actually really happy that you corrected me. I was so mean I need to work on that 💀 I took it like a personal attack because I spent hours and hours trying to figure it out and never did. seriously thanks lol

Reply

[-]

noneabove1182@reddit

I think ooba installs the llamacpp python wrapper (I could be wrong) and also often lags behind on releases, so that could also play a part

Reply

[-]

a_beautiful_rhind@reddit

I would use kobold CPP in windows if I didn't want to mess with anything.

Reply

[-]

Elgamer_795@reddit

memory error with amd rx 6xxx series.

Reply

[-]

Pro-editor-1105@reddit

Jan.

Reply

[-]

Pro-editor-1105@reddit

am i being downvoted by lmstudio devs?

Reply

[-]

mamba436@reddit

I did not 'downvote' your comment, but I will take the liberty of doing so for this one since it implies a distasteful conspiracy theory and is a comment one might expect to hear at a café counter. :)

Reply

[-]

Awario_time@reddit

maybe you are downvoted by lmstudio, ollama, kobold and others 😅 Sorry!

Reply

[-]

TaggM@reddit

Are there any particular features that would be ideal for you? LM Studio is a gentle entry into fetching and running LLMs locally with your own docs for more recent and specialized information. It's fine for casual use. But it loses its lustre when more intensively with agentic calls and multiple sources from a large library of proprietary documents. You may want to try GPT4ALL, AnyLLM, or OpenWeb UI, but each program has its own strengths and quirks.

Reply

[-]

Orlando_orchids@reddit

Surprised that nobody mentioned [https://msty.app/](https://msty.app/) yet. I started with LM Studio, also tried [jan.ai](http://jan.ai) and settled on Msty.

Reply

[-]

BeYourBestVersion@reddit

I've been using msty for a few months and have generally very good experience with it. Question though: For running local models on laptop without GPU, would any other packages allow for better optimization of these models than msty could?

Reply

[-]

Longjumping_Ad5434@reddit

Funny, I went jan then msty, and have landed on LM Studio, especially after the mlx support.

Reply

[-]

first2wood@reddit

Tried. It's running ollama for local LLM. I hope they can run llama.cpp instead of ollama. Ollama changes model file extension to none. It's too annoying. I don't know why they do it.

Reply

[-]

circlesqrd@reddit

Seconding Msty. I run one on my desktop, while connecting another Msty from my laptope. Shared workspaces are nice too, including RAG, and the detailed analytics.

Reply

[-]

Pristine_Income9554@reddit

tabbyAPI, koboldcpp beckend, ST fronted

Reply

[-]

maxidev0x@reddit

What's ST frontend?

Reply

[-]

Historical_Scholar35@reddit

Silly tavern

Reply

[-]

Possible-Basis-6623@reddit

Their site looks so......hahas I prefer streamlit but it needs all that accounts paid plan which is annoying, I hate everything nowadays trying to build extra ecosystems, can everything just plug-in-play? you want A, they need you to install B, C, D, E, F....

Reply

[-]

Historical_Scholar35@reddit

You can install it via Pinokio https://pinokio.computer/

Reply

[-]

maxidev0x@reddit

Thanks 👍🏼

Reply

[-]

noneabove1182@reddit

While these are good, they fill an extremely different niche Where lm studio shines is the single install executable and you're up and running like a normal Windows/Mac application, no mess no fuss no barrier of entry Is it the best? No probably not, but the fact that it's extremely good while also being extremely easy makes it very valuable

Reply

[-]

Pristine_Income9554@reddit

Read question of the author.

Reply

[-]

noneabove1182@reddit

Yeah sure I'm just pointing it out for others, and if they started with LM Studio and they're on windows, they're *likely* not the kind of user to figure out these more difficult tools, as great as they are they're just too different and too difficult for the average user No offense to anyone who uses lm studio of course or who is an "average user"

Reply

[-]

Salt-Bread4114@reddit

Yeah I think it’s pretty good I added a tutorial to run LM Studio local models on any chatbot https://interworky.gitbook.io/interworky/getting-started/lm-studio-self-hosted-llm#running-locally-note-for-locally-hosted-servers

Reply

[-]

Desmack1@reddit

its great.. i use it all the time.

Reply

[-]

ianwill93@reddit

[Anything LLM](https://anythingllm.com/) is pretty good for simple plug n play (it's an Ollama wrapper). I switched from Jan (too much jank) and Anything LLM (the RAG was pretty bad) over to [Nvidia's ChatRTX](https://www.nvidia.com/en-us/ai-on-rtx/chatrtx/). If you have an Nvidia card, it's the most performant option on the platform. It's also probably the only one of the many listed on this page that's actually a Windows-focused app and not an afterthought port.

Reply

[-]

No-Detective-5352@reddit

Are there alternatives that also have a Python code interpreter built in, that can execute code and include it in the chat? It would be great to have an offline alternative to the ChatGPT data analyst.

Reply

[-]

ies7@reddit

I use open webui and continue.dev for ollama. Continue is a vscode extension so it's directly in your IDE. Open webui has artifact (like claude artifact or chatgpt canvas) for html+js. For python you can install a function (right now it's top no 2 function)

Reply

[-]

MoreFoxBeans@reddit

I use Msty, it's the best I could find.

Reply

[-]

Future_Might_8194@reddit

https://preview.redd.it/obmi6knnolvd1.jpeg?width=1920&format=pjpg&auto=webp&s=5b088c18ad8b63dae509fb9afaea9e6a4fcf54b8 Mine's better.

Reply

[-]

vaksninus@reddit

Depends on what you want to do. If you want to run a local server your services can use, I much prefer Ollama. LM studio is okay (especially how you download new models, I quite like and changing settings is pretty easy), but feels quite slow as a server and often I just want to close and my server if it crashes and it's so much faster in ollama than going to the server option in LM studio.

Reply

[-]

unlikely_ending@reddit

It's extremely good

Reply

[-]

smshrimant@reddit

Yes, you can also try GTP4All

Reply

[-]

NextTo11@reddit

It's really good, but there are concerns about data security and privacy due to unaudited closed source code. It's probably okay, but who knows. Don't feed it anything unless you don't mind 3rd parties reading and exploiting your data.

Reply

[-]

arthurtully@reddit

I use msty, it has api for groq and all local llms work fine

Reply

[-]

Different-Effect-724@reddit

Try this: [https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file](https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file)

Reply

[-]

AccidentAnnual@reddit

[Pinokio](https://pinokio.computer/) has many AI apps including LLMs, installs take one click. [Open WebUI](https://pinokio.computer/item?uri=https://github.com/pinokiofactory/open-webui) in Pinokio is a pretty amazing Ollama frontend. It has many features and options, plus an online [repository](https://openwebui.com/#open-webui-community) with plugins and documentation and such. https://preview.redd.it/dfhrfz4efjvd1.png?width=2935&format=png&auto=webp&s=f6fdbe9359bea05638ef4a135b445cef64e4210a

Reply

[-]

shyam667@reddit

Backyard.ai might be a good choice if u want backend as well as better frontend all in one.

Reply

[-]

AyraWinla@reddit

I used to in my "exploration phase". I ended up with Kobold.cpp since for me that's what ran the fastest (using an AMD Laptop without a GPU), and that it has no installation or anything required: just a single file. Unless development gets dropped at some point, I doubt I'll ever switch to anything else.

Reply

[-]

RandumbRedditor1000@reddit

Yes it is good. And yes there are alternatives. I used to use LM studio before I switched to ollama+open webui

Reply

[-]

RealBiggly@reddit

Consider Backyard for role-play. ChatGPT4all was good last time I looked. LM Studio uses to run quite slow but it's improved. It can run as a backend for other front-ends if you want to experiment and can't get your head around Ollama.

Reply

[-]

Revolutionary_Put475@reddit

[https://jan.ai/](https://jan.ai/) I find Jan better, faster than all the rest

Reply

[-]

Arkonias@reddit

It's great. It just works, super easy to get up and started. Has the nicest-looking UI out of all the competitors.

Reply

[-]

Weary_Long3409@reddit

Yes. It's a good start.

Reply

[-]

Some_Endian_FP17@reddit

Learn to run llama.cpp first before you try these other packages.

Reply

[-]

-Hello2World@reddit

LM Studio is a good one...

Reply

[-]

utf80@reddit

Yes. Yes. https://www.nomic.ai/gpt4all Or https://jan.ai/ Both are up to date and provide a good service to run the model of your choice.

Reply

[-]

SAPPHIR3ROS3@reddit

Yes it’s good, however if you want something open source your best bet is ollama and if want a good interface my advice is t o pair it with open webui

Reply

Reply to Post

91 Comments