What would be the best OS to run LLMs?
Posted by Manaberryio@reddit | LocalLLaMA | 26 comments
Hi there,
I've ordered a mini PC with 128GB of RAM and the AMD AI Max 395. I intend to use it with Proxmox (like my current machine), where I run Windows for some gaming and macOS for my music library server. I also want to run LLMs on it.
Main purpose would be local agent coding and some text refining. I'm quite new and it's quite overwhelming, to be honest. Things evolve so fast I can't keep track of what works best.
- What would be the best OS for LLMs?
- What would be the best software to run LLMs?
- Any compatibility issues with my choices to be aware of (such as graphics drivers on Linux)?
Thank you for your help!
DelKarasique@reddit
Linux + vllm for maximum performance.
Windows + llama.cpp for ease of use.
bernzyman@reddit
Does vLLM on Linux still need more VRAM than llama.cpp, as is the case on Windows?
XccesSv2@reddit
It's not about Windows vs Linux; it's because vLLM handles LLMs differently. It uses more VRAM but has better throughput
bernzyman@reddit
Yes, I know that part. I simply wondered whether it runs more efficiently on a Linux setup compared to a Windows setup
Th3Sim0n@reddit
Windows + LM Studio for even easier use, for console haters
DelKarasique@reddit
llama.cpp > llama.cpp wrappers IMO
Th3Sim0n@reddit
Fully agreed, but for clickops LM Studio is just way friendlier
jwpbe@reddit
Half of the replies will be bots giving you outdated advice.
Find a user-friendly Arch Linux derivative like CachyOS or EndeavourOS and use that, so you get rolling releases.
VoiceApprehensive893@reddit
linux distro war thread
i use cachyos btw
Edenar@reddit
I have a Framework Desktop (128GB/395 Max): I first installed Ubuntu but recently switched to Fedora (native Podman, more stable, at least coming from Ubuntu 25.10).
I wouldn't use Windows for LLMs. Also, unless you want to play some esports games with kernel-level anticheat (LoL, Valorant, ...), gaming works well (Steam requires zero effort; I used the Heroic launcher for games from GOG and Epic and it was almost zero effort too).
VoiceApprehensive893@reddit
gaming on linux still requires effort
NNN_Throwaway2@reddit
I use windows with WSL and docker for vLLM. I also have a dual-boot Ubuntu install but I just don't have any reason to use it.
lemondrops9@reddit
Linux Mint + LM Studio for an easier setup, then move to llama.cpp for some extra speed.
Thunderstarer@reddit
I run my LLMs in NixOS LXCs. Ubuntu would probably be best if you're not already familiar with Nix.
Evening_Ad6637@reddit
In my personal experience, the best OSes to run LLMs are Debian, openSUSE and Artix
turtleisinnocent@reddit
TempleOS, of course. HolyC makes it fast because it all runs in ring 0.
Evening_Ad6637@reddit
:D
ImportancePitiful795@reddit
If you use W11 IoT Enterprise with Lemonade Server (a llama.cpp wrapper with FastFlowLM etc. added to it), there is absolutely no need to switch to Linux for the few % of extra perf. Just stick to Windows, play your games, run your Windows applications. No need to switch OS.
If you play BF/COD games, also stick to Windows. There is no Linux support for those games' DRM, so they become unplayable. Same applies to all EA games using EA AntiCheat (EAAC). (I refuse to play any EA games even on Windows.)
Otherwise, Linux with Lemonade or vLLM, depending on your needs. vLLM is better if you run agents, due to better concurrency performance.
Which distro? Depends. Fedora is great for workstation usage, but if you plan to run LLMs as services, or God forbid try to set up remote desktop to it, better use Ubuntu...
Unfortunately nobody here can give you a definite answer on whether AMD will add MLX support to Lemonade on Windows or only on Linux (currently AMD MLX support is in closed beta testing by the Lemonade team).
jikilan_@reddit
If purely LLMs, then Linux; if gaming, then Windows, especially if streaming with Moonlight
jikilan_@reddit
On Windows, the power profile sometimes won't go down.
XccesSv2@reddit
If you need official guides, AMD covers Ubuntu natively in their guides. That's a good start. But in my case, I used Fedora for a few months because it has ROCm integrated in its repos; now I've switched to CachyOS because their repos are even more current. They already have ROCm 7.2.2 official in their repos.
BUT: it doesn't really matter. Instead of installing everything natively, you can also use toolboxes and Docker containers, and use whatever distro you want to get vLLM or llama.cpp running.
You can also install Proxmox with an LXC container and pass through the GPU/NPU devices for an isolated LLM instance.
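To sketch the LXC route (the container ID and device paths here are hypothetical examples; recent Proxmox VE releases let you pass devices with `dev[n]:` entries, while older setups need cgroup allow rules and bind mounts):

```
# /etc/pve/lxc/200.conf  -- 200 is a made-up container ID
# Pass the AMD compute (ROCm) and render devices into the container
dev0: /dev/kfd
dev1: /dev/dri/renderD128
```

Inside the container, the ROCm userspace then sees the GPU. The render node number (renderD128) can differ per machine, so check /dev/dri on the host first.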
RG_Fusion@reddit
The best OS for running LLMs would be a Debian install of Linux, but if you're already feeling overwhelmed you should stick to Windows. You can always make the change at a later date when you're feeling comfortable. The performance loss is notable, but not game-changing.
What I operate, and view as an idealized system, is running the LLMs on a Linux server dedicated to inference. The server just accepts and responds to requests from other computers. All of my Python scripts that use LLMs live on my gaming PC, and they interact with the LLM over the local network.
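That split is easy to wire up, since llama.cpp's `llama-server` and vLLM both expose an OpenAI-compatible HTTP API. A minimal stdlib-only sketch of a client on the gaming PC (the address, port and model name are placeholders, not anything from this thread):

```python
import json
import urllib.request

def chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for a local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# Point at the inference box on the LAN (placeholder address)
req = chat_request("http://192.168.1.50:8080", "local-model", "Refine this text: ...")
print(req.full_url)  # -> http://192.168.1.50:8080/v1/chat/completions
```

Sending it with `urllib.request.urlopen(req)` returns the JSON completion; any OpenAI-style client library works the same way pointed at the local address.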
Fine_Nectarine9328@reddit
Idk about the OS; for basic tasks Linux works better. But for best performance the software is llama.cpp, no doubt
DropInternational455@reddit
Remindme! 5 days
DropInternational455@reddit
Very good question, need answer 😅