Anybody using LMStudio on an AMD Strix 395 AI Max (128GB unified memory)? I keep on getting errors and it always loads to RAM.

Posted by StartupTim@reddit | LocalLLaMA | View on Reddit | 12 comments

Hey all, I have a Framework AI Max+ AMD 395 Strix system, the one with 128GB of unified RAM that can have a huge chunk dedicated towards its GPU. I'm trying to use LMStudio but I can't get it to work at all and I feel as if it is user error. My issue is two-fold. First, all models appear to load into RAM. For example, a Qwen3 model that is 70GB will load into RAM and then try to load to GPU and fail. If I type something into the chat, it fails. I have the latest LMStudio, and the latest llama.cpp main branch that is included with LMStudio. I also set GPU max layers for the model. I have set 96GB vram in the bios, but also set it to auto. Nothing works. Is there something I am missing here or a tutorial or something you could point me to? Thanks!