First impressions and thoughts on the GTR9 Pro (Beelink's 395)

Posted by kmouratidis@reddit | LocalLLaMA | 30 comments

tl;dr: Good and bad, some "benchmarks" and details here. Not sure I'd recommend it. Not yet.

Hey y'all! Just like many others I wanted to try the 395, but since I mostly wanted it as a server first (and LLM runner third), I wanted one with 10 Gbps networking. The MS-S1 hadn't come out yet, so I went with the Beelink GTR9 Pro AMD Ryzen™ AI Max+ 395, and ~25 days later it's here.

I tried the preinstalled Windows, which functioned for a bit but quickly devolved into a mess that made me want to return it. Thankfully, I wanted it as a server, which means I'll be running Linux, but I had to test it. Plenty of crashes under load, the Intel network card not working, and other weirdness. Turns out there are plenty of known issues that may be hardware or driver related, with plenty of posts and speculation in r/BeelinkOfficial. It seems to have been going on for a couple of weeks and may also affect Linux, but oh well, time to move on.

People suggest you use Fedora or Debian Sid, or anything with a recent kernel, and that's probably good advice for most people, but I ain't running Fedora for my server. I used a heavily configured DietPi (so basically Debian) instead, for no other reason than consistency with the rest of my (actually mini*) servers. Surely the driver situation can't be that bad, right? Actually yes, it's perfectly fine to run Debian and I haven't had an issue yet, although it's early. Let's see if it reaches even 10% of the uptime my TrueNAS server has. After troubleshooting a few issues, installing the (hopefully) correct drivers, and building llama.cpp (lemonade and vLLM will have to wait until the weekend), I quickly tested a bunch of models, and the results I'm getting seem to roughly align with what others are getting (1, 2, 3, 4). I have documented everything in the gist (I think!).

Out of the box, the Beelink runs with 96GB allocated as VRAM and can consume up to 170W without me messing with BIOS or Linux settings. In short, the results are exactly as you would expect:

GPT-OSS-120B is probably the best model to run
Flash Attention helps, but not always by a lot
Performance mode didn't do a thing and maybe was worse; graphics overclocking seems to help a bit with prefill/pp/input, but not a lot
ECO still consumes 100W during inference, but the performance hit can be as little as ~15% for ~45% less max power, which is kinda insane, though it's well known by now that max power only gives marginal improvements
You must be dense if you expect to run dense models
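To put the ECO trade-off in numbers, here is a rough sketch that uses only the approximate figures above (~15% less performance for ~45% less max power). The function name is made up for illustration, and the result is only as precise as those rounded percentages.

```python
def relative_efficiency(perf_drop: float, power_drop: float) -> float:
    """Performance-per-watt of a reduced-power mode relative to full power.

    Both arguments are fractions, e.g. 0.15 for a 15% drop.
    """
    return (1.0 - perf_drop) / (1.0 - power_drop)


# Approximate figures from the post: ~15% less performance, ~45% less max power.
eco_vs_max = relative_efficiency(perf_drop=0.15, power_drop=0.45)
print(f"ECO delivers about {eco_vs_max:.2f}x the tokens per watt of max power")
```

With those inputs the ratio comes out around 1.5x, which is why the lower power mode looks attractive for an always-on box.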

Model	Size	Params	Backend	Test	Tokens/s (FA 0)	Tokens/s (FA 1)
GLM-4.5-Air (Q4_K_XL)	68.01 GiB	110.47B	ROCm	pp512	142.90 ± 1.39	152.65 ± 1.49
				tg128	20.31 ± 0.07	20.83 ± 0.12
Qwen3-30B (Q4_K_XL)	16.49 GiB	30.53B	ROCm	pp512	496.63 ± 11.29	503.25 ± 6.42
				tg128	63.26 ± 0.28	64.43 ± 0.71
GPT-OSS-120B (F16)	60.87 GiB	116.83B	ROCm	pp512	636.25 ± 5.49	732.70 ± 5.99
				tg128	34.44 ± 0.01	34.60 ± 0.07
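For a quick read on how much Flash Attention helps, the sketch below computes the percentage change between the FA 0 and FA 1 columns. It uses only the mean values copied from the table and ignores the ± spread, so small differences (for example Qwen3-30B prefill) may well be within noise. The helper name is hypothetical.

```python
# (model, test) -> (tokens/s with FA off, tokens/s with FA on); means from the table.
results = {
    ("GLM-4.5-Air", "pp512"): (142.90, 152.65),
    ("GLM-4.5-Air", "tg128"): (20.31, 20.83),
    ("Qwen3-30B", "pp512"): (496.63, 503.25),
    ("Qwen3-30B", "tg128"): (63.26, 64.43),
    ("GPT-OSS-120B", "pp512"): (636.25, 732.70),
    ("GPT-OSS-120B", "tg128"): (34.44, 34.60),
}


def fa_gain_percent(fa_off: float, fa_on: float) -> float:
    """Percentage change in tokens/s when Flash Attention is enabled."""
    return (fa_on / fa_off - 1.0) * 100.0


gains = {key: fa_gain_percent(*pair) for key, pair in results.items()}
for (model, test), gain in gains.items():
    print(f"{model:<13} {test}: {gain:+.1f}%")
```

On these means the biggest gain is GPT-OSS-120B prefill at roughly 15%, while token generation barely moves for any model.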

Happy to run tests / benchmarks or answer questions, but some stuff may need to wait for the weekend.

----------

* Bonus: I sent this photo of the Beelink with my old Minisforum Z83-F to someone, joking about how mini PCs looked in 2015 vs in 2025. She thought the Minisforum was the one from 2025.

[Image: Beelink GTR9 Pro (2025) dwarfs its little bro, the Minisforum Z83-F (2015)]


spaceman3000@reddit

How's the noise? I wanted evo but heard it's too loud and thermals are bad.


kmouratidis@reddit (OP)

Zero, or at least not noticeable over my PC on idle. I stick my ear to the back of the device and I can't hear a thing.

But maybe it's a bad thing? Maybe the networking keeps crashing under load because it's a thermal issue? Not sure yet.


spaceman3000@reddit

Weird. Even under workload? I have the 370 AI (a weaker version of this) and it can be loud. The PC is a very similar form factor and I use it with a 9070 XT over OCuLink. I can't hear the card at all, but the mini PC is audible.


kmouratidis@reddit (OP)

Yeah, 180W total and no noticeable noise. No idea why, maybe as I said that could be part of the problem.


spaceman3000@reddit

Or they did very good cooling. Monitor your temp sensors and check if they are OK.
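One way to keep an eye on the sensors on Linux is to read the kernel's hwmon files directly. This is a minimal sketch, assuming the standard `/sys/class/hwmon/hwmon*/temp*_input` layout, where values are reported in millidegrees Celsius. The function name is hypothetical, and which chips show up (for example `k10temp` or `amdgpu` on AMD systems) depends on the loaded drivers.

```python
from pathlib import Path


def read_hwmon_temps(root: str = "/sys/class/hwmon") -> dict[str, float]:
    """Map 'hwmonN/chip/tempM' to degrees Celsius for every readable input."""
    temps = {}
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        chip = name_file.read_text().strip() if name_file.exists() else "unknown"
        for temp_file in sorted(chip_dir.glob("temp*_input")):
            try:
                millidegrees = int(temp_file.read_text().strip())
            except (OSError, ValueError):
                continue  # some sensors fail to read; skip them
            label = temp_file.name.removesuffix("_input")
            temps[f"{chip_dir.name}/{chip}/{label}"] = millidegrees / 1000.0
    return temps


if __name__ == "__main__":
    for sensor, celsius in read_hwmon_temps().items():
        print(f"{sensor}: {celsius:.1f} C")
```

Running it in a loop during a benchmark should show whether the CPU or GPU is approaching its limit before the fans ramp up.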


kmouratidis@reddit (OP)

They seem to be okay, and I played around with BIOS settings and mild undervolting (CPU & GPU) and even under high load (>200W total) it was quiet because I had set the curves so that they go to 100% only when the CPU hits 85C, which was probably never. Then I reduced the max CPU temperature to 80 and then 75 and it became literally impossible, and I also forgot to tune the fan curves. Now I've set tjmax to 75 and the fans to go to 100% at the same temp, and yes, they are very audible when they go to 100%, even with noise cancelling headphones.

But hey, even though the CPU is at 75C and having a hard time because I'm hitting it with unnecessary artificial stress, the GPU temps dropped from 65-66C to ~58-61C. Had to seriously mess around with various settings, but at least now it seems to be stable even under lots of stress and with 180-190W sustained.


spaceman3000@reddit

Damn then it's a no go for me. I have to keep it in the living room. I guess due to thermals it's not possible to make it quiet and small at the same time.

Framework would be best because it has 120mm noctua so it should be super quiet but again it's too big lol


knekker2@reddit

Not sure what you are on about. The GTR9 is the quietest of them all among these 395 NUCs.


spaceman3000@reddit

I doubt it’s quieter than framework with 120mm Noctua.


Torgshop86@reddit

There is a firmware issue with the NIC leading to crashes on Windows and Linux. See here: https://craigwilson.blog/post/2025/2025-09-25-beelink395bsod/#the-first-bsods

I don't use the NIC myself, since my display has an ethernet port that I use via USB-C. But I read that switching to USB-C ethernet or WiFi and deactivating the NIC in the BIOS solves the issue until, hopefully, an update is provided.


dabiggmoe2@reddit

"You must be dense if you expect to run dense models" haha


johannes_bertens@reddit

Awesome to read and nice to see your setup! My AMD 395+ is on its way.

Can you benchmark the IBM Granite 4.0 "small" model in a decent quant? It's the first I'm going to try myself.


kmouratidis@reddit (OP)

Sure, give me a link/quant and a command or other options and I'll give it a go!


johannes_bertens@reddit

I'd go for this: https://docs.unsloth.ai/models/ibm-granite-4.0 GGUF: https://huggingface.co/unsloth/granite-4.0-h-small-GGUF

And just run a quant that fits? I've done 6 or 4 bit myself for other models but think the 8bit one might be an easy fit for 96gb as well?

I'm mostly interested in BIG context requests. I've done a few single 40k token prompts to compare with non mamba-models.


kmouratidis@reddit (OP)

Sorry it took a while to get back to you. I've been having the same Intel networking issues as lots of others, and I'm now trying to troubleshoot some other power issues with Beelink and have installed Windows again. I downloaded Q6_K from lmstudio-community to run it on Windows. I'm using a 56K prompt, FA, 64k context size, and all other settings at their defaults:

* Eco mode (100-105W) + Vulkan: 210 t/s input, 16.9 t/s output (138 tokens)
* Eco mode (100-105W) + ROCm: 122 t/s input, 18.1 t/s output (201 tokens)
* Balanced/Performance mode: too unstable
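For a sense of what those rates mean in wall-clock time on a big-context request, here is a back-of-the-envelope sketch. It assumes "56K" means 56,000 prompt tokens and that the numbers in parentheses are generated token counts. The function name is hypothetical.

```python
PROMPT_TOKENS = 56_000  # assumption: "56K prompt" taken as 56,000 tokens

# backend -> (input tokens/s, output tokens/s, output tokens generated)
runs = {
    "Vulkan": (210.0, 16.9, 138),
    "ROCm": (122.0, 18.1, 201),
}


def wall_time_seconds(prompt_tokens, input_tps, output_tps, output_tokens):
    """Return (prefill seconds, decode seconds) for one request."""
    prefill = prompt_tokens / input_tps
    decode = output_tokens / output_tps
    return prefill, decode


timings = {name: wall_time_seconds(PROMPT_TOKENS, *run) for name, run in runs.items()}
for name, (prefill, decode) in timings.items():
    print(f"{name}: prefill ~{prefill:.0f}s, decode ~{decode:.0f}s")
```

Under these assumptions prefill dominates: roughly four and a half minutes on Vulkan versus over seven and a half on ROCm, with decode taking only a few seconds either way.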


johannes_bertens@reddit

Thank you! Good luck with the power issues! Hope they get resolved soon!


Diao_nasing@reddit

pp is too slow for glm air. why?


kmouratidis@reddit (OP)

Big model with lots of active parameters? If you see the sources I've attached for other people doing benchmarks, they seem to be in the same ballpark.


deulamco@reddit

DietPi? Lovely!

Finally, someone else praising these compact distros for real daily usage.


jwpbe@reddit

if you're still experimenting, try throwing cachyos on it? fuck it, try an arch distro. install paru (arch user repository helper) and see how much extra performance you get.

i have non cachyos arch installed on a few of my machines and switched over to their kernels and package repositories because i get extra performance out of them vs base arch.


kmouratidis@reddit (OP)

I want this for a server that's meant to be stable(-ish), not sure arch-based distros are the way. Plus I'm not a fan of the rolling release stuff. I tried Manjaro ~4 years ago (as the noob-friendly choice of the time) but didn't like it.


Rich_Repeat_22@reddit

Experiment with DOWNVOLTING and negative curves. AMD is weird on that front, because temperatures reduce, clocks go higher!

Some did it using software on GMK X2 and got 15%. Beelink has fully unlocked bios settings so you can go more.


kmouratidis@reddit (OP)

That's a good point. Not sure if I'll keep it (meant to be a "stable"-ish server), not to mention that I suck at this, but I should at least try it just to see how it works. And yes, the BIOS does seem fully unlocked, I don't think I've ever seen so many options available.


Rich_Repeat_22@reddit

On the Beelink forums, people have shared settings for deactivating things that aren't needed, which massively lowers the overall system power consumption.


kmouratidis@reddit (OP)

Thanks for the pointer! Is it this thread? For disabling devices (SD card reader, WiFi, ...)?


darth_chewbacca@reddit

"Surely the driver situation can't be that bad, right? Actually yes, it's perfectly fine to run Debian and I haven't had an issue yet"

Could you be more clear? Is Debian bug free so far on the machine?


kmouratidis@reddit (OP)

Not sure about "bug free", but as I said in the passage you quote,

"it's perfectly fine to run Debian and I haven't had an issue yet"

The only "issue" was the missing Intel network card drivers, but I installed them manually without a problem. It was easier than building llama.cpp, just a single make install and modprobe ixgbe (or reboot).
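To confirm the driver actually got loaded after the modprobe (or reboot), one option is to look for it in `/proc/modules`, where the first field of each line is the module name. This is a minimal sketch with a hypothetical helper name, and `ixgbe` is simply the module named above.

```python
from pathlib import Path


def module_loaded(name: str, proc_modules_text: str) -> bool:
    """True if `name` is the first field of any line in /proc/modules text."""
    return any(
        line.split()[0] == name
        for line in proc_modules_text.splitlines()
        if line.strip()
    )


if __name__ == "__main__":
    path = Path("/proc/modules")
    if path.exists():
        print("ixgbe loaded:", module_loaded("ixgbe", path.read_text()))
    else:
        print("/proc/modules not found (not a Linux system?)")
```

The same check works for any other module name, which helps when it's unclear which driver a given NIC needs.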

I only used it for ~3 hours today, but no crashes at all (Windows usually crashed after only 3-4 prompts). I even ran the benchmark & stress scripts that DietPi provides and had no issues, but more stress testing will follow.


darth_chewbacca@reddit

"The only "issue" was the missing Intel network card drivers"

Well that's a bummer. Network drivers can be painful if you have to download the driver source and don't have network connection on the machine to do so. I had a recent experience with this on opnsense and an ms-a1. Luckily I have a USB Ethernet dongle that was supported, but yeah, lack of network drivers can be teeth grinding.


kmouratidis@reddit (OP)

Not gonna lie, that was one of my first thoughts too (the other being that maybe I fried the network card or something), but when I saw the wifi was working I realized there were many ways to go about it, it's not <2015 anymore :D

For example you can easily use a USB drive to transfer them, a wifi hotspot from your phone, a USB connection and file transfer from your phone, etc. And you already need a USB drive to install a different OS, no? Plus, it might be a DietPi issue, or a Debian issue, or a my-version-of-those issue. Maybe that's why people suggest the latest Fedora?


Pro-editor-1105@reddit

Looks painfully similar to a mac studio