[Benchmark] If you want a portable Strix Halo: here is my test of the Asus ProArt PX13 with Qwen3.5 & Gemma4

Posted by Willing-Toe1942@reddit | LocalLLaMA | View on Reddit | 8 comments

I wanted a powerhouse on the go, and after some research and weighing the options I went for the Asus PX13 ProArt (GoPro edition), which is basically Strix Halo (AMD Ryzen AI Max+ 395) with 128 GB of RAM.

This little 13-inch laptop has an amazing form factor with an all-metal body, and it's basically the lightest and most portable thing you can have to run LLMs on the go.

So I immediately removed Windows, installed CachyOS, and started the benchmarks in 3 power modes (selected from the GNOME control center). I couldn't wait to share the results with this amazing community :D

Here are the initial Qwen3.5 benchmarks, with noise level and measured temperature (nvtop and amdgpu_top).

[Image: PX13 ProArt]

## Command run in the llama-vulkan-radv toolbox

llama-bench -m Qwen3.5-35B-A3B-UD-IQ3_XXS.gguf -p 512,1024,2048,4096,8192,16384,32768 -t 512
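
If you want to rerun this and post-process the numbers instead of copying tables by hand, here is a rough Python sketch that assembles the same command and asks llama-bench for JSON output. The helper names (`build_cmd`, `summarise`, `run_bench`) are made up for illustration. The `-o json` option and the `n_prompt`, `n_gen`, `avg_ts`, `stddev_ts` fields are what I understand llama-bench to emit, so treat them as assumptions and check them against your build.

```python
import json
import subprocess

MODEL = "Qwen3.5-35B-A3B-UD-IQ3_XXS.gguf"
PROMPT_SIZES = [512, 1024, 2048, 4096, 8192, 16384, 32768]


def build_cmd(model=MODEL, prompt_sizes=PROMPT_SIZES, threads=512):
    """Return the argv for the run in the post, plus JSON output (assumed flag: -o json)."""
    return [
        "llama-bench",
        "-m", model,
        "-p", ",".join(str(n) for n in prompt_sizes),
        "-t", str(threads),
        "-o", "json",
    ]


def summarise(records):
    """Map each test name (pp512, tg128, ...) to (average t/s, stddev t/s)."""
    out = {}
    for r in records:
        # Assumed convention: prompt tests have n_gen == 0, generation tests have n_prompt == 0.
        name = f"pp{r['n_prompt']}" if r["n_gen"] == 0 else f"tg{r['n_gen']}"
        out[name] = (r["avg_ts"], r["stddev_ts"])
    return out


def run_bench():
    """Run llama-bench (must be on PATH) and return the summarised results."""
    raw = subprocess.run(build_cmd(), capture_output=True, text=True, check=True).stdout
    return summarise(json.loads(raw))
```

Calling `run_bench()` inside the toolbox should give a dictionary keyed by test name, which makes it easier to compare power modes side by side.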

Application used for power and temperature monitoring: amdgpu_top

Noise measurement: taken with a mobile phone 30 cm away from the laptop (about the distance from your body to the laptop)

The Gemma4 benchmarks are baking right now; I will add them here later.

Power mode: Performance
Reported power consumption: 66 to 73 W
Reported temp (peak): 77 °C
Fan noise measured 30 cm away: 47 dB

| model | size | params | backend | ngl | threads | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: | ---: |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp512 | 1007.05 ± 11.05 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp1024 | 972.53 ± 6.84 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp2048 | 938.87 ± 3.66 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp4096 | 901.94 ± 5.16 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp8192 | 870.25 ± 2.89 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp16384 | 784.83 ± 2.00 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp32768 | 644.06 ± 5.39 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | tg128 | 69.00 ± 0.28 |
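
To put the prompt-processing numbers in more practical terms, a bit of arithmetic turns t/s into the approximate wait before the first token. This is only a back-of-envelope sketch: it assumes the whole prompt is processed at the average rate llama-bench reported for that size, and the dictionary simply copies the Performance-mode rows above. The function name `prefill_seconds` is my own.

```python
# Performance-mode pp results from the table above: prompt tokens -> t/s
PP_PERFORMANCE = {
    512: 1007.05,
    1024: 972.53,
    2048: 938.87,
    4096: 901.94,
    8192: 870.25,
    16384: 784.83,
    32768: 644.06,
}


def prefill_seconds(n_tokens, tokens_per_second):
    """Time to process a prompt of n_tokens at a constant average rate."""
    return n_tokens / tokens_per_second


for n, tps in PP_PERFORMANCE.items():
    print(f"pp{n:<6} {prefill_seconds(n, tps):6.1f} s")
```

On these figures a 512-token prompt takes about half a second, while a 32768-token prompt takes roughly 51 seconds before generation starts.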

Power mode: Balanced
Reported power consumption: 49 to 55 W
Reported temp (peak): 68 °C
Fan noise measured 30 cm away: 39 dB

| model | size | params | backend | ngl | threads | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: | ---: |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp512 | 809.28 ± 14.25 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp1024 | 798.39 ± 4.99 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp2048 | 800.93 ± 2.92 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp4096 | 802.36 ± 4.62 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp8192 | 790.08 ± 4.04 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp16384 | 727.97 ± 2.63 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp32768 | 614.02 ± 1.22 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | tg128 | 68.67 ± 0.93 |

Power mode: Power saving
Reported power consumption: 38 to 40 W
Reported temp (peak): 62 °C
Fan noise measured 30 cm away: 32 dB

| model | size | params | backend | ngl | threads | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: | ---: |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp512 | 725.47 ± 21.19 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp1024 | 727.55 ± 8.75 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp2048 | 707.59 ± 8.67 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp4096 | 673.13 ± 10.74 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp8192 | 610.91 ± 16.36 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp16384 | 488.11 ± 9.62 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | pp32768 | 407.35 ± 12.66 |
| qwen35moe 35B.A3B IQ3_XXS - 3.0625 bpw | 12.17 GiB | 34.66 B | Vulkan | 99 | 512 | tg128 | 55.34 ± 0.13 |
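
One way to compare the three power modes is throughput per watt. The sketch below copies a few rows from the three tables above and divides t/s by the midpoint of each reported power range. That midpoint is an assumption on my part: the power figures were reported as ranges for the whole run, not measured per test, so the results are rough. The names `midpoint_watts` and `tokens_per_joule` are hypothetical helpers.

```python
# mode -> (low W, high W, pp512 t/s, pp32768 t/s, tg128 t/s), copied from the tables above
MODES = {
    "Performance": (66, 73, 1007.05, 644.06, 69.00),
    "Balanced": (49, 55, 809.28, 614.02, 68.67),
    "Power saving": (38, 40, 725.47, 407.35, 55.34),
}


def midpoint_watts(low, high):
    """Assumed average draw: the middle of the reported power range."""
    return (low + high) / 2


def tokens_per_joule(tokens_per_second, watts):
    """t/s divided by J/s (watts) gives tokens per joule."""
    return tokens_per_second / watts


for mode, (low, high, pp512, pp32k, tg128) in MODES.items():
    w = midpoint_watts(low, high)
    print(
        f"{mode:<13} ~{w:4.1f} W  "
        f"tg128 {tokens_per_joule(tg128, w):.2f} tok/J  "
        f"pp512 {tokens_per_joule(pp512, w):.1f} tok/J"
    )
```

On these numbers, Balanced keeps tg128 essentially unchanged (68.67 vs 69.00 t/s) at about three quarters of the midpoint power, while Power saving gives the best tokens per joule but loses around 20% of generation speed.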


Weird_Linux_Nerd_07@reddit

Can you test Unsloth's Qwen3.5-35B-A3B-UD-Q8_K_XL and Qwen3.5-122B-A10B-GGUF-UD-Q4_K_XL with ctx-size = 131072 (and 262144), flash-attn = on, kv-unified = true, cache-type-k = q8_0, cache-type-v = q8_0, chat-template-kwargs = {"enable_thinking":false}?

xspider2000@reddit

Very interesting. Can you also test qwen3.5-120b-a10b?

Look_0ver_There@reddit

Not sure why you're testing with IQ3_XXS on a 128GB machine, but you should also at least turn on flash-attention (-fa 1)

unbannedfornothing@reddit

why 512 threads?

JamesEvoAI@reddit

Strix Halo is pretty great, but I have to imagine that's got to be getting bottlenecked by thermals in a laptop form factor. I have yet to hear the fans on my Framework Desktop

Willing-Toe1942@reddit (OP)

Thermals are good so far (max of 75 °C) after about 1 hour of continuous testing across the 3 power levels.

And yeah, the fans get really loud once performance mode is enabled; I recorded noise up to 50 dB.

asfbrz96@reddit

Why use IQ3 with 128 GB of memory, though?

I was doing a comparison between IQ3 and Unsloth UD, so I started with the smallest one; the larger ones will be tested later.