On the ASUS ROG Flow Z13 128GB (2025): How many tok/sec on LM Studio using Gemma 4 26B A4B MoE with a one-sentence question?

Posted by br_web@reddit | LocalLLaMA | 2 comments

Question: What is an LLM?

How many seconds did it think?
How many tokens/sec?
How many tokens?
Elapsed time?

Thanks

Linkpharm2@reddit

For comparison, 4080 at q3_s: 110t/s, roughly 1300 tokens.

Middle_Bullfrog_6173@reddit

Eh, why not. Thought for 18.07s, 40.94s total time, 1440 tokens, 35.15 tokens/s.
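
A quick sanity check on those figures, as a minimal sketch. It assumes the reported rate covers all 1440 tokens, thinking included; the thread does not say how LM Studio computes it, so treat that as a guess that happens to fit the numbers.

```python
# Figures reported in the comment above.
thinking_s = 18.07
total_s = 40.94
tokens = 1440
reported_tps = 35.15

# Generation time implied by the rate, ASSUMING it covers all 1440 tokens
# (thinking plus answer). This is an assumption, not something the thread states.
implied_s = tokens / reported_tps

# Time left for the visible answer once thinking finished.
answer_s = total_s - thinking_s

print(f"implied generation time: {implied_s:.2f} s (reported total: {total_s} s)")
print(f"answer time after thinking: {answer_s:.2f} s")
```

The implied time lands within a few hundredths of a second of the reported total, so the rate and the token count look mutually consistent under that assumption.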

This is Q8; I have been waiting for the dust to settle before testing anything smaller. A smaller quant would be faster, since generation is mostly memory-bandwidth limited. I'm also running test workloads on it with 4 slots and getting about double the throughput.
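
To illustrate the bandwidth argument, here is a rough back-of-the-envelope sketch. Everything in it is an assumption rather than a measurement from this thread: the 256 GB/s bandwidth is a placeholder to replace with your machine's real figure, "A4B" is read as roughly 4B active parameters per token, and Q8 and a 4-bit quant are approximated as 1 and 0.5 bytes per weight. It also ignores KV cache reads and any compute overhead, so it gives a ceiling, not a prediction.

```python
def decode_ceiling_tps(bandwidth_gb_s, active_params_billion, bytes_per_weight):
    """Upper bound on decode tokens/s if each active weight is read once per token."""
    bytes_per_token = active_params_billion * 1e9 * bytes_per_weight
    return bandwidth_gb_s * 1e9 / bytes_per_token


BANDWIDTH_GB_S = 256.0  # ASSUMPTION: placeholder memory bandwidth, not from the thread
ACTIVE_PARAMS_B = 4.0   # ASSUMPTION: "A4B" taken to mean ~4B active parameters per token

# ASSUMPTION: ~1 byte per weight for Q8, ~0.5 byte per weight for a 4-bit quant.
q8_ceiling = decode_ceiling_tps(BANDWIDTH_GB_S, ACTIVE_PARAMS_B, 1.0)
q4_ceiling = decode_ceiling_tps(BANDWIDTH_GB_S, ACTIVE_PARAMS_B, 0.5)

print(f"Q8 ceiling:    {q8_ceiling:.0f} tok/s")
print(f"4-bit ceiling: {q4_ceiling:.0f} tok/s")
```

Under this simple model, halving the bytes per weight doubles the ceiling, which is why a smaller quant should be faster when decoding is bandwidth bound. Measured speeds will sit below the ceiling.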