On the ASUS ROG Flow Z13 128GB (2025): How many tok/sec on LM Studio using Gemma 4 26B A4B MoE with a one sentence question?
Posted by br_web@reddit | LocalLLaMA | 2 comments
Question: What is an LLM?
- How many seconds did it think?
- How many tokens/sec?
- How many tokens?
- Elapsed time?
Thanks
Linkpharm2@reddit
For comparison, a 4080 at q3_s: 110 t/s, roughly 1300 tokens.
Middle_Bullfrog_6173@reddit
Eh, why not. Thought for 18.07s, 40.94s total time, 1440 tokens, 35.15 tokens/s.
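A quick sanity check on how those figures relate, as a minimal sketch. The variable names are illustrative and not anything LM Studio exposes, and it assumes the total time covers the thinking phase as well as the answer:

```python
# Figures reported in the comment above.
thinking_s = 18.07
total_s = 40.94
tokens = 1440
reported_tps = 35.15

# Average rate if the tokens are spread over the whole elapsed time.
overall_tps = tokens / total_s

# Elapsed time implied by the reported rate.
implied_s = tokens / reported_tps

# Share of the elapsed time spent thinking.
thinking_share = thinking_s / total_s

print(f"overall: {overall_tps:.2f} tok/s")
print(f"implied time: {implied_s:.2f} s")
print(f"thinking share: {thinking_share:.0%}")
```

The computed rate lands within a fraction of a percent of the reported one, so the token count, total time, and tokens/s agree with each other.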
This is Q8; I have been waiting for the dust to settle before testing anything smaller. A smaller quant would be faster, since it's mostly bandwidth limited. I'm also running test workloads on it with 4 slots and getting about double the throughput.
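To put a rough number on the bandwidth-limited point, here is a minimal sketch. It assumes decoding is purely memory-bandwidth bound, so time per token scales with the bytes of weights read, and it uses nominal bit widths (8 and 4) that ignore the scale metadata real quant formats carry. `relative_speedup` is a hypothetical helper, and the result is an idealized ceiling rather than a measurement:

```python
def relative_speedup(bits_from: float, bits_to: float) -> float:
    """Idealized speed-up from shrinking the weights, assuming time per
    token is proportional to the bytes of weights read from memory."""
    if bits_from <= 0 or bits_to <= 0:
        raise ValueError("bit widths must be positive")
    return bits_from / bits_to


measured_q8_tps = 35.15  # Q8 figure reported in the comment above

# Nominal widths only (assumption); real gains are usually smaller.
est_q4_tps = measured_q8_tps * relative_speedup(8, 4)
print(f"idealized 4-bit estimate: {est_q4_tps:.1f} tok/s")
```

Compute overhead, KV-cache reads, and the extra bits per weight in real quant formats all pull the actual figure below this estimate.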