Sipeed's K3 RISC-V SBCs can run 30B-parameter LLMs: 60 TOPS (INT4), supports BF16/FP16/INT4
Posted by MundanePercentage674@reddit | LocalLLaMA | 8 comments
Toastti@reddit
Looks like the CPU cores on this board are slower than an Intel Core 2 Duo in single-core performance.
Guess the vector math cores will help here. But I'm guessing the 15 tokens/s they say Qwen 3.5 35B runs at is at very low context. This thing seems very slow.
On a 16GB Raspberry Pi you can fit a 2-bit quant of Qwen 35B, at about 4 tokens a second.
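As a rough sanity check on the "fits in 16GB" claim: the weight footprint of a quantized model is roughly parameters × bits per weight / 8 bytes. A minimal sketch, assuming decimal gigabytes and ignoring KV cache, quantization metadata, and runtime overhead (the function name is made up for illustration, and real "2-bit" quants usually average somewhat more than 2 bits per weight):

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal GB.

    Assumption: every weight is stored at exactly `bits_per_weight` bits.
    Real quant formats add scales/zero-points, so treat this as a lower bound.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    params = 35e9  # the 35B model discussed in the thread
    for bits in (2, 4, 16):
        print(f"{bits:>2}-bit: {weight_footprint_gb(params, bits):.2f} GB")
```

By this estimate the 2-bit weights come to about 8.75 GB, which leaves room on a 16GB board, while a 4-bit quant (about 17.5 GB) would not fit.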
sleepingsysadmin@reddit
$600 for 15TPS on 35b?
lol?
genericgod@reddit
My 8-year-old i5 mini PC has the same speed. What?
Hot-Employ-3399@reddit
Worse. It's "up to 15 tps". Translated to plain English: "the moment context size > 5 tokens, you're fucked".
cafedude@reddit
"Up To 32 GB Memory Support" lol
FullOf_Bad_Ideas@reddit
I'm curious about power efficiency; it's not mentioned, but it should be great.
I think this sort of hardware is meant to be deployed at the edge, for example in a shopping mall info kiosk, on a train, or in some sort of waiting room. It's not meant for consumers. I think the price is fine if their inference framework is maintained.
Several-Tax31@reddit
Not bad, but a bit expensive? A potato with a CPU-only setup can run the Qwen 3.5 MoEs at that kind of t/s. If they upgrade it to run SOTA models like Kimi or GLM, I would obviously buy one (probably unlikely). But overall, I'm very happy with all kinds of hardware advancements after the RAM/SSD shortages.
Lemondifficult22@reddit
Holy fk