Sipeed's K3 RISC-V SBCs can run 30B-parameter LLMs: 60 TOPS (INT4), supports BF16/FP16/INT4

Posted by MundanePercentage674@reddit | LocalLLaMA | 8 comments

https://wccftech.com/sipeed-crams-32gb-lpddr5-60-tops-npu-compact-risc-v-board-hits-15-tokens-s-ai-llms/

Toastti@reddit

Looks like the CPU cores on this board are slower than an Intel Core 2 Duo in single-core performance.

Guess the vector math cores will help here. But I'm guessing the 15 tokens/s they say Qwen 3.5 35B runs at is at very low context. This thing seems very slow.

On a 16GB Raspberry Pi you can fit a 2-bit quant of Qwen 35B, at about 4 tokens a second.
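For anyone who wants to check the "fits in 16GB" part, here is a rough back-of-the-envelope sketch. It assumes the weights take exactly the nominal bits per parameter and ignores KV cache, activations, and OS overhead; real quant formats usually store somewhat more than the nominal bit width, so treat the numbers as a lower bound. The helper name `weight_footprint_gb` is made up for this illustration.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at exactly bits_per_weight bits,
    with no per-block scales or other quantization metadata.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Compare a 35B-parameter model at a few nominal bit widths.
for bits in (2, 4, 16):
    print(f"35B at {bits}-bit: ~{weight_footprint_gb(35, bits):.2f} GB")
```

By this estimate a 2-bit 35B model needs roughly 8.75 GB for weights, which leaves headroom on a 16GB board, while a 4-bit version (about 17.5 GB) would not fit there but would fit in the 32GB mentioned in the linked article's URL.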

genericgod@reddit

My 8-year-old i5 mini PC has the same speed. What?

Hot-Employ-3399@reddit

Worse. It's "up to 15 tps". Translated to plain English: "the moment context size > 5 tokens, you're fucked."

FullOf_Bad_Ideas@reddit

I'm curious about power efficiency. It's not mentioned, but it should be great.

I think this sort of hardware is meant to be deployed at the edge, for example in a shopping mall info kiosk, on a train, or in some sort of waiting room. It's not meant for consumers. I think the price is fine if their inference framework is maintained.

Several-Tax31@reddit

Not bad, but a bit expensive? A potato with a CPU-only setup can run Qwen 3.5 MoEs at that kind of t/s. If they upgrade it to run SOTA models like Kimi or GLM, I would obviously buy one (probably unlikely). But overall, I'm very happy with all kinds of hardware advancements after the RAM/SSD shortages.

Lemondifficult22@reddit

Holy fk