SmolLM3 has day-0 support in MistralRS!
Posted by EricBuehler@reddit | LocalLLaMA | View on Reddit | 5 comments
It's a SoTA 3B model with hybrid reasoning and 128k context.
Hits ⚡105 T/s with AFQ4 on an M3 Max.
Link: https://github.com/EricLBuehler/mistral.rs
Using MistralRS means that you get:
- Built-in MCP client
- OpenAI-compatible HTTP server
- Python & Rust APIs
- Full multimodal inference engine (in: image, audio, text; out: image, audio, text)
Super easy to run:
./mistralrs_server -i run -m HuggingFaceTB/SmolLM3-3B
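If you'd rather hit it over HTTP, the server also exposes an OpenAI-compatible API. A minimal sketch, assuming the server was started in HTTP mode on port 1234 (check the repo README for the exact flags, which may differ between releases):

```python
# Query mistral.rs's OpenAI-compatible server with the standard `openai` client.
# Assumes the server is listening on localhost:1234; adjust to your setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="HuggingFaceTB/SmolLM3-3B",  # served model name; may differ in your config
    messages=[{"role": "user", "content": "Summarize SmolLM3 in two sentences."}],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```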
What's next for MistralRS? Full Gemma 3n support, multi-device backend, and more. Stay tuned!
uhuge@reddit
Is https://pypi.org/project/mistralrs/ the easiest way to test this on Linux (Ubuntu)?
EricBuehler@reddit (OP)
Not yet, the release isn't out! The Python package should be installed to match whatever GPU or CPU acceleration you have available: mistralrs-cuda, mistralrs-mkl, mistralrs-metal, etc.
The release will land in a few days, alongside Gemma 3n. Check back then, or you can install from source in the meantime!
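Once the right package is installed, usage looks roughly like this. A minimal sketch based on the mistralrs Python API (Runner, Which.Plain, and ChatCompletionRequest are names from that package; exact signatures may vary by version):

```python
# Rough sketch of loading SmolLM3 through the mistralrs Python bindings.
# Treat the exact class names/arguments as illustrative; check the package docs.
from mistralrs import Runner, Which, ChatCompletionRequest

# Load the model straight from the Hugging Face Hub.
runner = Runner(which=Which.Plain(model_id="HuggingFaceTB/SmolLM3-3B"))

res = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="smollm3",  # label only; the loaded model is what answers
        messages=[{"role": "user", "content": "Hello! What can you do?"}],
        max_tokens=128,
        temperature=0.7,
    )
)
print(res.choices[0].message.content)
```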
TonsillarRat6@reddit
I'm curious if this has been updated already??
Green-Ad-3964@reddit
Is this good for RAG operations?
EricBuehler@reddit (OP)
Absolutely! Long context + tool calling + reasoning are all great factors for RAG.
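As a minimal RAG-style sketch over the OpenAI-compatible endpoint (the retrieve() helper, port, and model name below are placeholders, not part of mistral.rs):

```python
# Minimal RAG sketch: retrieve chunks, stuff them into the prompt, and ask
# SmolLM3 via mistral.rs's OpenAI-compatible server. retrieve() is a
# hypothetical placeholder for your own vector-store lookup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

def retrieve(query: str) -> list[str]:
    # Placeholder: swap in FAISS, Qdrant, or whatever store you use.
    return ["SmolLM3 is a 3B model with hybrid reasoning and 128k context."]

question = "How long is SmolLM3's context window?"
context = "\n\n".join(retrieve(question))

resp = client.chat.completions.create(
    model="HuggingFaceTB/SmolLM3-3B",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
    max_tokens=200,
)
print(resp.choices[0].message.content)
```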