Running SmolLM Instruct on-device in six different ways

Posted by hackerllama@reddit | LocalLLaMA | 4 comments

Hi all! Chief Llama Officer from HF here 🫡🦙

The team went a bit wild during the weekend and decided to release SmolLM Instruct V0.2 on Sunday: 135M, 360M, and 1.7B instruct models with an Apache 2.0 license and open fine-tuning scripts and data so anyone can reproduce them. Of course, the models are great for running on-device. Here are six ways to try them out:

1. Instant SmolLM using MLC with real-time generation. Try it running on the web (but locally!) [here](https://huggingface.co/spaces/HuggingFaceTB/instant-smollm).
2. Run in the browser with WebGPU (if you have a supported browser) with transformers.js [here](https://huggingface.co/spaces/HuggingFaceTB/SmolLM-360M-Instruct-WebGPU).
3. If you don't have WebGPU, you can use Wllama, which uses GGUF and WebAssembly to run in the browser. You can try it [here](https://huggingface.co/spaces/ngxson/wllama).
4. You can also try out the base model through the [SmolPilot demo](https://huggingface.co/spaces/cfahlgren1/SmolPilot).
5. If you're more of the interactive running folks, you can try this two-line setup: `pip install trl`, then `trl chat --model_name_or_path HuggingFaceTB/smollm-360M-instruct --device cpu`
6. The good ol' reliable llama.cpp.

All models + MLC/GGUF/ONNX formats can be found at [https://huggingface.co/collections/HuggingFaceTB/local-smollms-66c0f3b2a15b4eed7fb198d0](https://huggingface.co/collections/HuggingFaceTB/local-smollms-66c0f3b2a15b4eed7fb198d0)

Let's go! 🚀
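To put "great for running on-device" in rough numbers, here is a minimal back-of-the-envelope sketch of how much memory the weights alone might need for the three sizes mentioned above. It assumes memory is simply parameter count times bytes per parameter; the precision labels and the helper names (`weight_megabytes`, `estimate_table`) are illustrative assumptions, not something from the release. Real usage will be higher once the KV cache, activations, and quantization metadata are counted.

```python
# Rough weight-memory estimate for the three SmolLM sizes named in the post.
# Assumption: memory ~= parameter count x bytes per parameter.
# This ignores KV cache, activations, and quantization metadata.

# Nominal parameter counts taken from the model names (135M, 360M, 1.7B).
PARAMS = {"135M": 135e6, "360M": 360e6, "1.7B": 1.7e9}

# Illustrative precisions (assumed, not tied to any specific file format).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_megabytes(n_params: float, bytes_per_param: float) -> float:
    """Return the approximate size of the weights in megabytes (10^6 bytes)."""
    return n_params * bytes_per_param / 1e6


def estimate_table() -> dict:
    """Build a {model size: {precision: megabytes}} table for all combinations."""
    return {
        size: {prec: weight_megabytes(n, b) for prec, b in BYTES_PER_PARAM.items()}
        for size, n in PARAMS.items()
    }


if __name__ == "__main__":
    for size, row in estimate_table().items():
        cells = ", ".join(f"{prec}: {mb:,.0f} MB" for prec, mb in row.items())
        print(f"SmolLM {size} -> {cells}")
```

By this estimate, even the 1.7B model at 16-bit precision needs a few gigabytes for weights, while the two smaller models stay well under one gigabyte, which is consistent with them running in a browser tab or on a phone.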

4 Comments


Ill-Still-6859@reddit

For phone devices, there are these, too:

**PocketPal AI:**
- iOS: [https://apps.apple.com/us/app/pocketpal-ai/id6502579498](https://apps.apple.com/us/app/pocketpal-ai/id6502579498)
- Android: [https://play.google.com/store/apps/details?id=com.pocketpalai&pcampaignid=web_share](https://play.google.com/store/apps/details?id=com.pocketpalai&pcampaignid=web_share)
- Source: [https://github.com/a-ghorbani/pocketpal-ai](https://github.com/a-ghorbani/pocketpal-ai)

**ChatterUI**: [https://github.com/Vali-98/ChatterUI](https://github.com/Vali-98/ChatterUI)

codenamev@reddit

I _really_ ❤️ you folks. Thank you for this! Got stuck on a few issues fine-tuning SmolLM and moved to Phi, but will give this another go. Any suggestions/guidelines for sourcing training data for code? Trying to get a good pipeline for Ruby.

loubnabnl@reddit

The current SmolLM models are primarily trained on Python; we will include more languages in the next iteration. For Ruby, you might have better luck with small code models such as [https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base), [https://huggingface.co/bigcode/starcoder2-3b](https://huggingface.co/bigcode/starcoder2-3b), and [https://huggingface.co/google/codegemma-2b](https://huggingface.co/google/codegemma-2b).

estrafire@reddit

Which of the local web apps should we expect to perform best on modern high-end devices?

