xenovatech

In-browser tool calling playground, running LFM2 locally on WebGPU with Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 2 comments
PrismML just released Binary and Ternary Bonsai Image 4B: 1-bit/ternary text-to-image diffusion transformers that can even run 100% locally in your browser on WebGPU.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 79 comments
1-bit Bonsai 1.7B (290MB in size) running locally in your browser on WebGPU

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 171 comments
Gemma 4 running fully offline on WebGPU with Transformers.js, controlling Reachy Mini over WebSerial.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 10 comments
New OpenAI Privacy Filter model, running locally in your browser on WebGPU

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 0 comments
Gemma 4 WebGPU: Run Google's new open model locally in your browser

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 2 comments
Kokoro WebGPU: Real-time text-to-speech running 100% locally in your browser.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 103 comments
SAM 3: Segment Anything with Concepts, by Meta Superintelligence Labs

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 22 comments
Cohere Transcribe WebGPU: state-of-the-art multilingual speech recognition in your browser

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 1 comment
Liquid AI's LFM2-24B-A2B running at ~50 tokens/second in a web browser on WebGPU

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 18 comments
Nemotron-3-Nano (4B), new hybrid Mamba + Attention model from NVIDIA, running locally in your browser on WebGPU.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 6 comments
Running Qwen 3.5 0.8B locally in the browser on WebGPU w/ Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 32 comments
Real-time video captioning in the browser with LFM2-VL on WebGPU

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 2 comments
Voxtral WebGPU: Real-time speech transcription entirely in your browser with Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 13 comments
microgpt playground: Build, train, and run LLMs — directly in your browser

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 10 comments
Run LFM2.5-1.2B-Thinking at over 200 tokens per second in your browser on WebGPU

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 11 comments
Text Behind Video: Create cinematic text and video compositions locally in your browser w/ Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 0 comments
GPT-OSS (20B) running 100% locally in your browser on WebGPU

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 26 comments
Supertonic WebGPU: blazingly fast text-to-speech running 100% locally in your browser.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 11 comments
Real-time conversational AI running 100% locally in-browser on WebGPU

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 135 comments
Chatterbox Turbo, new open-source voice AI model, just released on Hugging Face

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 64 comments
FunctionGemma Physics Playground: A simulation game where you need to use natural language to solve physics puzzles... running 100% locally in your browser!

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 17 comments
Ministral WebGPU: Run Mistral's new multimodal models 100% locally in your browser.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 15 comments
Run Qwen3 (0.6B) 100% locally in your browser on WebGPU w/ Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 21 comments
IBM releases Granite-4.0 Nano (300M & 1B), along with a local browser demo showing how the models can programmatically interact with websites and call tools/browser APIs on your behalf.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 42 comments
NanoChat WebGPU: Karpathy's full-stack ChatGPT project running 100% locally in the browser.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 5 comments
Granite Docling WebGPU: State-of-the-art document parsing 100% locally in your browser.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 46 comments
Granite 4.0 Micro (3.4B) running 100% locally in your browser w/ WebGPU acceleration

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 47 comments
In-browser video object detection w/ YOLOv9 and Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 2 comments
Apple releases FastVLM and MobileCLIP2 on Hugging Face, along with a real-time video captioning demo (in-browser + WebGPU)

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 162 comments
DINOv3 semantic video tracking running locally in your browser (WebGPU)

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 19 comments
The Semantic Galaxy: An interactive 3D embedding visualization demo, built with Google's new EmbeddingGemma model

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 11 comments
DINOv3 visualization tool running 100% locally in your browser on WebGPU/WASM

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 34 comments
In-browser background removal w/ RMBG-v1.4 and Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 2 comments
Voxtral WebGPU: State-of-the-art audio transcription directly in your browser!

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 13 comments
Introducing Kokoro.js: a new JavaScript library for running Kokoro TTS (82M) locally in the browser w/ WASM.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 60 comments
WebGPU-accelerated real-time in-browser speech recognition w/ Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 63 comments
I updated the SmolVLM llama.cpp webcam demo to run locally in-browser on WebGPU.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 27 comments
Whisper Timestamped: Multilingual speech recognition w/ word-level timestamps, running locally in your browser using Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 54 comments
Hugging Face Chat Templates, now available in Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 2 comments
Apply formatting to Jinja chat templates directly from the Hugging Face model card (+ new playground)

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 4 comments
ONNX Model Explorer and Visualization Tool

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 0 comments
Whisper Diarization Web: In-browser multilingual speech recognition with word-level timestamps and speaker segmentation

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 32 comments
Introducing Kokoro Web: ML-powered speech synthesis directly in your browser. Now with streaming & WebGPU acceleration.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 19 comments
Running Llama 3.2 100% locally in the browser on WebGPU w/ Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 38 comments
Janus, a new multimodal understanding and generation model from DeepSeek, running 100% locally in the browser on WebGPU with Transformers.js!

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 25 comments
Transformers.js v3 is finally out: WebGPU Support, New Models & Tasks, New Quantizations, Deno & Bun Compatibility, and More…

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 26 comments
Janus Pro 1B running 100% locally in-browser on WebGPU, powered by Transformers.js

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 70 comments
DeepSeek-R1-Distill-Qwen-1.5B running 100% locally in-browser on WebGPU. Reportedly outperforms GPT-4o and Claude-3.5-Sonnet on math benchmarks (28.9% on AIME and 83.9% on MATH).

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 49 comments
SmolVLM 256M: The world's smallest multimodal model, running 100% locally in-browser on WebGPU.

Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 13 comments