Xhehab_

Qwen2-VL-2B and Qwen2-VL-7B under Apache 2.0 license released!!

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 5 comments
DeepSeek Vision/Multimodal 👀

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 0 comments
Distillation when you do it. Training when we do it.

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 226 comments
Llama 4 is going to be SOTA

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 254 comments
LongCat-Flash-Thinking

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 26 comments
DeepSeek-V3.1-Terminus

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 0 comments
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching [Best OS TTS Yet!]

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 73 comments
Meta Set to Release Llama 4 This Month, per The Information & Reuters

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 76 comments
FishSpeech v1.5 - multilingual, zero-shot instant voice cloning, low-latency, only 500M params - #2 ranked on TTS-Arena

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 23 comments
Qwen3-Coder 👀

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 202 comments
OpenAI GPT OSS: 21B & 117B models (3.6B & 5.1B active)

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 8 comments
OpenAI GPT OSS 21B and 117B total parameters, with 3.6B and 5.1B active parameters [Apache 2.0, with a small complementary use policy]

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 0 comments
Qwen-Image — a 20B MMDiT model

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 24 comments
DeepSeek R1 0528 Ties Claude Opus 4 for #1 in WebDev Arena — [Ranks #6 Overall, #2 in Coding, #4 in Hard Prompts, & #5 in Math]

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 40 comments
DeepSeek R1 0528 Hits 71% (+14.5 pts from R1) on Aider Polyglot Coding Leaderboard

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 108 comments
DeepSeek-R1-0528 Official Benchmarks Released!!!

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 155 comments
DeepSeek-R1-0528 🔥

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 104 comments
Zonos-v0.1 beta by Zyphra, featuring two expressive and real-time text-to-speech (TTS) models with high-fidelity voice cloning. 1.6B transformer and 1.6B hybrid under an Apache 2.0 license.

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 146 comments
R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost compared to o1

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 46 comments
C4AI Command A 111B

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 9 comments
🇨🇳 Sources: DeepSeek is speeding up the release of its R2 AI model, which was originally slated for May, but the company is now working to launch it sooner.

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 140 comments
LlamaCon on April 29: Meta to share the latest on Open Source AI developments

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 10 comments
Anthropic CEO is coping and seething over DeepSeek

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 130 comments
Llama 4 Models are Training on a Cluster Bigger Than 100K H100s: Launching early 2025 with new modalities, stronger reasoning & much faster

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 212 comments
Tülu 3 -- a set of state-of-the-art instruct models with fully open data, eval code, and training algorithms

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 42 comments
Zamba 2 2.7B & 1.2B Instruct - Mamba 2 based & Apache 2.0 licensed - beats Gemma 2 2.6B & Mistral 7B Instruct-v0.1

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 32 comments
Qwen2-VL-2B and Qwen2-VL-7B under Apache 2.0 released!!

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 0 comments
MiniCPM-V 2.6 Now Works with KoboldCpp (+Setup Guide)

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 26 comments
Zamba2-2.7B > Outperforms Phi2 2.7B, Danube3 4B, and StableLM 3B

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 14 comments
WizardLM 3 is coming soon 👀🔥

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 83 comments
Scale AI are introducing high quality arenas, with... - private datasets (=can't be gamed) - paid annotators for the rankings (=fairer and higher quality annotations)

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 34 comments
✅Release WizardCoder 13B, 3B, and 1B models!

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 2 comments
gpt2-chatbot might be Phi-3 14B (medium)!! Dropping in a couple weeks with 7B (small) too!

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 90 comments
WizardLM-2

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 2 comments