Xhehab_
-
Qwen2-VL-2B and Qwen2-VL-7B released under Apache 2.0 license!!
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 5 comments
-
DeepSeek Vision/Multimodal
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Distillation when you do it. Training when we do it.
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 226 comments
-
Llama 4 is going to be SOTA
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 254 comments
-
LongCat-Flash-Thinking
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 26 comments
-
DeepSeek-V3.1-Terminus
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 0 comments
-
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching [Best OS TTS Yet!]
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 73 comments
-
Meta Set to Release Llama 4 This Month, per The Information & Reuters
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 76 comments
-
FishSpeech v1.5 - multilingual, zero-shot instant voice cloning, low-latency, only 500M params - ranked #2 on TTS-Arena
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 23 comments
-
Qwen3-Coder
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 202 comments
-
OpenAI GPT OSS: 21B & 117B models (3.6B & 5.1B active)
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 8 comments
-
OpenAI GPT OSS 21B and 117B total parameters, with 3.6B and 5.1B active parameters [Apache 2.0, with a small complementary use policy]
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Qwen-Image - a 20B MMDiT model
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 24 comments
-
DeepSeek R1 0528 Ties Claude Opus 4 for #1 in WebDev Arena - [Ranks #6 Overall, #2 in Coding, #4 in Hard Prompts, & #5 in Math]
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 40 comments
-
DeepSeek R1 0528 Hits 71% (+14.5 pts from R1) on Aider Polyglot Coding Leaderboard
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 108 comments
-
DeepSeek-R1-0528 Official Benchmarks Released!!!
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 155 comments
-
DeepSeek-R1-0528 🔥
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 104 comments
-
Zonos-v0.1 beta by Zyphra, featuring two expressive and real-time text-to-speech (TTS) models with high-fidelity voice cloning. 1.6B transformer and 1.6B hybrid under an Apache 2.0 license.
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 146 comments
-
R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14X less cost compared to o1
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 46 comments
-
C4AI Command A 111B
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 9 comments
-
🇨🇳 Sources: DeepSeek is speeding up the release of its R2 AI model, which was originally slated for May, but the company is now working to launch it sooner.
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 140 comments
-
LlamaCon on April 29: Meta to share the latest on Open Source AI developments
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 10 comments
-
Anthropic CEO is coping and seething over DeepSeek
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 130 comments
-
Llama 4 Models are Training on a Cluster Bigger Than 100K H100s: Launching early 2025 with new modalities, stronger reasoning & much faster
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 212 comments
-
Tülu 3 -- a set of state-of-the-art instruct models with fully open data, eval code, and training algorithms
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 42 comments
-
Zamba 2 2.7B & 1.2B Instruct - Mamba 2 based & Apache 2.0 licensed - beats Gemma 2 2.6B & Mistral 7B Instruct-v0.1
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 32 comments
-
Qwen2-VL-2B and Qwen2-VL-7B released under Apache 2.0!!
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 0 comments
-
MiniCPM-V 2.6 Now Works with KoboldCpp (+Setup Guide)
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 26 comments
-
Zamba2-2.7B > Outperforms Phi2 2.7B, Danube3 4B, and StableLM 3B
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 14 comments
-
WizardLM 3 is coming soon 🔥
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 83 comments
-
Scale AI are introducing high quality arenas, with... - private datasets (=can't be gamed) - paid annotators for the rankings (=fairer and higher quality annotations)
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 34 comments
-
Release WizardCoder 13B, 3B, and 1B models!
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 2 comments
-
gpt2-chatbot might be Phi-3 14B (medium)!! Dropping in a couple weeks with 7B (small) too!
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 90 comments
-
WizardLM-2
Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 2 comments