XMasterrrr

GPU Memory Math for LLMs (2026 Edition)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 5 comments
AMA Announcement: Nous Research, The Opensource Lab Behind Hermes Agent (Wednesday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 18 comments
Stop Wasting Your Multi-GPU Setup With llama.cpp: Use vLLM or ExLlamaV2 for Tensor Parallelism

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 112 comments
AMA Announcement: StepFun AI, The Opensource Lab Behind Step-3.5-Flash Model (Thursday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 14 comments
AMA Announcement: MiniMax, The Opensource Lab Behind MiniMax-M2 + Gifts to Our Community (Wednesday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 9 comments
AMA Announcement: MiniMax, The Opensource Lab Behind MiniMax-M2.5 SoTA Model (Friday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 28 comments
MiniMax on X: Weights dropping REALLY, REALLY, SOON

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 10 comments
AMA Announcement: Moonshot AI, The Opensource Frontier Lab Behind Kimi K2.5 SoTA Model (Wednesday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 4 comments
AMA Announcement: Z.ai, The Opensource Lab Behind GLM-4.7 (Tuesday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 6 comments
AMA With Z.AI, The Lab Behind GLM Models

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 354 comments
AMA Announcement: Moonshot AI, The Opensource Frontier Lab Behind Kimi K2 Thinking SoTA Model (Monday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 47 comments
Home Server Final Boss: 14x RTX 3090 Build

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 291 comments
AMA Announcement: Prime Intellect — The Open‑Source Distributed Training Lab (Thu, Oct 2 • 10 AM – 1 PM PDT)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 3 comments
Now I need to explain this to her...

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 517 comments
Our 4th AMA: The LMStudio Team! (Thursday, 11 AM-1 PM PDT)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 4 comments
Our 3rd AMA: Unsloth Team, Creators of the lightning-fast Unsloth fine-tuning library! (Wednesday, 10 AM-1 PM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 28 comments
Our 2nd AMA: Hugging Face Science Team, Creators of SmolLM, SmolVLM, and more! (Tomorrow, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 10 comments
GLM-4.5 is now leading the Berkeley Function-Calling Leaderboard V4, Beating Opus 4

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 32 comments
Launching Our New AMA Series With Z.AI, Creators of GLM (Tomorrow, 9AM-12PM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 29 comments
Qwen 2.5 (7B/14B/32B) Finetunes Outperforming Opus 4 & Sonnet 4/3.5 on Out-of-Distribution Tasks with RL --- Code, Weights, Data, and Paper Released

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 29 comments
DFloat11 Quantization for Qwen-Image Drops – Run It on 17GB VRAM with CPU Offloading!

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 18 comments
I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments
So You Want to Learn LLMs? Here's the Roadmap

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 1 comment
The Scariest Thing In LLMs/AI Isn't the Models or the Math... It's the Names.

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 23 comments
o3-mini won the poll! We did it guys!

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 245 comments
TraceBack: A Novel Reverse Reasoning Model for Better and Cheaper Scaling of Synthetic Reasoning Generation

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 17 comments
TraceBack: Novel Reverse Reasoning for Better and Cheaper Scaling of Synthetic Reasoning Generation

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 1 comment
I Live-Streamed DeepSeek R-1 671B-q4 Running w/ KTransformers on Epyc 7713, 512GB RAM, and 14x RTX 3090s

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 108 comments
DeepSeek silently released their DeepSeek-Coder-V2-Instruct-0724, which ranks #2 on Aider LLM Leaderboard, and it beats DeepSeek V2.5 according to the leaderboard

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 42 comments
DeepSeek API: Every Request Is A Timeout :(

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 106 comments
LocalLLaMA Home Server Final Boss: 14x RTX 3090 Build

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 1 comment
Petition to auto-delete anything that mentions Matt Shumer, "Reflection", or any link to his Twitter or any affiliated Twitter accounts (Sahil, etc)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 46 comments
Serving AI From The Basement — Part II: Unpacking SWE Agentic Framework, MoEs, Batch Inference, and More · Osman's Odyssey: Byte & Build

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 7 comments
Serving AI From The Basement - 192GB of VRAM Setup

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 72 comments
Serving AI From The Basement - 192GB of VRAM Setup

Posted by XMasterrrr@reddit | programming | View on Reddit | 10 comments
Serving AI From The Basement - Intro to My 192 GB VRAM Setup · Osman's Odyssey

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 1 comment