XMasterrrr

AMA Announcement: Nous Research, The Opensource Lab Behind Hermes Agent (Wednesday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 18 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for Wednesday's guests, **The Nous Research Team!** **Kicking things off Wednesday, April 29th, 8 AM–11 AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

AMA Announcement: StepFun AI, The Opensource Lab Behind Step-3.5-Flash Model (Thursday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 14 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for Thursday's guests: **The StepFun Team!** **Kicking things off Thursday, Feb. 19th, 8 AM–11 AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

AMA Announcement: MiniMax, The Opensource Lab Behind MiniMax-M2.5 SoTA Model (Friday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 28 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for Friday's guests: **The Core Team of MiniMax Lab and The Lab’s Founder!** **Kicking things off Friday, Feb. 13th, 8 AM–11 AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

AMA Announcement: Moonshot AI, The Opensource Frontier Lab Behind Kimi K2.5 SoTA Model (Wednesday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 4 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for Wednesday's guests, **The Moonshot AI Lab Team!** **Kicking things off Wednesday, Jan. 28th, 8 AM–11 AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

AMA Announcement: Z.ai, The Opensource Lab Behind GLM-4.7 (Tuesday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 6 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for tomorrow's guests, **The Z.ai Lab Team!** **Kicking things off Tuesday, Dec. 23rd, 8 AM–11 AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

AMA Announcement: MiniMax, The Opensource Lab Behind MiniMax-M2 + Gifts to Our Community (Wednesday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 9 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We’re excited for Wednesday’s guests, **MiniMax-M2 Team!** They’ll also be **gifting MiniMax‑M2 Max Coding Plans** to the top 10 most upvoted AMA questions or comments, plus a couple of extra winners chosen by the AMA hosts. **Kicking things off Wednesday, Nov. 19th, 8 AM–11 AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model

Posted by nekofneko@reddit | LocalLLaMA | View on Reddit | 375 comments

[-]

XMasterrrr@reddit

There has been this rumor that [Kimi K2 Thinking cost only $4.6M to train](https://x.com/Yuchenj_UW/status/1986858369077113277?t=4VOOFDuvNobjKIIAal9qIg&s=19), how accurate is that figure?

AMA Announcement: Moonshot AI, The Opensource Frontier Lab Behind Kimi K2 Thinking SoTA Model (Monday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 47 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for Monday's guests, **The Moonshot AI Lab Team!** **Kicking things off Monday, Nov. 10th, 8 AM–11 AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

AMA Announcement: Prime Intellect — The Open‑Source Distributed Training Lab (Thu, Oct 2 • 10 AM – 1 PM PDT)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 3 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for tomorrow's guests, **The Prime Intellect Team!** **Kicking things off tomorrow (Thursday, Oct. 2nd) 10 AM–1 PM PDT** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

Our 4th AMA: The LMStudio Team! (Thursday, 11 AM-1 PM PDT)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 4 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for tomorrow's guests, **The LMStudio Team!** **Kicking things off tomorrow (Thursday, Sept. 10th) 11 AM–1 PM PDT** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

Our 3rd AMA: Unsloth Team, Creators of the lightning-fast Unsloth fine-tuning library! (Wednesday, 10 AM-1 PM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 28 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for tomorrow's guests, **The Unsloth Team!** They're the folks behind the blazing-fast Unsloth fine-tuning library and a slew of community notebooks. **Kicking things off tomorrow (Wednesday, Sept. 10th) 10 AM–1 PM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

Our 2nd AMA: Hugging Face Science Team, Creators of SmolLM, SmolVLM, and more! (Tomorrow, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 10 comments

[-]

XMasterrrr@reddit (OP)

Thank you :)

Our 2nd AMA: Hugging Face Science Team, Creators of SmolLM, SmolVLM, and more! (Tomorrow, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 10 comments

[-]

XMasterrrr@reddit (OP)

Thank you for accepting our invitation and having your team dedicate the time for our community, Elie!

Our 2nd AMA: Hugging Face Science Team, Creators of SmolLM, SmolVLM, and more! (Tomorrow, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 10 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for tomorrow's guests, **The Hugging Face Science Team!** They're the creators of SmolLM, SmolVLM, Fineweb, and more! **Kicking things off tomorrow (Thursday, Sept. 3rd) 8AM–11AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

Launching Our New AMA Series With Z.AI, Creators of GLM (Tomorrow, 9AM-12PM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 29 comments

[-]

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 Ahmad here, one of your new mods. We're excited to finally roll out an **AMA series** we've been cooking up behind the scenes. Some of the names lined up include:

* **Z.AI**
* **Hugging Face**
* **Unsloth**
* **LMStudio**
* **Prime Intellect**

We're thrilled to bring these conversations to the community and can't wait for your participation.

**Kicking things off tomorrow (Thursday 28th) from 9AM–12PM PST with Z.AI!**

⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

Qwen 2.5 (7B/14B/32B) Finetunes Outperforming Opus 4 & Sonnet 4/3.5 on Out-of-Distribution Tasks with RL --- Code, Weights, Data, and Paper Released

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 29 comments

[-]

XMasterrrr@reddit (OP)

Thank you!

Qwen 2.5 (7B/14B/32B) Finetunes Outperforming Opus 4 & Sonnet 4/3.5 on Out-of-Distribution Tasks with RL --- Code, Weights, Data, and Paper Released

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 29 comments

[-]

XMasterrrr@reddit (OP)

Doesn't change the fact that a small and specialized model is not only going head-to-head but outperforming SoTA frontier models. I should have said `Task` instead of `Tasks`, but in general this formula also generalizes, so it is true if you do the work.

Qwen 2.5 (7B/14B/32B) Finetunes Outperforming Opus 4 & Sonnet 4/3.5 on Out-of-Distribution Tasks with RL --- Code, Weights, Data, and Paper Released

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 29 comments

[-]

XMasterrrr@reddit (OP)

**qqWen: Fully Open-Source Models for Q Financial Programming Language (Code, Weights, Data, Report)**

Open-source project for finetuning LLMs (pretraining, SFT, RL) on the Q financial language. They’re sharing everything—code, model weights, training data, and a detailed technical report. Model sizes: 1.5B, 3B, 7B, 14B, and 32B.

Links:

* Technical Report: [https://arxiv.org/abs/2508.06813](https://arxiv.org/abs/2508.06813)
* Models + Data: [https://huggingface.co/collections/morganstanley/qqwen-series-688e4266bc727e7a3143aacf](https://huggingface.co/collections/morganstanley/qqwen-series-688e4266bc727e7a3143aacf)
* Code: [https://github.com/morganstanley/MSML/tree/main/projects/Fullstack\_LLM\_Finetuning\_Q](https://github.com/morganstanley/MSML/tree/main/projects/Fullstack_LLM_Finetuning_Q)

Source: [@brendanh0gan](https://x.com/brendanh0gan/status/1955641113693561071) on X/Twitter

DFLoat11 Quantization for Qwen-Image Drops – Run It on 17GB VRAM with CPU Offloading!

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 18 comments

[-]

XMasterrrr@reddit (OP)

In short, if you upload a transparent PNG file, you can tell it to generate anything, since the canvas is empty. That's the hack around this. I have it implemented with a better UX, but I still haven't gotten around to pushing it to the public repo.

DFLoat11 Quantization for Qwen-Image Drops – Run It on 17GB VRAM with CPU Offloading!

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 18 comments

[-]

XMasterrrr@reddit (OP)

So, and I had this implemented in a private repo, I now have text2img using the Flux model by generating an empty canvas (transparent PNG) and having a "system prompt" that instructs it to generate what's being requested on it. Now, with this model, I have to think about the different workflows.
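For anyone who wants to try the empty-canvas trick, here is a minimal sketch of generating a fully transparent PNG using only the Python standard library. The function name `transparent_png` and the 1024x1024 size in the usage note are assumptions for illustration; this is not the code from the app, and the "system prompt" step is not shown.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: 4-byte length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def transparent_png(width: int, height: int) -> bytes:
    """Return the bytes of a fully transparent RGBA PNG (hypothetical helper)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # IHDR: width, height, bit depth 8, color type 6 (RGBA),
    # then compression, filter, and interlace methods all set to 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    # Each scanline is a filter byte (0 = none) followed by 4 zero bytes per pixel,
    # so every pixel has alpha 0.
    row = b"\x00" + b"\x00" * (4 * width)
    idat = zlib.compress(row * height)
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", idat)
        + _chunk(b"IEND", b"")
    )
```

Writing the result to disk is one line, e.g. `open("canvas.png", "wb").write(transparent_png(1024, 1024))`, and the file can then be uploaded as the input image.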

DFLoat11 Quantization for Qwen-Image Drops – Run It on 17GB VRAM with CPU Offloading!

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 18 comments

[-]

XMasterrrr@reddit (OP)

I plan to implement it very soon in my image gen app, which I posted here last month: https://github.com/TheAhmadOsman/4o-ghibli-at-home

I have also added a bunch of new features and some cool changes since I last pushed to the public repo; hopefully it'll all be there before the weekend!

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

It should be all good now, migrated to `uv` completely. If you have time to test it that'd be appreciated.

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

Thank you :)

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

I'll try my best to have it supported soon, hopefully before the end of this long weekend

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

On the roadmap

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

I am not aware of any other that is as performant.

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

I have it on the roadmap to add hardware auto-detect and decide which GGUF to use based on that.
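To give a rough idea of what that could look like, here is a minimal sketch, assuming detection via `nvidia-smi` and a made-up VRAM-to-quantization table. The names `detect_vram_gib`, `pick_quant`, and `QUANT_TIERS` are hypothetical, and the thresholds are illustrative placeholders rather than measured requirements for any model.

```python
import shutil
import subprocess
from typing import Optional

# Hypothetical table: (minimum VRAM in GiB, GGUF quantization label), largest first.
# These thresholds are placeholders for illustration, not measured requirements.
QUANT_TIERS = [(24, "Q8_0"), (16, "Q6_K"), (12, "Q4_K_M"), (8, "Q2_K")]


def detect_vram_gib() -> Optional[float]:
    """Return total VRAM of the first NVIDIA GPU in GiB, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
        lines = out.strip().splitlines()
        # nvidia-smi reports memory.total in MiB when units are suppressed.
        return float(lines[0]) / 1024 if lines else None
    except (subprocess.SubprocessError, OSError, ValueError):
        return None


def pick_quant(vram_gib: Optional[float]) -> Optional[str]:
    """Pick the first (highest-quality) tier that fits, or None if nothing fits."""
    if vram_gib is None:
        return None
    for min_gib, label in QUANT_TIERS:
        if vram_gib >= min_gib:
            return label
    return None
```

Usage would be something like `pick_quant(detect_vram_gib())`, with a `None` result meaning the app should fall back to a default or ask the user.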

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

Thank you! It's on the roadmap

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

Working on adding better hardware detection and picking up proper GGUFs based on that.

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

Unfortunately only Nvidia GPUs are supported at the moment. I'll try to have Apple Silicon on the roadmap.

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

Yes! I have it in the readme that you need to either log in with the huggingface-cli tool or grab a token and set it in the .env. You probably also need to request access to the model itself on its Hugging Face model page, which is usually granted immediately: https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev
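For the token route, here is a minimal sketch of reading a token from the environment or from a `.env` file using only the standard library. The helper name `read_env_token` is hypothetical, and the key `HF_TOKEN` is an assumption (it is the variable the Hugging Face libraries read); check the readme for the exact variable name the app expects.

```python
import os
from pathlib import Path
from typing import Optional


def read_env_token(env_path: str = ".env", key: str = "HF_TOKEN") -> Optional[str]:
    """Return the token from the environment, else from a KEY=value line in env_path."""
    value = os.environ.get(key)
    if value:
        return value
    path = Path(env_path)
    if not path.is_file():
        return None
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, val = line.partition("=")
        if name.strip() == key:
            # Drop surrounding whitespace and optional quotes around the value.
            return val.strip().strip("\"'") or None
    return None
```

If this returns `None`, neither the environment nor the `.env` file has the token set, which is the usual cause of an access error when downloading a gated model.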

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

One second

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

Thank you 😁

I Built My Wife a Simple Web App for Image Editing Using Flux Kontext—Now It’s Open Source

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 77 comments

[-]

XMasterrrr@reddit (OP)

Link to GH Repo: [https://github.com/TheAhmadOsman/4o-ghibli-at-home](https://github.com/TheAhmadOsman/4o-ghibli-at-home)

So You Want to Learn LLMs? Here's the Roadmap

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 1 comments

[-]

XMasterrrr@reddit (OP)

Hey friends, I've been neck-deep in AI and LLMs for a while now, and I kept running into the same problem: every learning resource out there for LLMs either tries to teach you all of deep learning from scratch, or throws you into a sea of random “awesome-LLM” repos, hoping you can connect the dots yourself.

So, I wrote up the roadmap I wish existed a year ago: how to actually learn LLMs and build real things, with none of the bloat. It's geared towards folks with a CS (or practical programming) background who want to skip the endless ML prerequisites and get their hands dirty.

The approach:

- Concepts first, then phases, then resources/tools
- Each phase has concrete projects (build an autograd engine, write a mini-GPT, fine-tune with LoRA, etc)
- The goal is to get you actually building and shipping, not just watching lectures

Hope you guys find it useful.

-Ahmad

Subreddit back in business

Posted by HOLUPREDICTIONS@reddit | LocalLLaMA | View on Reddit | 258 comments

[-]

XMasterrrr@reddit

Hey u/HOLUPREDICTIONS, I sent you a PM as I would like to help with moderation.

The Scariest Thing In LLMs/AI Isn't the Models or the Math... It's the Names.

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 23 comments

[-]

XMasterrrr@reddit (OP)

[Dynamic Tensor Rematerialization](https://arxiv.org/abs/2006.09616). No idea what it's about, and too scared to look 😂

The Scariest Thing In LLMs/AI Isn't the Models or the Math... It's the Names.

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 23 comments

[-]

XMasterrrr@reddit (OP)

Guys, I don't understand the downvotes. I literally copied the entire tweet over here so nobody has to click on anything 😅 I am also sharing my finding that there is an active research community over there so that people know to keep their eyes open. I am not advocating for a platform, but rather sharing something I think is genuinely helpful to the collective knowledge of the members of this community.

The Scariest Thing In LLMs/AI Isn't the Models or the Math... It's the Names.

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 23 comments

[-]

XMasterrrr@reddit (OP)

???

The Scariest Thing In LLMs/AI Isn't the Models or the Math... It's the Names.

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 23 comments

[-]

XMasterrrr@reddit (OP)

Hey guys, I haven't posted here in a while, I've been a lot more active over on X, especially since the LLM research scene is much more alive there. Just wanted to cross-post this here as well. I’m the original author of this on [X](https://x.com/TheAhmadOsman/status/1922336545719107759).

> the scariest thing in llms/ai isn't the models or the math
> it's the names
>
> > kv cache prefill strategy
> > multi-head attention with rotary position embeddings
> > fused CUDA kernel for dynamic tensor rematerialization
> > nucleus sampling with temperature scaling and repetition penalty
> > flash attention v2 with block-sparse operations, causal masking, and warp-level primitives
>
> bro they sound like boss fights frfr

Is grok 3 really that good?

Posted by giantdickinmyface@reddit | LocalLLaMA | View on Reddit | 8 comments

[-]

XMasterrrr@reddit

Yes. People will deny it because of what's his name but Grok 3 is the current SOTA imho. Check my profile in case you're wondering about my credentials to make such a statement.

TraceBack: A Novel Reverse Reasoning Model for Better and Cheaper Scaling of Synthetic Reasoning Generation

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 17 comments

[-]

XMasterrrr@reddit (OP)

I am posting the below on behalf of u/secemp9, the author of the model, as his Reddit account was only recently created and he could not post it himself.

Can I Run LLMs with Two Different Model GPUs? (4090 + 3090)

Posted by gentritb@reddit | LocalLLaMA | View on Reddit | 3 comments

[-]

XMasterrrr@reddit

Yes, you can. In fact, before I built my [AI Server](https://x.com/TheAhmadOsman/status/1869841392924762168) I ran an RTX 4090 + 3090. `llama.cpp` will work just fine, and `vLLM` is an option too, but w/ 48GB of VRAM I'd recommend using `ExLlamaV2`. I have recently written a [blog post on Inference Engines](https://ahmadosman.com/blog/do-not-use-llama-cpp-or-ollama-on-multi-gpus-setups-use-vllm-or-exllamav2/) that I think might be insightful. Let me know if you have any questions.

KDE Plasma 6.3.1, Bugfix Release for February

Posted by gabriel_3@reddit | linux | View on Reddit | 20 comments

[-]

XMasterrrr@reddit

This update made my system completely broken and unusable. RTX 4090. A downgrade fixed things.

Introduction to CUDA Programming for Python Developers

Posted by Brilliant-Day2748@reddit | LocalLLaMA | View on Reddit | 3 comments

[-]

XMasterrrr@reddit

I enjoyed reading this. Thanks for sharing!

o3-mini won the poll! We did it guys!

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 245 comments

[-]

XMasterrrr@reddit (OP)

I often over-explain and explicitly state everything, even multiple times, especially when connections between clauses exist, because I don't know what my audience will and won't understand. Smarter people hate it, and it makes me sound repetitive, but I am in fact just worrying about the lower end of the tail on the other side... Having thick skin when it comes to internet points is important...

o3-mini won the poll! We did it guys!

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 245 comments

[-]

XMasterrrr@reddit (OP)

I really think this dude was trying to make fun of those people by quoting them but the formatting got screwed.