Iory1998

If You Want to Understand Why Llama Models Flopped, Zuck is the Cause!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 213 comments
Tutorial - How to Toggle On/Off the Thinking Mode Directly in LM Studio for Any Thinking Model

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 32 comments
Best Practices to Start with Vibe Coding? Best Local Apps for Agentic Vibe Coding?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 18 comments
My biggest Issue with the Gemma-4 Models is the Massive KV Cache!!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 162 comments
Where is DeepSeek R2?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 33 comments
A Reminder, Guys, Undervolt your GPUs Immediately. You will Significantly Decrease Wattage without Hurting Performance.

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 68 comments
Just Use System Prompt to Curtail Sycophancy!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 16 comments
Alibaba launches AI platform for enterprises as agent craze sweeps China

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 1 comment
Qwen3.5 Model Series - Thinking On/OFF: Does it Matter?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 39 comments
Do not Let the "Coder" in Qwen3-Coder-Next Fool You! It's the Smartest, General Purpose Model of its Size

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 202 comments
Has Anyone Successfully Run the New MiniCPM-o-4_5-gguf?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 14 comments
Unsloth Team: We Need to Talk!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 36 comments
MANUS - I Requested a Trial and got an Invitation 6 Hours Later!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 20 comments
Why can't Ollama just run GGUF models directly downloaded from HF?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 50 comments
Kimi-Linear-48B-A3B-Instruct-GGUF Support - Any news?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 37 comments
GPT-OSS is Another Example of Why Companies Must Build a Strong Brand Name

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 419 comments
Flux.1 Quantization Quality: BNB nf4 vs GGUF-Q8 vs FP16

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 94 comments
Why doesn't Groq Sell its LPUs? By Extension, Why doesn't Google do that?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 51 comments
A Great Breakdown of the "Disney vs Midjourney" Lawsuit Case

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 12 comments
The Attention Hybrid MoE Architecture is the Future. Now, AI Labs Should Dedicate Resources to Improve Long Context Recall Capabilities.

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 60 comments
Any Mixtral-8x7B with Longer Context Window than the default 32K?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 9 comments
A Tribute to MetaAI and Stability AI - 2 Giants Who Brought us so Much Joy... And, 2025 is the Year they Die... So Sad!😢

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 59 comments
ComfyUI for LLMs: Making the Case for a Universal, Node-Based Backend for Local LLM Development

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 36 comments
What's the Status of GGUF quantization of Qwen3-Next-80B-A3B-Instruct?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 4 comments
Qwen3-30B-A3B-2507-Q4_K_L Is the First Local Model to Solve the North Pole Walk Puzzle

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 84 comments
Qwen3-Next-80B-GGUF, Any Update?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 17 comments
To The Qwen Team, Kindly Contribute to Qwen3-Next GGUF Support!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 128 comments
Are We Really Getting the Best of the Current Models? Discussion!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 20 comments
Why Do I Feel Poor Each Time I Decide to Buy a New GPU Even Though I Make More Money?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 122 comments
Round Up: Current Best Local Models under 40B for Code & Tool Calling, General Chatting, Vision, and Creative Story Writing.

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 30 comments
Why aren't there Any Gemma-3 Reasoning Models?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 39 comments
OpenAI wins $200 million U.S. defense contract!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 104 comments
KwaiCoder-AutoThink-preview is a Good Model for Creative Writing! Any Idea about Coding and Math? Your Thoughts?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 4 comments
Disney and Universal sue AI image company Midjourney for unlicensed use of Star Wars, The Simpsons and more

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 203 comments
I am Not Impressed by Llama-3.1's Supposedly Long Context Window

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 17 comments
Meta AI could have Just Released Small Variants for Llama-4 and Focused on Llama-5!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 25 comments
Finally, found her!❤️🌹

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 22 comments
Any Update on the HF's Open R1 Project?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 2 comments
This is the Reason why I am Still Debating whether to buy RTX5090!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 65 comments
Quadro RTX8000 vs RTX4090 vs RTX5090, which is better for Generative Models?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 12 comments
How fast is Threadripper 5995WX for Inference instead of a GPU?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 29 comments
Chatbot Arena's Leaderboard for T2I Makes no Sense!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 0 comments
Instead of the Needle-in-the-Haystack Test, Let's Test the Model's Reading Comprehension!!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 65 comments
An Interesting Watch: DeepSeek vs. Open AI - The State of AI w/ Emad Mostaque & Salim Ismail

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 0 comments
Guys, Use the LongWriter-llama3.1-8b instead of Llama3.1-8b!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 23 comments
I'd Rather have Llama-2 with 65K+ Context Size than Llama-3 or Gemma-2 with their Meager 8K!

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 43 comments
For Gemma-2-2b, You can Extend the Context Window to 32K+

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 17 comments
Using KV Cache, Do You Notice any Quality Drop?

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 19 comments
Until the RoPE Scaling is Fixed in GGUF for Llama-3.1 models, Just Use this Frequency (Tested up to 80K)

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 45 comments
A Model Trained Extensively on Math can Hinder its Reasoning Capabilities, and Here is an Example.

Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 29 comments