Iory1998
-
If You Want to Understand Why Llama Models Flopped, Zuck is the Cause!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 213 comments
-
Tutorial - How to Toggle On/Off the Thinking Mode Directly in LM Studio for Any Thinking Model
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 32 comments
-
Best Practices to Start with Vibe Coding? Best Local Apps for Agentic Vibe Coding?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 18 comments
-
My biggest Issue with the Gemma-4 Models is the Massive KV Cache!!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 162 comments
-
Where is DeepSeek R2?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 33 comments
-
A Reminder, Guys, Undervolt your GPUs Immediately. You will Significantly Decrease Wattage without Hitting Performance.
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 68 comments
-
Just Use System Prompt to Curtail Sycophancy!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 16 comments
-
Alibaba launches AI platform for enterprises as agent craze sweeps China
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Qwen3.5 Model Series - Thinking On/OFF: Does it Matter?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 39 comments
-
Do not Let the "Coder" in Qwen3-Coder-Next Fool You! It's the Smartest, General Purpose Model of its Size
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 202 comments
-
Has Anyone Successfully Run the New MiniCPM-o-4_5-gguf?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 14 comments
-
Unsloth Team: We Need to Talk!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 36 comments
-
MANUS - I Requested a Trial and got an Invitation 6 Hours Later!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 20 comments
-
Why can't Ollama just run GGUF models directly downloaded from HF?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 50 comments
-
Kimi-Linear-48B-A3B-Instruct-GGUF Support - Any news?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 37 comments
-
GPT-OSS is Another Example Why Companies Must Build a Strong Brand Name
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 419 comments
-
Flux.1 Quantization Quality: BNB nf4 vs GGUF-Q8 vs FP16
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 94 comments
-
Why doesn't Groq Sell its LPUs? By Extension, Why doesn't Google do that?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 51 comments
-
A Great Breakdown of the "Disney vs Midjourney" Lawsuit Case
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 12 comments
-
The Attention Hybrid MoE Architecture is the Future. Now, AI Labs Should Dedicate Resources to Improve Long Context Recall Capabilities.
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 60 comments
-
Any Mixtral-8x7B with Longer Context Window than the default 32K?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 9 comments
-
A Tribute to MetaAI and Stability AI - 2 Giants Who Brought us so Much Joy... And, 2025 is the Year they Die... So Sad!😢
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 59 comments
-
ComfyUI for LLMs: Making the Case for a Universal, Node-Based Backend for Local LLM Development
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 36 comments
-
What's the Status of GGUF quantization of Qwen3-Next-80B-A3B-Instruct?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 4 comments
-
Qwen3-30B-A3B-2507-Q4_K_L Is the First Local Model to Solve the North Pole Walk Puzzle
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 84 comments
-
Qwen3-Next-80B-GGUF, Any Update?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 17 comments
-
To The Qwen Team, Kindly Contribute to Qwen3-Next GGUF Support!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 128 comments
-
Are We Really Getting the Best of the Current Models? Discussion!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 20 comments
-
Why Do I Feel Poor Each Time I Decide to Buy a New GPU Even Though I Make More Money?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 122 comments
-
Round Up: Current Best Local Models under 40B for Code & Tool Calling, General Chatting, Vision, and Creative Story Writing.
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 30 comments
-
Why aren't there Any Gemma-3 Reasoning Models?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 39 comments
-
OpenAI wins $200 million U.S. defense contract!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 104 comments
-
KwaiCoder-AutoThink-preview is a Good Model for Creative Writing! Any Idea about Coding and Math? Your Thoughts?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 4 comments
-
Disney and Universal sue AI image company Midjourney for unlicensed use of Star Wars, The Simpsons and more
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 203 comments
-
I am Not Impressed by Llama-3.1 Supposedly Long Context Window
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 17 comments
-
Meta AI could have Just Released Small Variants for Llama-4 and Focus on Llama-5!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 25 comments
-
Finally, found her!❤️🌹
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 22 comments
-
Any Update on the HF's Open R1 Project?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 2 comments
-
This is the Reason why I am Still Debating whether to buy RTX5090!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 65 comments
-
Quadro RTX8000 vs RTX4090 vs RTX5090, which is better for Generative Models?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 12 comments
-
How fast is Threadripper 5995WX for Inference instead of a GPU?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 29 comments
-
Chatbot Arena's Leadership Board for T2I Makes no Sense!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Instead of the Needle-in-the-Haystack Test, Let's Test the Model's Reading Comprehension!!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 65 comments
-
An Interesting Watch: DeepSeek vs. Open AI - The State of AI w/ Emad Mostaque & Salim Ismail
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Guys, Use the LongWriter-llama3.1-8b instead of Llama3.1-8b!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 23 comments
-
I'd Rather have Llama-2 with 65K+ Context Size than Llama-3 or Gemma-2 with their Meager 8K!
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 43 comments
-
For Gemma-2-2b, You can Extend the Context Window to 32K+
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 17 comments
-
Using KV Cache, Do You Notice any Quality Drop?
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 19 comments
-
Until the RoPE Scaling is Fixed in GGUF for Llama-3.1 models, Just Use this Frequency (Tested up to 80K)
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 45 comments
-
A Model Trained Extensively on Math can Hinder its Reasoning Capabilities, and Here is an Example.
Posted by Iory1998@reddit | LocalLLaMA | View on Reddit | 29 comments