-
Kimi-K2 🤝 Anthropic | Blog Post by Justin Wong
Posted by LeveredRecap@reddit | LocalLLaMA | View on Reddit | 6 comments
-
Is a heavily quantised Q235b any better than Q32b?
Posted by Secure_Reflection409@reddit | LocalLLaMA | View on Reddit | 42 comments
-
Apple “will seriously consider” buying Mistral | Bloomberg - Mark Gurman
Posted by Nunki08@reddit | LocalLLaMA | View on Reddit | 200 comments
-
I want to hire 100k programmers and create the first tech giant startup
Posted by zeeza48@reddit | LocalLLaMA | View on Reddit | 10 comments
-
Meta’s New Superintelligence Lab Is Discussing Major A.I. Strategy Changes
Posted by showmeufos@reddit | LocalLLaMA | View on Reddit | 13 comments
-
I ditched all LLM frameworks and use only the OpenAI SDK for everything; I've started to love building AI applications this way.
Posted by dheetoo@reddit | LocalLLaMA | View on Reddit | 15 comments
-
Why do base models give gibberish and need further 'fine tuning'
Posted by QFGTrialByFire@reddit | LocalLLaMA | View on Reddit | 14 comments
-
Is real-time voice-to-voice still science fiction?
Posted by junior600@reddit | LocalLLaMA | View on Reddit | 11 comments
-
Kimi K2 1.8bit Unsloth Dynamic GGUFs
Posted by danielhanchen@reddit | LocalLLaMA | View on Reddit | 32 comments
-
Ollama, Why No Reka Flash, SmolLM3, GLM-4?
Posted by chibop1@reddit | LocalLLaMA | View on Reddit | 14 comments
-
Do you think an AI will achieve a gold medal in the 2025 International Math Olympiad (tomorrow)?
Posted by mathsTeacher82@reddit | LocalLLaMA | View on Reddit | 30 comments
-
After Kimi K2 Is Released: No Longer Just a ChatBot
Posted by nekofneko@reddit | LocalLLaMA | View on Reddit | 31 comments
-
Comparison of latest reasoning models on the most recent LeetCode questions (Qwen-32B vs Qwen-235B vs nvidia-OpenCodeReasoning-32B vs Hunyuan-A13B)
Posted by kyazoglu@reddit | LocalLLaMA | View on Reddit | 26 comments
-
If you limit context to 4k tokens, which models today beat Llama2-70B from 2 years ago?
Posted by EmPips@reddit | LocalLLaMA | View on Reddit | 10 comments
-
A mid range PC build for Dual GPU Local LLMs and SLMs.
Posted by iammhk@reddit | LocalLLaMA | View on Reddit | 5 comments
-
Can VRAM from 2 brands be combined?
Posted by tonyleungnl@reddit | LocalLLaMA | View on Reddit | 65 comments
-
Is the output of only the shared expert(s) in a MOE model coherent?
Posted by gofiend@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Benchmarking Qwen3 30B and 235B on dual RTX PRO 6000 Blackwell Workstation Edition
Posted by blackwell_tart@reddit | LocalLLaMA | View on Reddit | 45 comments
-
UTCP: A safer, scalable tool-calling alternative to MCP
Posted by juanviera23@reddit | LocalLLaMA | View on Reddit | 88 comments
-
Major Hugging Face announcement on July 24th
Posted by LightEt3rnaL@reddit | LocalLLaMA | View on Reddit | 20 comments
-
Best LLM for Educators?
Posted by Creative_Structure22@reddit | LocalLLaMA | View on Reddit | 5 comments
-
Real-time conversational AI running 100% locally in-browser on WebGPU
Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 132 comments
-
IndexTTS2, the most realistic and expressive text-to-speech model so far, has leaked their demos ahead of the official launch! And... wow!
Posted by pilkyton@reddit | LocalLLaMA | View on Reddit | 130 comments
-
Should I buy Tesla K80 for 70€ or Tesla M10 for 110€?
Posted by Similar-Republic149@reddit | LocalLLaMA | View on Reddit | 21 comments
-
Recorded a userflow for my vibecoding pet project - character selection, model setup, inline replies, and image generation
Posted by RIPT1D3_Z@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Multiple 5060 Ti's
Posted by snorixx@reddit | LocalLLaMA | View on Reddit | 28 comments
-
Friendly reminder that Grok 3 should be now open-sourced
Posted by Wrong_User_Logged@reddit | LocalLLaMA | View on Reddit | 194 comments
-
How to improve response times for multimodal requests?
Posted by coolahavoc@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Where local is lagging behind... Wish lists for the rest of 2025
Posted by nomorebuttsplz@reddit | LocalLLaMA | View on Reddit | 26 comments
-
Poor man's X79 motherboard ETH79-X5
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 19 comments
-
Training an LLM only on books from the 1800's - no modern bias
Posted by Remarkable-Trick-177@reddit | LocalLLaMA | View on Reddit | 168 comments
-
How to SFT a diffusion large language model?
Posted by ProfessionalGuess884@reddit | LocalLLaMA | View on Reddit | 4 comments
-
How I use Gemma 3 to help me reply to my texts
Posted by sean01-eth@reddit | LocalLLaMA | View on Reddit | 28 comments
-
Kimi-K2 is a DeepSeek V3 with more experts
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 32 comments
-
Runpod, Hugging Face, or what for super-simple uncensored LLM-in-the-cloud setup?
Posted by goldenapple212@reddit | LocalLLaMA | View on Reddit | 4 comments
-
Diffusion model support in llama.cpp.
Posted by fallingdowndizzyvr@reddit | LocalLLaMA | View on Reddit | 12 comments
-
we have to delay it
Posted by ILoveMy2Balls@reddit | LocalLLaMA | View on Reddit | 210 comments
-
Annoyed with LibreChat
Posted by Charming_Support726@reddit | LocalLLaMA | View on Reddit | 11 comments
-
Ollama retaining history?
Posted by DimensionEnergy@reddit | LocalLLaMA | View on Reddit | 10 comments
-
What hardware is needed for 10 developers running Devstral in parallel mode?
Posted by 3dom@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Stanford's CS336 2025 (Language Modeling from Scratch) is now available on YouTube
Posted by realmvp77@reddit | LocalLLaMA | View on Reddit | 26 comments
-
SmolLM3 has day-0 support in MistralRS!
Posted by EricBuehler@reddit | LocalLLaMA | View on Reddit | 5 comments
-
Kimi K2 is funny and great
Posted by theskilled42@reddit | LocalLLaMA | View on Reddit | 67 comments
-
llama.cpp k-quants
Posted by thomas999999@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Synthetic nonsense data improves llama.cpp Quantization accuracy
Posted by kindacognizant@reddit | LocalLLaMA | View on Reddit | 1 comment
-
GGUFs quants can punch above their weights now
Posted by Chromix_@reddit | LocalLLaMA | View on Reddit | 1 comment
-
There is no proper explanation of GGUF quantization methods
Posted by Free_Significance267@reddit | LocalLLaMA | View on Reddit | 14 comments
-
Overview of GGUF quantization methods
Posted by he29@reddit | LocalLLaMA | View on Reddit | 12 comments
-
LM Studio can't use my GPU as main
Posted by Zinxdia@reddit | LocalLLaMA | View on Reddit | 4 comments
-
Any Actual alternative to gpt-4o or claude?
Posted by Dragonacious@reddit | LocalLLaMA | View on Reddit | 43 comments