-
[Ministral 3] Add ministral 3 - Pull Request #42498 · huggingface/transformers
Posted by bratao@reddit | LocalLLaMA | View on Reddit | 27 comments
-
Raw Chain-of-Thought from Gemini 3 Pro. It hallucinates, corrects itself, and eventually crashes.
Posted by Numerous-Campaign844@reddit | LocalLLaMA | View on Reddit | 9 comments
-
$900 for 192GB RAM on Oct 23rd, now costs over $3k
Posted by Hoppss@reddit | LocalLLaMA | View on Reddit | 201 comments
-
What direction do you think the enshittification (platform decay) of LLM services is likely to take?
Posted by ThatOneGuy4321@reddit | LocalLLaMA | View on Reddit | 35 comments
-
Looking for a cheaper GPU platform for multimodal AI work
Posted by AgentSad427@reddit | LocalLLaMA | View on Reddit | 7 comments
-
Any idea when RAM prices will be “normal” again?
Posted by Porespellar@reddit | LocalLLaMA | View on Reddit | 267 comments
-
Winter LLM
Posted by aziham@reddit | LocalLLaMA | View on Reddit | 9 comments
-
Optimising NVIDIA’s DGX Spark (Grace + Blackwell) – 1.5× PyTorch speedup with custom build
Posted by guigsss@reddit | LocalLLaMA | View on Reddit | 30 comments
-
Upcoming vLLM Mistral Large 3 support
Posted by brown2green@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Gemini 3 API Tutorial: Automating Data Analysis With Gemini 3 Pro and LangGraph
Posted by kingabzpro@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Microsoft silently releases OmniParser, a tool to convert screenshots into structured and easy-to-understand elements for Vision Agents
Posted by umarmnaq@reddit | LocalLLaMA | View on Reddit | 84 comments
-
Why most models on Hugging Face cannot be run on Ollama?
Posted by KaKi_87@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Ollamacode - Local AI assistant that can create, run and understand your codebase.
Posted by Loud-Consideration-2@reddit | LocalLLaMA | View on Reddit | 11 comments
-
AMD 395+ and NVIDIA GPU
Posted by EntropyNegotiator@reddit | LocalLLaMA | View on Reddit | 18 comments
-
Announcing Llamazing: Your Ollama and ComfyUI server on iOS!
Posted by mandrak4@reddit | LocalLLaMA | View on Reddit | 6 comments
-
Gemma3 27 heretic, lower divergence than mlabonne/gemma3
Posted by coder3101@reddit | LocalLLaMA | View on Reddit | 18 comments
-
Why it's getting worse for everyone: The recent influx of AI psychosis posts and "Stop LARPing"
Posted by Chromix_@reddit | LocalLLaMA | View on Reddit | 144 comments
-
Introducing Codex Kaioken – the Codex CLI fork with subagents, plan mode UX, indexing and manual checkpoints and restoring.
Posted by No-Point1424@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Trained a chess LLM locally that beats GPT-5 (technically)
Posted by KingGongzilla@reddit | LocalLLaMA | View on Reddit | 54 comments
-
Sentiment Analysis Model Guidance
Posted by Reno911-07078@reddit | LocalLLaMA | View on Reddit | 8 comments
-
(Partly) Open Video Overview – Generate narrated videos from text with AI (requires Gemini API)
Posted by arbayi@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Recommendations for summarization and structured data extraction
Posted by cachophonic@reddit | LocalLLaMA | View on Reddit | 11 comments
-
Build a local AI server with backup
Posted by carcaliguy@reddit | LocalLLaMA | View on Reddit | 2 comments
-
What’s your biggest challenge when working with AI workflows or agents?
Posted by Thin-Factor-6457@reddit | LocalLLaMA | View on Reddit | 0 comments
-
I mapped how language models decide when a pile of sand becomes a “heap”
Posted by Specialist_Bad_4465@reddit | LocalLLaMA | View on Reddit | 37 comments
-
Yet another reason to stick with local models
Posted by nekofneko@reddit | LocalLLaMA | View on Reddit | 87 comments
-
Nvidia cards using too much VRAM?
Posted by Maxumilian@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Optimizing Token Generation in llama.cpp's CUDA Backend
Posted by am17an@reddit | LocalLLaMA | View on Reddit | 29 comments
-
ArliAI/gpt-oss-120b-Derestricted · Hugging Face
Posted by Arli_AI@reddit | LocalLLaMA | View on Reddit | 54 comments
-
Pavel Durov introduces Cocoon, a decentralized AI inference platform on TON
Posted by No_Palpitation7740@reddit | LocalLLaMA | View on Reddit | 12 comments
-
DGX Spark for $2,899
Posted by TokenRingAI@reddit | LocalLLaMA | View on Reddit | 6 comments
-
Why the Strix Halo is a poor purchase for most people
Posted by NeverEnPassant@reddit | LocalLLaMA | View on Reddit | 323 comments
-
Looking for High-Quality Open-Source Local TTS That’s Faster Than IndexTTS2
Posted by TomNaughtyy@reddit | LocalLLaMA | View on Reddit | 11 comments
-
Is there any free AI website that I can feed my pictures or PDF file and it generates a CSV flashcards file based on that?
Posted by FatFigFresh@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Is there music AI without Python?
Posted by iwakawa2173@reddit | LocalLLaMA | View on Reddit | 25 comments
-
More of Silicon Valley is building on free Chinese AI
Posted by buppermint@reddit | LocalLLaMA | View on Reddit | 27 comments
-
How is everyone doing DPO on Gemma 3 using Unsloth/TRL?
Posted by CartographerFun4221@reddit | LocalLLaMA | View on Reddit | 5 comments
-
I spent 2 years building privacy-first local AI. My conclusion: Ingestion is the bottleneck, not the Model. (Showcase: Ollama + Docling RAG Kit)
Posted by ChapterEquivalent188@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Supertonic WebGPU: blazingly fast text-to-speech running 100% locally in your browser.
Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 10 comments
-
Anyone else using small “prompt modules” with local models? Here are a few I’ve been testing.
Posted by Professional-Rest138@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Heretic: Fully automatic censorship removal for language models
Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 306 comments
-
What’s your biggest headache when running autonomous agents locally?
Posted by Substantial_Step_351@reddit | LocalLLaMA | View on Reddit | 9 comments
-
What’s the biggest headache you’ve run into with autonomous agents so far?
Posted by AgentAiLeader@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Questions about parameter size & quantization
Posted by LeastExperience1579@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Renting Out DGX Spark
Posted by jsfour@reddit | LocalLLaMA | View on Reddit | 4 comments
-
If I want to use a small model to "decode" scanned pdf with graphs and tables etc to feed it to a large non multimodal model. What is my best option?
Posted by Windowsideplant@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Insert pauses into text file for kokoro
Posted by dts-five@reddit | LocalLLaMA | View on Reddit | 25 comments
-
Anyone fine-tuned facebookresearch/omnilingual-asr? Looking for guidance or codebase
Posted by Outside_Solid5371@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Deepseek new version model now in their website!
Posted by Famous-Associate-436@reddit | LocalLLaMA | View on Reddit | 3 comments
-
GPU integrated to laptop - Mistake?
Posted by Virtual_Attitude2025@reddit | LocalLLaMA | View on Reddit | 5 comments