-
Heretic: Fully automatic censorship removal for language models
Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 271 comments
-
Gemini 3 has launched
Posted by Several-Republic-609@reddit | LocalLLaMA | View on Reddit | 220 comments
-
🚀 NVIDIA DGX Spark vs. Alternatives: Escaping the RTX 3060 (6GB) for Medical LLM Research
Posted by Muted-Examination278@reddit | LocalLLaMA | View on Reddit | 8 comments
-
Our AI assistant keeps getting jailbroken and it’s becoming a security nightmare
Posted by Comfortable_Clue5430@reddit | LocalLLaMA | View on Reddit | 7 comments
-
Mistral removing a ton of old models from the API (preparing for a new launch?)
Posted by mpasila@reddit | LocalLLaMA | View on Reddit | 20 comments
-
Is Qwen3-VL 235B supposed to be this slow?
Posted by shapic@reddit | LocalLLaMA | View on Reddit | 11 comments
-
Are you seeking final end of life?
Posted by pani343@reddit | LocalLLaMA | View on Reddit | 1 comment
-
ollama's enshittification has begun! Open source is not their priority anymore, because they're YC-backed and must become profitable for VCs... Meanwhile llama.cpp remains free, open source, and easier than ever to run! No more ollama
Posted by nderstand2grow@reddit | LocalLLaMA | View on Reddit | 183 comments
-
Large language models show signs of introspection
Posted by bigzyg33k@reddit | LocalLLaMA | View on Reddit | 10 comments
-
Gemini 3 Pro vs Kimi K2 Thinking
Posted by SlowFail2433@reddit | LocalLLaMA | View on Reddit | 58 comments
-
MMaDA-Parallel: Parallel Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation
Posted by nnxnnx@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Make your AI talk like a caveman and decrease token usage
Posted by RegionCareful7282@reddit | LocalLLaMA | View on Reddit | 117 comments
-
How to transcribe a video and summarize it?
Posted by bullerwins@reddit | LocalLLaMA | View on Reddit | 29 comments
-
How do I actually use the Ryzen AI chip… for anything?
Posted by Muted_Head_1636@reddit | LocalLLaMA | View on Reddit | 1 comment
-
GLM 4.6 on 128 GB RAM with llama.cpp
Posted by ilintar@reddit | LocalLLaMA | View on Reddit | 18 comments
-
Epstein emails graph relationship extraction and visualizer
Posted by madmax_br5@reddit | LocalLLaMA | View on Reddit | 5 comments
-
I built an "Antigravity Workspace" for Gemini 3. It forces the AI to follow "Artifact-First" protocols automatically. (Open Source Template)
Posted by Direct-Employ-3290@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Nvidia Parakeet-Realtime-EOU-120m-v1
Posted by nuclearbananana@reddit | LocalLLaMA | View on Reddit | 10 comments
-
Anyone on ARM?
Posted by No_Afternoon_4260@reddit | LocalLLaMA | View on Reddit | 0 comments
-
I replicated Anthropic’s "Introspection" paper on DeepSeek-7B. It works.
Posted by Specialist_Bad_4465@reddit | LocalLLaMA | View on Reddit | 22 comments
-
Cerebras REAPs: MiniMax-M2 (25, 30, 40%), Kimi-Linear 30%, more on the way!
Posted by ilzrvch@reddit | LocalLLaMA | View on Reddit | 22 comments
-
If the bubble bursts, what's gonna happen to all those chips?
Posted by freecodeio@reddit | LocalLLaMA | View on Reddit | 156 comments
-
[D] What's the one thing you wish you'd known before putting an LLM app in production?
Posted by Bbamf10@reddit | LocalLLaMA | View on Reddit | 4 comments
-
The world’s fastest open-source TTS: Supertonic
Posted by ANLGBOY@reddit | LocalLLaMA | View on Reddit | 31 comments
-
What Size of LLM Can 4x RTX 5090 Handle? (96GB VRAM)
Posted by Affectionate_Arm725@reddit | LocalLLaMA | View on Reddit | 14 comments
-
Any H200 owners out there?
Posted by itsthewolfe@reddit | LocalLLaMA | View on Reddit | 5 comments
-
🦙💥 Building llama.cpp with Vulkan backend on Android (Termux ARM64)
Posted by Brahmadeo@reddit | LocalLLaMA | View on Reddit | 20 comments
-
Kimi K2 is the best clock AI
Posted by InternationalAsk1490@reddit | LocalLLaMA | View on Reddit | 82 comments
-
Blogs to Follow
Posted by TopNo6605@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Do you think Gemini 3 uses MoR or Titans?
Posted by SrijSriv211@reddit | LocalLLaMA | View on Reddit | 4 comments
-
What are the most unique models under 15B that you've encountered?
Posted by ResponsibleTruck4717@reddit | LocalLLaMA | View on Reddit | 7 comments
-
Most people in this LocalLLaMA sub are hypocritical.
Posted by Ok_houlin@reddit | LocalLLaMA | View on Reddit | 31 comments
-
We just Fine-Tuned a Japanese Manga OCR Model with PaddleOCR-VL!
Posted by erinr1122@reddit | LocalLLaMA | View on Reddit | 22 comments
-
How likely do you think an Ashley Madison-style widespread breach exposing users and conversations is in the next few years?
Posted by Antique-Account-2359@reddit | LocalLLaMA | View on Reddit | 38 comments
-
Google Antigravity is a Cursor clone
Posted by Terminator857@reddit | LocalLLaMA | View on Reddit | 118 comments
-
RAM prices are exploding; should I grab old stock now for RAG?
Posted by Working_Opposite4167@reddit | LocalLLaMA | View on Reddit | 11 comments
-
vLLM 0.11.1 Seems to Be Bringing Massive Speedup on Turing GPUs
Posted by lly0571@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Running the latest LLMs like Granite-4.0 and Qwen3 fully on ANE (Apple NPU)
Posted by Different-Effect-724@reddit | LocalLLaMA | View on Reddit | 10 comments
-
[Advice needed] Foreign language extraction using Qwen
Posted by Ok_Television_9000@reddit | LocalLLaMA | View on Reddit | 7 comments
-
Transitioning My Entire AI/LLM Workflow to 100% Solar Power
Posted by vesudeva@reddit | LocalLLaMA | View on Reddit | 40 comments
-
Most hackable coding agent
Posted by mnze_brngo_7325@reddit | LocalLLaMA | View on Reddit | 9 comments
-
Sanity check for a Threadripper + Dual RTX 6000 Ada node (Weather Forecasting / Deep Learning)
Posted by Icy_Gas8807@reddit | LocalLLaMA | View on Reddit | 7 comments
-
Baguettotron, a 321-million-parameter generalist Small Reasoning Model (80 layers deep)
Posted by Balance-@reddit | LocalLLaMA | View on Reddit | 26 comments
-
Issues with michaelf34/infinity:latest-cpu + Qwen3-Embedding-8B
Posted by Patentsmatter@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Is 2x NVLinked P6000 24GB (48GB total) a bad choice for LLMs?
Posted by Infamous_Charge2666@reddit | LocalLLaMA | View on Reddit | 11 comments
-
RTX 3080 20GB - A comprehensive review of the Chinese card
Posted by No-Refrigerator-1672@reddit | LocalLLaMA | View on Reddit | 16 comments
-
Building real-time speech translation (VAD→ASR→MT→TTS) - struggling with latency
Posted by Big_Fix_7606@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Offline Epstein File Ranker Using GPT-OSS-120B (Built on tensonaut’s dataset)
Posted by onil_gova@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Model quota limit exceeded with 1 prompt in Google Antigravity
Posted by ComposerGen@reddit | LocalLLaMA | View on Reddit | 16 comments
-
Got free passes for a big Virtual GenAI summit (OpenAI, Google, Microsoft, LangChain etc.)
Posted by alimhabidi@reddit | LocalLLaMA | View on Reddit | 3 comments