TheaterFire

Currently browsing tags: LocalLLaMA

  • Heretic: Fully automatic censorship removal for language models

    Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 271 comments

  • Gemini 3 is launched

    Posted by Several-Republic-609@reddit | LocalLLaMA | View on Reddit | 220 comments

  • 🚀 NVIDIA DGX Spark vs. Alternatives: Escaping the RTX 3060 (6GB) for Medical LLM Research

    Posted by Muted-Examination278@reddit | LocalLLaMA | View on Reddit | 8 comments

  • Our AI assistant keeps getting jailbroken and it’s becoming a security nightmare

    Posted by Comfortable_Clue5430@reddit | LocalLLaMA | View on Reddit | 7 comments

  • Mistral removing ton of old models from API (preparing for a new launch?)

    Posted by mpasila@reddit | LocalLLaMA | View on Reddit | 20 comments

  • Is qwen3vl 235B is supposed to be this slow?

    Posted by shapic@reddit | LocalLLaMA | View on Reddit | 11 comments

  • are you seeking final end of life

    Posted by pani343@reddit | LocalLLaMA | View on Reddit | 1 comment

  • ollama's enshitification has begun! open-source is not their priority anymore, because they're YC-backed and must become profitable for VCs... Meanwhile llama.cpp remains free, open-source, and easier-than-ever to run! No more ollama

    Posted by nderstand2grow@reddit | LocalLLaMA | View on Reddit | 183 comments

  • Large language models show signs of introspection

    Posted by bigzyg33k@reddit | LocalLLaMA | View on Reddit | 10 comments

  • Gemini 3 Pro vs Kimi K2 Thinking

    Posted by SlowFail2433@reddit | LocalLLaMA | View on Reddit | 58 comments

  • MMaDA-Parallel: Parallel Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

    Posted by nnxnnx@reddit | LocalLLaMA | View on Reddit | 1 comment

  • Make your AI talk like a caveman and decrease token usage

    Posted by RegionCareful7282@reddit | LocalLLaMA | View on Reddit | 117 comments

  • How to transcribe a video and summarize it?

    Posted by bullerwins@reddit | LocalLLaMA | View on Reddit | 29 comments

  • How do I actually use the Ryzen AI chip… for anything?

    Posted by Muted_Head_1636@reddit | LocalLLaMA | View on Reddit | 1 comment

  • GLM 4.6 on 128 GB RAM with llama.cpp

    Posted by ilintar@reddit | LocalLLaMA | View on Reddit | 18 comments

  • Epstein emails graph relationship extraction and visualizer

    Posted by madmax_br5@reddit | LocalLLaMA | View on Reddit | 5 comments

  • I built an "Antigravity Workspace" for Gemini 3. It forces the AI to follow "Artifact-First" protocols automatically. (Open Source Template)

    Posted by Direct-Employ-3290@reddit | LocalLLaMA | View on Reddit | 0 comments

  • Nvidia Parakeet-Realtime-EOU-120m-v1

    Posted by nuclearbananana@reddit | LocalLLaMA | View on Reddit | 10 comments

  • Anyone on arm?

    Posted by No_Afternoon_4260@reddit | LocalLLaMA | View on Reddit | 0 comments

  • I replicated Anthropic’s "Introspection" paper on DeepSeek-7B. It works.

    Posted by Specialist_Bad_4465@reddit | LocalLLaMA | View on Reddit | 22 comments

  • Cerebras REAPs: MiniMax-M2 (25, 30, 40%), Kimi-Linear 30%, more on the way!

    Posted by ilzrvch@reddit | LocalLLaMA | View on Reddit | 22 comments

  • If the bubble bursts, what's gonna happen to all those chips?

    Posted by freecodeio@reddit | LocalLLaMA | View on Reddit | 156 comments

  • [D] What's the one thing you wish you'd known before putting an LLM app in production?

    Posted by Bbamf10@reddit | LocalLLaMA | View on Reddit | 4 comments

  • The world’s fastest open-source TTS: Supertonic

    Posted by ANLGBOY@reddit | LocalLLaMA | View on Reddit | 31 comments

  • What Size of LLM Can 4x RTX 5090 Handle? (96GB VRAM)

    Posted by Affectionate_Arm725@reddit | LocalLLaMA | View on Reddit | 14 comments

  • Any H200 owners out there?

    Posted by itsthewolfe@reddit | LocalLLaMA | View on Reddit | 5 comments

  • 🦙💥 Building llama.cpp with Vulkan backend on Android (Termux ARM64)

    Posted by Brahmadeo@reddit | LocalLLaMA | View on Reddit | 20 comments

  • Kimi K2 is the best clock AI

    Posted by InternationalAsk1490@reddit | LocalLLaMA | View on Reddit | 82 comments

  • Blogs to Follow

    Posted by TopNo6605@reddit | LocalLLaMA | View on Reddit | 1 comment

  • Do you think Gemini 3 uses MoR or Titans?

    Posted by SrijSriv211@reddit | LocalLLaMA | View on Reddit | 4 comments

  • What are the most unique models that are under 15b you encountered

    Posted by ResponsibleTruck4717@reddit | LocalLLaMA | View on Reddit | 7 comments

  • Most people in this LocalLLaMA are hypocritical.

    Posted by Ok_houlin@reddit | LocalLLaMA | View on Reddit | 31 comments

  • We just Fine-Tuned a Japanese Manga OCR Model with PaddleOCR-VL!

    Posted by erinr1122@reddit | LocalLLaMA | View on Reddit | 22 comments

  • How likely do you think a Ashley-Madison style widespread breach exposing users and conversations is in the next few years?

    Posted by Antique-Account-2359@reddit | LocalLLaMA | View on Reddit | 38 comments

  • Google Antigravity is a cursor clone

    Posted by Terminator857@reddit | LocalLLaMA | View on Reddit | 118 comments

  • RAM prices exploding should I grab old stock now for rag?

    Posted by Working_Opposite4167@reddit | LocalLLaMA | View on Reddit | 11 comments

  • vLLM 0.11.1 Seems to Be Bringing Massive Speedup on Turing GPUs

    Posted by lly0571@reddit | LocalLLaMA | View on Reddit | 1 comment

  • Running the latest LLMs like Granite-4.0 and Qwen3 fully on ANE (Apple NPU)

    Posted by Different-Effect-724@reddit | LocalLLaMA | View on Reddit | 10 comments

  • [Advice needed] Foreign language extraction using Qwen

    Posted by Ok_Television_9000@reddit | LocalLLaMA | View on Reddit | 7 comments

  • Transitioning My Entire AI/LLM Workflow to 100% Solar Power

    Posted by vesudeva@reddit | LocalLLaMA | View on Reddit | 40 comments

  • most hackable coding agent

    Posted by mnze_brngo_7325@reddit | LocalLLaMA | View on Reddit | 9 comments

  • Sanity check for a Threadripper + Dual RTX 6000 Ada node (Weather Forecasting / Deep Learning)

    Posted by Icy_Gas8807@reddit | LocalLLaMA | View on Reddit | 7 comments

  • Baguettotron, a 321 million parameters generalist Small Reasoning Model (80-layers deep)

    Posted by Balance-@reddit | LocalLLaMA | View on Reddit | 26 comments

  • Issues with michaelf34/infinity:latest-cpu + Qwen3-Embedding-8B

    Posted by Patentsmatter@reddit | LocalLLaMA | View on Reddit | 2 comments

  • is 2 nvlink x p6000 24gb (48 gb total) a bad choice for LLM's?

    Posted by Infamous_Charge2666@reddit | LocalLLaMA | View on Reddit | 11 comments

  • RTX 3080 20GB - A comprehensive review of Chinese card

    Posted by No-Refrigerator-1672@reddit | LocalLLaMA | View on Reddit | 16 comments

  • Building real-time speech translation (VAD→ASR→MT→TTS) - struggling with latency

    Posted by Big_Fix_7606@reddit | LocalLLaMA | View on Reddit | 0 comments

  • Offline Epstein File Ranker Using GPT-OSS-120B (Built on tensonaut’s dataset)

    Posted by onil_gova@reddit | LocalLLaMA | View on Reddit | 13 comments

  • Model quota limit exceeded with 1 prompt Google Antigravity

    Posted by ComposerGen@reddit | LocalLLaMA | View on Reddit | 16 comments

  • Got free passes for a big Virtual GenAI summit (OpenAI, Google, Microsoft, LangChain etc.)

    Posted by alimhabidi@reddit | LocalLLaMA | View on Reddit | 3 comments
