TheaterFire

Currently browsing tags:

  • LocalLLaMA
  • Better quantization: Yet Another Quantization Algorithm

    Posted by tsengalb99@reddit | LocalLLaMA | View on Reddit | 34 comments

  • I built an app that turns your photos into smart packing lists — all on your iPhone, 100% private, no APIs, no data collection!

    Posted by w-zhong@reddit | LocalLLaMA | View on Reddit | 48 comments

  • What's the case against flash attention?

    Posted by Responsible-Crew1801@reddit | LocalLLaMA | View on Reddit | 23 comments

  • Is this the largest "No synthetic data" open weight LLM? (142B)

    Posted by AaronFeng47@reddit | LocalLLaMA | View on Reddit | 28 comments

  • Hot Take: Gemini 2.5 Pro Makes Too Many Assumptions About Your Code

    Posted by HideLord@reddit | LocalLLaMA | View on Reddit | 119 comments

  • Guys, real question: where are Llama 4 Behemoth and the thinking models??

    Posted by Independent-Wind4462@reddit | LocalLLaMA | View on Reddit | 18 comments

  • Sparse Transformers: Run LLMs 2x faster with 30% less memory

    Posted by Economy-Mud-6626@reddit | LocalLLaMA | View on Reddit | 66 comments

  • Is there appetite for hosting 3b/8b size models at an affordable rate?

    Posted by No-Fig-8614@reddit | LocalLLaMA | View on Reddit | 19 comments

  • Do LLMs have opinions?

    Posted by WeAllFuckingFucked@reddit | LocalLLaMA | View on Reddit | 31 comments

  • Git for Idiots (Broken down to Four Commands)

    Posted by Consistent-Disk-7282@reddit | LocalLLaMA | View on Reddit | 11 comments

  • I built a platform that generates overviews of codebases and creates a map of the codebase dependencies

    Posted by ComfortableArm121@reddit | LocalLLaMA | View on Reddit | 0 comments

  • Stop over-engineering AI apps: just use Postgres

    Posted by Worldly_Expression43@reddit | LocalLLaMA | View on Reddit | 65 comments

  • AI server help: dual K80s, LocalAGI

    Posted by JcorpTech@reddit | LocalLLaMA | View on Reddit | 7 comments

  • Pocketflow is now a workflow generator called Osly!! All you need to do is describe your idea

    Posted by Weak_Birthday2735@reddit | LocalLLaMA | View on Reddit | 0 comments

  • Hugging Face Just Dropped Its MCP Server

    Posted by eternviking@reddit | LocalLLaMA | View on Reddit | 9 comments

  • llama-server is cooking! gemma3 27b, 100K context, vision on one 24GB GPU.

    Posted by No-Statement-0001@reddit | LocalLLaMA | View on Reddit | 54 comments

  • Now I need to explain this to her...

    Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 515 comments

  • China is leading open source

    Posted by TheLogiqueViper@reddit | LocalLLaMA | View on Reddit | 292 comments

  • What are the top creative writing models?

    Posted by TheArchivist314@reddit | LocalLLaMA | View on Reddit | 18 comments

  • Smallest LLM that can help in text rearrangement

    Posted by Away_Expression_3713@reddit | LocalLLaMA | View on Reddit | 4 comments

  • After court order, OpenAI is now preserving all ChatGPT and API logs

    Posted by iGermanProd@reddit | LocalLLaMA | View on Reddit | 279 comments

  • MiniCPM4: 7x the decoding speed of Qwen3-8B

    Posted by Lynncc6@reddit | LocalLLaMA | View on Reddit | 24 comments

  • Current best model for technical documentation text generation for RAG / fine tuning?

    Posted by OkAstronaut4911@reddit | LocalLLaMA | View on Reddit | 1 comment

  • MiniCPM4: Ultra-Efficient LLMs on End Devices

    Posted by adefa@reddit | LocalLLaMA | View on Reddit | 7 comments

  • Even DeepSeek switched from OpenAI to Google

    Posted by Utoko@reddit | LocalLLaMA | View on Reddit | 174 comments

  • So cool! Imagine if it was local. Any similar localLLM projects out there?

    Posted by Own-Potential-2308@reddit | LocalLLaMA | View on Reddit | 1 comment

  • New embedding model "Qwen3-Embedding-0.6B-GGUF" just dropped.

    Posted by Proto_Particle@reddit | LocalLLaMA | View on Reddit | 97 comments

  • MSI PC with NVIDIA GB10 Superchip - 6144 CUDA Cores and 128GB LPDDR5X Confirmed

    Posted by shakhizat@reddit | LocalLLaMA | View on Reddit | 62 comments

  • Help with Proxmox + Debian + Docker /w Nvidia 5060TI

    Posted by EarEquivalent3929@reddit | LocalLLaMA | View on Reddit | 12 comments

  • Is there a video, article, or book where a lot of real-world datasets are used to train an industry-level LLM, with all the code?

    Posted by Happysedits@reddit | LocalLLaMA | View on Reddit | 14 comments

  • Build LLM from Scratch | Mega Playlist of 43 videos

    Posted by OtherRaisin3426@reddit | LocalLLaMA | View on Reddit | 10 comments

  • new Bielik models have been released

    Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 19 comments

  • AMA – I’ve built 7 commercial RAG projects. Got tired of copy-pasting boilerplate, so we open-sourced our internal stack.

    Posted by Loud_Picture_1877@reddit | LocalLLaMA | View on Reddit | 98 comments

  • How does gemma3:4b-it-qat fare against OpenAI models on MMLU-Pro benchmark? Try for yourself in Excel

    Posted by Kapperfar@reddit | LocalLLaMA | View on Reddit | 21 comments

  • Is it possible to run non-reasoning deepseek-r1-0528?

    Posted by relmny@reddit | LocalLLaMA | View on Reddit | 21 comments

  • Initial thoughts on Google Jules

    Posted by maaakks@reddit | LocalLLaMA | View on Reddit | 59 comments

  • China's Xiaohongshu(Rednote) released its dots.llm open source AI model

    Posted by Fun-Doctor6855@reddit | LocalLLaMA | View on Reddit | 124 comments

  • Real-time conversation with a character on your local machine

    Posted by ResolveAmbitious9572@reddit | LocalLLaMA | View on Reddit | 33 comments

  • Terrible Hindi translation, missing text, paused timeline with Whisper?

    Posted by jadhavsaurabh@reddit | LocalLLaMA | View on Reddit | 1 comment

  • I created a totally free and local subtitle generator and renderer that works in browser!

    Posted by Qunit-Essential@reddit | LocalLLaMA | View on Reddit | 53 comments

  • What is the best value card I could buy for decent performance?

    Posted by equinoxel@reddit | LocalLLaMA | View on Reddit | 6 comments

  • The new king? M3 Ultra, 80 Core GPU, 512GB Memory

    Posted by Hanthunius@reddit | LocalLLaMA | View on Reddit | 294 comments

  • Can a model be so radically altered that its origin can no longer be recognized? YES!

    Posted by Sicarius_The_First@reddit | LocalLLaMA | View on Reddit | 29 comments

  • Multi modality is currently terrible in open source

    Posted by Unusual_Guidance2095@reddit | LocalLLaMA | View on Reddit | 28 comments

  • Real-time conversational AI running 100% locally in-browser on WebGPU

    Posted by xenovatech@reddit | LocalLLaMA | View on Reddit | 121 comments

  • How fast can I run models?

    Posted by feelin-lonely-1254@reddit | LocalLLaMA | View on Reddit | 3 comments

  • 3b and 7b Serving with new Hardware

    Posted by No-Fig-8614@reddit | LocalLLaMA | View on Reddit | 4 comments

  • New model - Qwen3 Embedding + Reranker

    Posted by koc_Z3@reddit | LocalLLaMA | View on Reddit | 1 comment

  • Cannot even run the smallest model on system RAM?

    Posted by FloJak2004@reddit | LocalLLaMA | View on Reddit | 21 comments

  • Semantic routing and caching doesn't work - task specific LLMs (TLMs) ftw!

    Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 9 comments
