Borkato

Genuinely what do we do about the bot comments in this sub

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 102 comments
I know this isn’t technically an LLM but OmniVoice is FUCKING AMAZING.

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 96 comments
What frontend do you guys use?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 92 comments
Can someone help me understand MCP?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 44 comments
Let’s talk quants of Gemma and Qwen - 16 vs Q8 vs Q4 - any experiences?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 93 comments
48GB VRAM users, what are your daily drivers? Do you wish you had more VRAM? What would you run if you did?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 245 comments
I feel like if they made a local model focused specifically on RP it would be god tier even if tiny

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 27 comments
PSA: If you haven’t updated llama.cpp for a couple of days and find MTP isn’t performing well, update llama.cpp.

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 34 comments
Hopes and dreams for Google IO tomorrow? 👀

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 36 comments
Help me upgrade for 3k

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 30 comments
Do we rely too much on huggingface? Do you think they’ll eventually regulate open source models? Is there any way to distribute them elsewhere?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 97 comments
I can’t believe I can say “ugh I don’t feel like fixing this function, it’s too complex” and I can literally just tell my computer to fix it for me. I didn’t understand what they meant by “people will start paying for intelligence” but now I do.

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 140 comments
RPers: how do the new Gemma and Qwen compare to the old 70B models?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 52 comments
Best coder harness that sees your dirs, edits code, etc from the terminal that works with local?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 23 comments
Is Gemma 4 26B-A4B worse than Qwen 3.5 35B-A3B with tool calls, even after all the fixes?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 35 comments
Is anyone else just blown away that local LLMs are even possible?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 140 comments
Any advice for testing similar versions of the same model?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 5 comments
Qwen3.5: 122B-A10B at IQ1 or 27B at Q4?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 32 comments
Those of you running MoE coding models on 24-30GB, how long do you wait for a reply?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 35 comments
What will I gain going from 30GB VRAM to 48?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 9 comments
Is there a way to speed up prompt processing with some layers on CPU with qwen-3-coder-next or similar MoEs?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 46 comments
How is it pronounced?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 43 comments
Smartest model for 24-28GB vram?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 76 comments
What are the best places to get good prompts?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 37 comments
Is there a consensus as to which types of prompts work best for jailbreaking?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 18 comments
Is there any way to estimate tokens per second given VRAM and such? The calculators don’t have every model.

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 7 comments
If you had to pick just one model family’s finetunes for RP under 30B, which would you pick?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 10 comments
Can GLM-4.5-Air run on a single 3090 (24GB VRAM) with 48GB RAM at above 10 t/s?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 40 comments
How do you test new models?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 27 comments
Does anyone have a description of the general model families and their strengths and weaknesses?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 20 comments
Are 24-50Bs finally caught up to 70Bs now?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 162 comments
What’s the smartest NON thinking model under 40B or so?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 14 comments
Is there a way to FINETUNE a TTS model LOCALLY to learn sound effects?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 4 comments
Is anyone talking verbally to their models and having them talk back through TTS?

Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 17 comments