Borkato
-
Genuinely what do we do about the bot comments in this sub
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 102 comments
-
I know this isn’t technically an LLM but OmniVoice is FUCKING AMAZING.
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 96 comments
-
What frontend do you guys use?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 92 comments
-
Can someone help me understand MCP?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 44 comments
-
Let’s talk quants of Gemma and Qwen - 16 vs Q8 vs Q4 - any experiences?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 93 comments
-
48GB VRAM users, what are your daily drivers? Do you wish you had more VRAM? What would you run if you did?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 245 comments
-
I feel like if they made a local model focused specifically on RP it would be god tier even if tiny
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 27 comments
-
PSA: If you haven’t updated llama.cpp for a couple of days and find MTP isn’t performing well, update llama.cpp.
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 34 comments
-
Hopes and dreams for Google IO tomorrow? 👀
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 36 comments
-
Help me upgrade for 3k
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 30 comments
-
Do we rely too much on huggingface? Do you think they’ll eventually regulate open source models? Is there any way to distribute them elsewhere?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 97 comments
-
I can’t believe I can say “ugh I don’t feel like fixing this function, it’s too complex” and I can literally just tell my computer to fix it for me. I didn’t understand what they meant by “people will start paying for intelligence” but now I do.
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 140 comments
-
RPers: how do the new Gemma and Qwen compare to the old 70B models?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 52 comments
-
Best coder harness that sees your dirs, edits code, etc from the terminal that works with local?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 23 comments
-
Is Gemma 4 26B-A4B worse than Qwen 3.5 35B-A3B with tool calls, even after all the fixes?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 35 comments
-
Is anyone else just blown away that these local LLMs are even possible?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 140 comments
-
Any advice for testing similar versions of the same model?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 5 comments
-
Qwen3.5: 122B-A10B at IQ1 or 27B at Q4?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 32 comments
-
Those of you running MoE coding models on 24-30GB, how long do you wait for a reply?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 35 comments
-
What will I gain going from 30GB VRAM to 48?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 9 comments
-
Is there a way to speed up prompt processing with some layers on CPU with qwen-3-coder-next or similar MoEs?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 46 comments
-
How is it pronounced?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 43 comments
-
Smartest model for 24-28GB VRAM?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 76 comments
-
What are the best places to get good prompts?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 37 comments
-
Is there a consensus as to which types of prompts work best for jailbreaking?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 18 comments
-
Is there any way to estimate tokens per second given VRAM and such? The calculators don’t have every model.
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 7 comments
-
If you had to pick just one model family’s finetunes for RP under 30B, which would you pick?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 10 comments
-
Can GLM-4.5-Air run on a single 3090 (24GB VRAM) with 48GB RAM at above 10t/s?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 40 comments
-
How do you test new models?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 27 comments
-
Does anyone have a description of the general model families and their strengths and weaknesses?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 20 comments
-
Have 24-50Bs finally caught up to 70Bs now?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 162 comments
-
What’s the smartest NON thinking model under 40B or so?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 14 comments
-
Is there a way to FINETUNE a TTS model LOCALLY to learn sound effects?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 4 comments
-
Is anyone talking verbally to their models and having them talk back through TTS?
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 17 comments