lordekeen
How much VRAM needed for Qwen 3.6 27B Q8 with 262K context?
Posted by My_Unbiased_Opinion@reddit | LocalLLaMA | View on Reddit | 79 comments
Ignoring benchmarks, how do the newest local models (gemma 4 31B, 26BA4B, Qwen 3.6) “feel” to you? What do you think they compare to?
Posted by opoot_@reddit | LocalLLaMA | View on Reddit | 42 comments
lordekeen@reddit
What's this sub's general opinion on quantizing the KV cache?
Posted by misanthrophiccunt@reddit | LocalLLaMA | View on Reddit | 91 comments
lordekeen@reddit
Step 3.7 Flash passes the car wash test
Posted by tarruda@reddit | LocalLLaMA | View on Reddit | 45 comments
lordekeen@reddit
KV cache quant benchmarks: q5 & q6 are underrated, q8/q4 is bad, TCQ has a niche
Posted by Anbeeld@reddit | LocalLLaMA | View on Reddit | 71 comments
lordekeen@reddit
Q4_K_M is fine for chat and a trap for agents. Here is math mathing.
Posted by Napster3301@reddit | LocalLLaMA | View on Reddit | 55 comments
lordekeen@reddit
What would 2x RTX 3060 12GB get me?
Posted by ObjectiveActuator8@reddit | LocalLLaMA | View on Reddit | 64 comments
lordekeen@reddit
RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help
Posted by gaztrab@reddit | LocalLLaMA | View on Reddit | 92 comments
lordekeen@reddit
Dual GPU llama.cpp speedup
Posted by Legitimate-Dog5690@reddit | LocalLLaMA | View on Reddit | 51 comments
lordekeen@reddit
MTP support merged into llama.cpp
Posted by tacticaltweaker@reddit | LocalLLaMA | View on Reddit | 108 comments
lordekeen@reddit
New Linux user, need help compiling llamacpp
Posted by Spiderboyz1@reddit | LocalLLaMA | View on Reddit | 32 comments
lordekeen@reddit
Is using vLLM actually worth it if you aren't serving the model to other people?
Posted by ayylmaonade@reddit | LocalLLaMA | View on Reddit | 98 comments
lordekeen@reddit
The Qwen 3.6 35B A3B hype is real!!!
Posted by The_Paradoxy@reddit | LocalLLaMA | View on Reddit | 149 comments
lordekeen@reddit
"Hardware is the only moat" - Should we buy new hardware now or wait?
Posted by Alan_Silva_TI@reddit | LocalLLaMA | View on Reddit | 177 comments
lordekeen@reddit
What's that one god damn app you need but won't work on Linux for no reason
Posted by Rough-Pen8792@reddit | linux | View on Reddit | 502 comments
lordekeen@reddit
Which distro are you using?
Posted by ukm_array@reddit | linux | View on Reddit | 778 comments