Fast_Thing_7949
-
Slower Means Faster: Why I Switched from Qwen3 Coder Next to Qwen3.5 122B
Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 84 comments
-
Benchmark: ik_llama.cpp vs llama.cpp on Qwen3/3.5 MoE Models
Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 20 comments
-
ik_llama.cpp vs llama.cpp performance flip-flop on Qwen3.5 MoE models - same params, different winners!
Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Is Dual GPU for large context and GGUF models a good idea?
Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 11 comments
-
I'm tired
Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Qwen Code looping with Qwen3-Coder-Next / Qwen3.5-35B-A3B
Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 7 comments
-
Talk me out of buying an RTX 3090 “just for local AI” (before I do something financially irresponsible)
Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 53 comments
-
Dual RX 9070 for LLMs?
Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 11 comments
-
What's the point of potato-tier LLMs?
Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 239 comments