Fast_Thing_7949

Slower Means Faster: Why I Switched from Qwen3 Coder Next to Qwen3.5 122B

Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 84 comments
Benchmark: ik_llama.cpp vs llama.cpp on Qwen3/3.5 MoE Models

Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 20 comments
ik_llama.cpp vs llama.cpp performance flip-flop on Qwen3.5 MoE models - same params, different winners!

Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 1 comment
Is Dual GPU for large context and GGUF models a good idea?

Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 11 comments
I'm tired

Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 13 comments
Qwen Code looping with Qwen3-Coder-Next / Qwen3.5-35B-A3B

Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 7 comments
Talk me out of buying an RTX 3090 “just for local AI” (before I do something financially irresponsible)

Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 53 comments
Dual RX 9070 for LLMs?

Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 11 comments
What's the point of potato-tier LLMs?

Posted by Fast_Thing_7949@reddit | LocalLLaMA | View on Reddit | 239 comments