YearZero

Non-technical guide to run Qwen3 without reasoning using Llama.cpp server (without needing /no_think)

Posted by YearZero@reddit | LocalLLaMA | View on Reddit | 0 comments
How to rent a virtual Windows machine with RTX 4090?

Posted by YearZero@reddit | LocalLLaMA | View on Reddit | 3 comments
All Model Leaderboards (that I know)

Posted by YearZero@reddit | LocalLLaMA | View on Reddit | 1 comment
Help using Qwen2.5 0.5b-sized draft models with QwQ in Koboldcpp. Vocab size mismatch!

Posted by YearZero@reddit | LocalLLaMA | View on Reddit | 5 comments
Qwen 2.5 Coder 14b is worse than 7b on several benchmarks in the technical report - weird!

Posted by YearZero@reddit | LocalLLaMA | View on Reddit | 23 comments