QwQ-32B seems to get the same quality final answer as R1 while reasoning much more concisely and efficiently

Posted by pigeon57434@reddit | LocalLLaMA

I think I will now switch over to using QwQ as my primary reasoning model instead of R1. In all my testing, its answers are as good as or better than R1's, while its chain of thought is much more efficient, concise, and confident. In contrast, R1 feels like a bumbling idiot who happens to be really smart only because he tries every possible solution. Adam was right when he said not all CoTs are equal, and in this case, I think Qwen trained their model to be more efficient without degrading quality at all. But I'm curious to hear what everyone here thinks, because I'm sure others are more experienced than I am.