I tested Qwen 3 235b against Deepseek r1: Qwen did better on simple tasks, but r1 wins on nuance

Posted by SunilKumarDash@reddit | LocalLLaMA

I have been using Deepseek r1 for a while, mainly for writing, and I have tried Qwq 32b, which was plenty impressive. But the new Qwen 3 models are a huge upgrade, though I have yet to try the 30b model. The 235b model is really impressive for its cost and size, and definitely much better than the Llama 4 models. So I compared the top two open-source models on coding, reasoning, math, and writing tasks. Here's what I found.

**1. Coding**

For a lot of coding tasks, you wouldn't notice much difference. Both models perform on par, with Qwen sometimes taking the lead.

**2. Reasoning and Math**

Deepseek leads here, with more nuance in its thought process. Qwen is not bad at all and gets most of the work done, but it takes longer to finish tasks. At times it gives off the vibe of being overfit.

**3. Writing**

For creative writing, Deepseek r1 is still in the top league, right up there with closed models. For summarising and technical descriptions, Qwen offers similar performance.

For a full comparison, check out this blog post: [Qwen 3 vs. Deepseek r1](https://composio.dev/blog/qwen-3-vs-deepseek-r1-complete-comparison/).

It has been a great year so far for open-weight AI models, especially from Chinese labs. It will be interesting to see what comes next from Deepseek. I hope Llama Behemoth turns out to be a better model.

I would love to hear about your experience with the new Qwens, and which Qwen works well for local use cases. I have been using Gemma 3.