Deepseek-r1 thinks for 30 minutes?
Posted by XEUIPR@reddit | LocalLLaMA | View on Reddit | 8 comments
I was trying to ask a question about coding using DeepSeek-R1-0528-Qwen3-8B-Q4_K_M, and the thinking took 30 minutes???

I had to manually stop it because it just kept going.
Is there any way to limit this to only ~2 minutes of thinking?
I'm using LM Studio.
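One practical lever here (a sketch, not an official recipe): LM Studio exposes an OpenAI-compatible local server, and the `max_tokens` field hard-caps how much the model can generate, thinking included. The port (1234 is LM Studio's default), the model id string, and the helper names below are assumptions for illustration.

```python
import json
import urllib.request

def build_request(prompt: str, max_tokens: int = 2048) -> dict:
    # Build an OpenAI-style chat completion payload; max_tokens caps the
    # total generated tokens, so the model can't think forever.
    return {
        "model": "deepseek-r1-0528-qwen3-8b",  # assumed model id in LM Studio
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # generation stops here, even mid-thought
        "temperature": 0.6,
    }

def ask(prompt: str, max_tokens: int = 2048) -> str:
    # LM Studio's local server listens on port 1234 by default.
    req = urllib.request.Request(
        "http://localhost:1234/v1/chat/completions",
        data=json.dumps(build_request(prompt, max_tokens)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

At the ~15 t/s reported elsewhere in this thread, two minutes is roughly 15 × 120 ≈ 1800 tokens, so `max_tokens=2048` approximates a two-minute budget. Note the answer may get cut off mid-thought rather than gracefully wrapped up.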
ZenK_J@reddit
This model is not DeepSeek R1, and it performs worse than the original Qwen3-8B in most real-world workloads. Just stop using it and switch to Qwen3.5-9B.
XEUIPR@reddit (OP)
quick question, i'm kind of low on VRAM (laptop) and i've run a few tests on some different models
the best few (qwen) models are the following:
qwen3.5 9b - ~15.33 t/s
qwen3 8b - ~16.33 t/s
qwen2.5.1 coder 7b instruct - ~31.66 t/s
i've tried some other models (not qwen), but they're either too large, too slow, or can't handle the logic
i'm curious what would be best for "complex" logic in java
i'm not doing any large projects by any means, maybe just a few hundred lines, though the logic may be complex
is the extra wait for the larger models worth it if they have better reasoning and logic?
ZenK_J@reddit
I would prefer Qwen3.5-9B. But honestly, just try them out yourself to find the sweet spot for your setup.
Several-Tax31@reddit
On local models, thinking time depends on the hardware. My local models can think for an hour on hard math questions; nothing wrong with that.
This model is pretty old, though.
PiaRedDragon@reddit
Basically the quant is broken and it is looping.
Quantization is not easy to get right, so I don't use any unless they've proven to be decent. In fact, based on my own testing, I now only use the original lab version or the one quant team I've had great results with. I've even stopped using unsloth; their quants just don't deliver on intelligence for me.
My recommendation is to test a few different ones, run some quick benchmarks, and stick with the one that nails it for you.
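A minimal sketch of that kind of quick benchmark: time a generation call and report tokens per second. `generate` and `count_tokens` are hypothetical stand-ins for whatever your runtime exposes (e.g. a call into LM Studio's local API); you'd still want to eyeball output quality per quant, not just speed.

```python
import time

def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    # Throughput in tokens/sec; guard against a zero elapsed time.
    return n_tokens / elapsed_s if elapsed_s > 0 else 0.0

def bench(generate, prompt: str, count_tokens) -> float:
    # Time one generation call and compute its throughput.
    start = time.perf_counter()
    text = generate(prompt)
    elapsed = time.perf_counter() - start
    return tokens_per_second(count_tokens(text), elapsed)
```

Run the same prompt through each quant and compare the numbers alongside the actual outputs; a quant that loops (like the one described above) will also show up as a run that never terminates.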
clockr@reddit
what quant team do you use?
PiaRedDragon@reddit
You've got to do some testing, to be honest.
I hate giving the name of the one I use because I get accused of advertising them; this sub can be a bit toxic when you mention any good ones.
ForsookComparison@reddit
The 8B distill was more of a "here's what happens when we try it on Llama 3 8B" experiment than anything meant for real use.