TheaterFire

Is it just me or is deepseek r1 overthinking

Posted by The_GSingh@reddit | LocalLLaMA | 14 comments

It thought for 115 seconds on a single math problem


14 Comments

Fresh-Feedback1091@reddit

Can you share the base prompt or template for allowing thinking?

human_advancement@reddit

Let it think 🤔

_SourTable@reddit

i mean, it's better to overthink than underthink, lmao.

Senior-Raspberry-929@reddit

it thought for 2 minutes straight

The_GSingh@reddit (OP)

Yea but it costs money per output token. If it thinks pages and pages of info it’s gonna cost u more through the api

_SourTable@reddit

that is valid, but shouldn't be that big of an issue, it's already cheaper than competition. maybe this can be bypassed by clever prompting?

The_GSingh@reddit (OP)

Yea it’s cheaper than the competition but Gemini is literally free rn through the api too. Also it makes you use a competitor like Claude. If Claude can one shot it as opposed to this model thinking a lot, it’s definitely the way to go.

_SourTable@reddit

> Gemini is literally free rn through the api too

for personal use that would be fine, but not professionally

> If Claude can one shot it as opposed to this model thinking a lot

keyword "if". can it, if the training data doesn't have solution/similar problems..?

The_GSingh@reddit (OP)

All valid points. The most annoying part for an average user is probably waiting that long, but that should be easy to overlook given the performance

Ayman__donia@reddit

I think it's a good thing. Gemini thinks for just 8 seconds

_SourTable@reddit

the more they "think", the better the answer will be, so it should be a "good thing".

The_GSingh@reddit (OP)

It costs money to use it through the api. Thinking pages and pages of output tokens isn’t a good idea on ur wallet

_SourTable@reddit

it doesn't on chat.deepseek.com, which, let's be real, is where most users will be.

DarKresnik@reddit

Smarter than me😅