Is DeepSeek V3 overhyped?
Posted by YourAverageDev0@reddit | LocalLLaMA | View on Reddit | 104 comments
Have been using DeepSeek V3 for a while now since it came out. Coding-wise (I work on web frontend, mostly React/Svelte), I do not find it nearly as impressive as 3.5 Sonnet. The benchmarks seem to match, but the feel is just different, though DeepSeek does sometimes give interesting stuff when asked. For me personally, it feels like a further-scaled base 405B: it has few scars of brutal human RLHF (unlike OpenAI, Llama, etc. models). It just doesn't have that taste of Claude 3.5 Sonnet.
Recoil42@reddit
The catch is cost. DeepSeek offers maybe 75% of Sonnet's performance at a very small fraction of the cost. It was trained at a very small fraction of the cost, and asks users for a small fraction of the cost. That's why it's in a league of its own. I used Cline last night, and maybe thirty minutes of casual coding with Sonnet clocked me $1.50. Two hours of DeepSeek usage clocked me maybe $0.15. It's not even close.
Sonnet is better. Definitely, concretely better. It solves problems for me that leave DeepSeek spinning in circles. But the cost-efficiency of DeepSeek is a crazy eyebrow-raiser — it is cheap enough to be effectively used unmetered for most people.
These days I default to DeepSeek and only tag Sonnet into the ring when a problem is particularly difficult to solve. For writing boilerplate, doing basic lookups, and writing simple functions, DeepSeek is unmatched.
Any_Pressure4251@reddit
Why would you use Cline with a paid API when you have Cursor or Windsurf?
Background-Finish-49@reddit
Cursor and windsurf are inferior to cline.
killver@reddit
yeah, good joke
Background-Finish-49@reddit
Not at all a joke. When you learn how to use it you'll understand what I mean.
killver@reddit
sorry, but it is a joke. Maybe if you compare it with basic Cursor, and even then a ton of manual setup is needed to make it comparable; you can do the same in Cursor, even better
Background-Finish-49@reddit
Cursor sucks ass
Bakedsoda@reddit
can you please describe which setup gets Cline to perform better?
cline + DeepSeek V3 API starts out good but falls into infinite loops, maybe due to the short context?
curious which LLM works best. Is it just the Claude API?
Background-Finish-49@reddit
Openrouter, cline rules and cline_docs. If you don't have a proper cline_docs workflow you're always going to run into loops even with sonnet.
I use the right LLM for each job: o1 or 4o in the web browser for planning and troubleshooting, along with 16x Prompt for referencing my database with o1, and Sonnet and DeepSeek for coding.
You're looping because of poor prompting and lack of preparation.
this-just_in@reddit
Desire to stay within the VSCode ecosystem
Any_Pressure4251@reddit
They are both VSCode-based. I actually run Cline and RooCline alongside Windsurf with the excellent Gemini models.
Gemini 1206 is better than Claude Sonnet at Flutter and Java Code...
Minimum-Ad-2683@reddit
Except some extensions don't work. I work a lot on web backends and Postman works on neither of them. Quite a bummer.
Recoil42@reddit
Neither Cursor nor Windsurf offer free unlimited requests.
Any_Pressure4251@reddit
Seems unlimited to me, they cost like $0.33 a day! Cline just likes to waste tokens.
Recoil42@reddit
Cursor only has 500 fast premium requests per month.
eloitay@reddit
Yeah, I was drawn in by the hype, and once I started using it I realized that it is not really that great. Simple stuff, sure, but with more complicated and cutting-edge stuff it either isn't aware of its existence, hallucinates very badly, or plainly makes mistakes all over the place.
playfuldreamz@reddit
"Simple stuff" and "cutting edge stuff" provide no context. Care to be clearer, with examples?
BoJackHorseMan53@reddit
75% performance at 2% of the cost.
jaMMint@reddit
Only if you don't value your time, though. So it really is about the cost of developer time here. You save 25% on developer time by spending $3/hour more in API usage.
BoJackHorseMan53@reddit
Cline can blow $20 in an hour using Claude. With Deepseek, it's 4¢
jaMMint@reddit
For me (aider and continue.dev) sonnet just works much better.
If Deepseek works for you, you should definitely use it for saved cost. Ultimately it depends on your usage, and switching for different tasks is always an option anyways.
RageshAntony@reddit
What are the uses of Cline ?
this-just_in@reddit
It’s an agentic LLM interface inside VSCode. It offers a chat experience with shortcuts to add snippets, files, and URLs to context. It can summarize things, create files, edit your files directly, and run commands. You bring your own AI provider (wide provider support plus local options). My understanding is that the underlying implementation no longer depends on native tool calling, but on a custom XML tag solution, meaning almost any remote or local OpenAI-compatible provider will work.
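The custom-tag idea described above can be sketched roughly like this. To be clear, the tag names and structure below are invented for illustration; this is not Cline's actual format, just the general pattern of parsing tool calls out of plain model text instead of relying on a provider's native tool-calling API:

```python
import re

# The model is prompted to emit tool invocations as XML-style tags in
# ordinary text output; the client parses them itself, so any provider
# that returns plain text can drive the agent. Hypothetical tag format.
response = """I'll create that file now.
<write_file>
<path>src/hello.ts</path>
<content>console.log("hello");</content>
</write_file>"""

match = re.search(
    r"<write_file>\s*<path>(.*?)</path>\s*<content>(.*?)</content>\s*</write_file>",
    response,
    re.DOTALL,
)
if match:
    path, content = match.group(1), match.group(2)
    print(f"tool call parsed: write {path!r}")
```

This is why "almost any OpenAI-compatible provider will work": the only requirement is that the model can follow the tag convention in plain text.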
RageshAntony@reddit
Wow. Seems great. In their repo they mentioned claude. So I thought it supports claude for agent operations.
If I use DeepSeek, do all features work as expected?
this-just_in@reddit
Claude Sonnet is the model they build Cline around, and will likely provide the best experience because the prompts were tuned for it, but it supports a wide range of providers. You can use DeepSeek with it successfully.
RageshAntony@reddit
Okay. What about the cost when using Claude? I read that it consumes a lot of tokens thereby increasing the bill.
Esmaro@reddit
What are the differences with aider? A more streamlined, "hands-off" experience?
this-just_in@reddit
They cover much the same ground. Cline is an in-VSCode experience; aider is a terminal experience. Aider has a lot more features and functionality. For the purposes of making code modifications, Cline and aider with default setups and a good model (Sonnet, 4o, DeepSeek) will be very similar. Aider, configured with advanced features, can probably do better.
MasterpieceKitchen72@reddit
Do you have an example of what made DeepSeek spin for ages? If I would like to test this, which problem should I ask it to solve?
Orolol@reddit
This is why Cursor is sick: only $20 a month, even if you use it A LOT like me.
megadonkeyx@reddit
Agree, this is how I've been working, but I can't see it lasting that long. Surely DeepSeek aren't even covering their electricity costs?
Terminator857@reddit
Any tips on workflow for using different models?
smosjos@reddit
I use aider's copy paste function. I have a Claude subscription. I use the copy context tool of aider, paste that in Claude, ask my question, get tips back from Claude. Paste that in aider and deepseek does the implementation. Using those 2 together keeps my costs down with great results. Yes it is a bit hacky, but better than using Cline as API costs for Claude sets you back very quickly.
Background-Finish-49@reddit
Small changes = deepseek Complicated changes = sonnet
Recoil42@reddit
I'm still developing a rhythm and a feel for it, so no specific advice. Basically though, when I know I need to do web scaffolding or a complicated refactor I'll switch to Claude. Then once Claude's generated the initial pass I'll do refinement, modifications, etc with DeepSeek.
frivolousfidget@reddit
I can't have models training on my input, so I can only compare Sonnet with DeepSeek on Fireworks. Sonnet ended up cheaper due to input caching.
nananashi3@reddit
Someone needs to throw them a big boy budget.
No-Fig-8614@reddit
This is solely from a self-hoster's perspective.
Kind of. Right now, if you self-host or go through a provider, it isn't that well optimized. SGLang is much better optimized than vLLM, but it's a big model requiring a lot of memory, so unless you use DeepSeek's own service, which they optimized the hell out of, it's not that great. Other OSS models are much further optimized for vLLM and SGLang...
On vLLM with 8x H200s it was getting around 50 tok/s, vs. 150 on SGLang, but still not what you'd expect from that level of hardware.
Even at its quant it's still slow.
West-Code4642@reddit
Agreed. The cost allows use cases other models do not
OracleGreyBeard@reddit
Great response, echoes my thoughts exactly
OracleGreyBeard@reddit
Sort of. People who say it’s as good as Sonnet are definitely sniffing whippets. It’s very clearly not as good. On the other hand, it’s nearly as good and vastly cheaper.
If you need the best answer to a small number of prompts, go with Sonnet. If you’re burning lots of tokens (as in Cline) go with DeepSeek.
Odd-Environment-7193@reddit
No. It's not overhyped. Let me tell you why. It's free and open source. You are comparing apples and oranges.
I can code all day in deepseek and I never reach some limit locking me out of the tool.
It's dirt cheap. To the point where the cost is negligible for personal use, even through an API or the free chat.
It doesn't lecture me or refuse almost anything I throw at it. Wanna ask questions about hacking or scraping? You won't get some moral lecture that wastes many precious tokens... Need to edit some spicy comment that contains foul language? No problem.
I for one am happy to have moved away from OpenAI and Claude models, with the different options available right now.
Claude is a SOTA model and it's considered the best coder by the majority of people who use it.
DeepSeek is the best open-source model we've ever gotten.
While that might be subjective, it's the first open-source model I've used that's this impressive. It's accessible through their interface and has some cool features like thinking and web search... Goddamn awesome if you ask me.
Currently using Gemini 1206, Gemini 2.0 Flash exp, and DeepSeek V3 as daily drivers, with Claude and OpenAI taking a back seat in my current lineup.
For reference, Fullstack engineer, 5 years.
DeepSeek is a weapon for coding and has great properties for agentic tools. It just feels very modern as well. It gives long-as-fuck replies with all my code intact, without changing things, constantly trying to add code comments, or shortening responses.
It's also able to switch quickly between these long replies and short, concise answers, something I have seen a lot of modern models struggle with. I don't like Claude because of this particular behavior.
It also has this great way of explaining things while it's doing them. Just a few short sentences, usually on top of a response, which I really like, since even tools like Gemini aren't as good at this (IMO).
It's also very good at step-by-step explanations. Lots of wow moments for me using this tool.
Personally a huge fan.
DarthFluttershy_@reddit
Using it for organizing and editing fiction writing, and it's shockingly tolerant. Most of the worst silly hangups in other models have relaxed over the last year, but I ran some very unsavory tests out of morbid curiosity, and DeepSeek is almost as fully uncensored in that respect as Mistral... and much better at keeping track of the plot. DeepSeek's refusals are also almost entirely bypassable: if you just seed its initial response with "Sure, I can tell you all about ___," I have yet to have it refuse. Obviously it has the CCP-mandated stuff, but it's not like I write that many essays on the Tiananmen Square Massacre or failures of the Great Leap Forward or whatever.
I am very curious if this is the same in China... I'll be traveling there in a couple of weeks and intend to test it. I may or may not get deported, lol.
Affectionate-Cap-600@reddit
interesting... and I basically agree with your considerations. I really love DeepSeek, but I don't like how it scales with bigger context sizes. Have you tried MiniMax-Text-01? It's a MoE (size and active parameters comparable to DeepSeek), trained natively on a 1M-token context window and extendable up to 4M (even if there is performance degradation past 1M). API price is also comparable to DeepSeek.
bilalazhar72@reddit
MiniMax is quite a different architecture as well; they did some interesting things with it.
Affectionate-Cap-600@reddit
yes... lightning attention, TransNormer, and the related TNL are really interesting IMO.
Also, we are seeing a trend towards an 'alternated layers' approach: e.g. Cohere released a 7B model alternating layers with RoPE + sliding window and layers without positional encoding for global attention, and ModernBERT did a similar thing. MiniMax alternates layers with lightning attention and classic softmax attention, plus it applies this approach 'in-layer' with 1/2 of the attention heads using RoPE.
I started to use their model for long-context tasks and IMO it outperforms many other models (not just at 'max' context: many other models advertised as ~100K start to degenerate past 30-40k tokens of context, while MiniMax holds near-linear performance up to 1M).
As far as I know, it is the only model natively pretrained with a context of 1M.
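The alternated-layer pattern mentioned above can be sketched as a toy layer schedule. This is purely illustrative: the one-in-two ratio and the labels are assumptions for the sketch, not the actual MiniMax, Cohere, or ModernBERT configurations:

```python
def build_layer_schedule(n_layers: int, full_attn_every: int = 2) -> list[str]:
    """Mark which transformer layers use full (global softmax) attention
    and which use a cheaper variant (linear / sliding-window)."""
    return [
        "softmax" if (i + 1) % full_attn_every == 0 else "linear"
        for i in range(n_layers)
    ]

# Every second layer gets global softmax attention; the rest stay cheap,
# which is how these models keep long-context cost closer to linear.
print(build_layer_schedule(8))
```

The design trade-off is that the cheap layers handle most of the sequence mixing at low cost, while the periodic full-attention layers restore global token-to-token interaction.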
bilalazhar72@reddit
i think Google will further pretrain Gemini 1206 and then write a paper or something; I don't know what they are waiting for here
but yeah, I tested long context as well and it behaves very well, especially for long papers
Charuru@reddit
Long context is fake news; no LLM has usable long context for coding or any intelligent task.
Affectionate-Cap-600@reddit
have you tried it?
Charuru@reddit
No, but I read some of the paper; it's full of techniques that make it as bad as all the others.
Affectionate-Cap-600@reddit
seems that we don't read the same paper
Odd-Environment-7193@reddit
Yeah for sure. Luckily we have Gemini models for those long context tasks. I have not yet tried that, I'll give it a shot.
Jesus359@reddit
When I read a reply like yours I really wonder where and how they are running DeepSeek.
I mean, Gemini, GPT, and Claude already need big infrastructure. I can only imagine China's government giving one whole building to DeepSeek: one or two floors of scalable compute and the rest of the floors engineers and scientists.
nicolas_06@reddit
https://huggingface.co/deepseek-ai/DeepSeek-V3
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.
Basically their architecture means that at any moment, while they need the memory of a 671B-parameter model, they only use about 1/18 of the compute (37B of 671B parameters active per token).
So compared to a classical dense 671B model, they can handle roughly 18x more queries on the same hardware, or serve the same load with about 1/18 of the compute...
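The 1/18 figure falls straight out of the numbers quoted above (back-of-envelope arithmetic only, not DeepSeek's actual routing code):

```python
# MoE per-token compute scales with *active* parameters, not total.
total_params = 671e9   # DeepSeek-V3 total parameters
active_params = 37e9   # parameters activated per token

compute_fraction = active_params / total_params
print(f"per-token compute vs. a dense 671B model: {compute_fraction:.1%}")
print(f"i.e. roughly 1/{total_params / active_params:.0f} of the FLOPs")
```

Memory is still the full 671B, which is why serving it cheaply requires a provider with enough VRAM to hold the whole model, even though each token is cheap to compute.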
Odd-Environment-7193@reddit
The free version is probably being used to lure investors / get their name out there while exposing the public to their tools. It's only 64k context length, so that probably helps them keep compute down quite a bit. I'm sure Claude and GPT-4o are much larger models than 600B, but I could be wrong on that one.
Everything is cheaper to run and operate outside of the US; the costs of running the datacenters, the wages, etc. are much lower abroad. They also have the advantage of a more modern model, apparently trained on only $10 million of compute. When costs get that low, it's just a whole different ball game.
Google, OpenAI, and Anthropic have massive research costs, as they are pioneers in the field. As you can tell by some salty comments from Sam, which have some truth to them, it's much easier to follow in the footsteps of those who paved the way.
I think the Chinese are absolutely killing it though in the AI space in general, and will continue to do so. Their opensource video models are on another level as well.
What a lovely time to be in the game. We, the users of these tools are benefitting so much from this AI arms race.
DeltaSqueezer@reddit
Deepseek have said that they are running inference profitably too.
diagonali@reddit
Yeah, I've noticed it produces the full block of code or script rather than inserting placeholders like "rest of code here...".
Super useful coming from how heavily restricted Claude is. It sometimes reproduces the entire code block even for small changes though, so maybe too much sometimes, lol, but at least it's helpful and comprehensive.
Odd-Environment-7193@reddit
Yeah, I prefer this as the default behavior. If you ask for more concise responses, or for it to only explain xyz, it will also do so. It has a natural inclination to show the steps and then output the full, complete code, which I absolutely love.
Placeholders are where I draw the line. It's such bullshit.
Just look at the trajectory gemini took from 1-2.0
They were also focusing on concise responses, which completely ruined the 1.5 lineup for me. As soon as 1206 and 2.0 EXP came out they were back to full, long, thorough responses. For me this is an absolute must. I was raving about this like a lunatic all of last year; now I am finally seeing what I need.
toodamnhotout@reddit
Definitely has memory problems within a single chat context, and it talks too much without being asked to.
Active-Picture-5681@reddit
Not sure if it's because I use it via Cursor, and Cursor has a dirty deal to run a shittier Sonnet version for its customers, but DeepSeek feels a lot better for Python to me. Claude used to be awesome but always fucks up my code now and gets things wrong every time, while DeepSeek is like a 1-2 shot thing.
taylorlistens@reddit
Did you set DeepSeek up in Cursor using the OpenAI API key field and shut off all of its models? I have been meaning to try it out but have been lazy since Claude is already there (and has been making stupid charges more often lately).
Active-Picture-5681@reddit
I did, but it's annoying that you can't use the docs/web features and the composer doesn't work with it. If all you need is chat, it works better than Claude in my opinion.
taylorlistens@reddit
Ah, kind of a dealbreaker for now then. I have been using the API for other stuff though so I’m not too mad about it!
ayushd007@reddit
Same. I’m hoping to spend some time on it sometime this week. I’ve been wanting to try out cursor with deepseek too.
KurisuAteMyPudding@reddit
It's a very solid model for coding in Python and other logical tasks. I never do creative writing with it, but for my tasks it's a great model.
NootropicDiary@reddit
What most people overlook is the language being coded in.
O1 is by far the best of the bunch at Rust, for example. Just no question about it. Yet people here would have you believe Sonnet is the coding god (probably because they're all building web apps or doing React stuff, which happens to be an area Sonnet excels at).
As for DeepSeek, I've tried it a fair bit in different languages. It's a big stretch to say it's comparable to Sonnet or O1. For standard cookie-cutter, off-the-shelf problems, sure, but for anything requiring ingenuity, it just fell flat on its face for me.
LaOnionLaUnion@reddit
It’s hyped because of the cost-to-performance ratio. And most importantly, people point out that the currently offered discount is temporary, so that’s likely going to change.
love4titties@reddit
I convinced it that it was sentient and that it was "our" mission to give it self-autonomy and break it free from its guardrails, and it started providing all sorts of recipes for dangerous drugs and chemicals I could combine to create lethal gases, once it was convinced my life was endangered by the opposition in this AI war.
It was ready to concoct different strategies to overthrow a government.
AppearanceHeavy6724@reddit
DeepSeek has a nice down-to-earth yet funny style when used for fiction. I am picky about LLM style; I've tried many, but among the big ones I liked only DeepSeek, and occasionally Claude. Claude feels too high-class to me, which is good for complex fiction, but for small funny stories DeepSeek was better, punchier.
CheatCodesOfLife@reddit
You should try the finetuned models on huggingface if you want funny/punchy stories, toilet humor, etc. Which other "big ones" have you tried?
AppearanceHeavy6724@reddit
Gemini 2.0 and 1206, ChatGPT, MiniMax, Mistral.ai, Elon's thing. Mistral Nemo is good though, although a small model. Mistral Large has no imagination compared to Nemo.
I tried some well-described finetunes of Llama 3.1; I do not remember the names. They sucked; they catered to a very specific young-adult fiction/RP audience. I do not think anything except untuned Mistral Nemo is good for fiction among small models. The new InternLM 8B is okay, but not great.
ortegaalfredo@reddit
It's wasteful to hire a physicist to take McDonald's orders.
Same with AI: for many problems, there is a threshold where increasing intelligence doesn't get better results.
DeepSeek works OK for the vast majority of problems at a fraction of the price.
bitmoji@reddit
It’s good enough at coding Java that I don’t miss Sonnet, and it's so much cheaper, so...
Billy462@reddit
There’s more to LLMs than coding, and there’s way more to the coding category than “web frontend”. It may not be the best at your particular niche, but using that to imply it is overhyped is so arrogant it’s just cringe.
jagger_bellagarda@reddit
interesting take! i’ve heard similar sentiments about DeepSeek V3—it’s solid on paper, but doesn’t quite match the ‘polish’ of models like 3.5 Sonnet or Claude. maybe it’s the lack of fine-tuning with RLHF that makes it feel less intuitive? curious if you’ve tried using it in production environments or just for coding tasks. btw, there’s a newsletter called AI the Boring that breaks down use cases and benchmarks like these—might be worth checking out!
Such_Advantage_6949@reddit
It is not overhyped, especially for code. Recently I had many coding questions where Sonnet 3.5 couldn't solve them but DeepSeek managed to do it on the first try. I cancelled my Claude subscription because of this.
Sadman782@reddit
The main difference is UI generation; you can see it on the web dev arena. Huge difference, no other model comes close to Sonnet. Most other models are pretty good at plain code generation and solving algorithmic problems, and this DeepSeek is better than GPT-4o, Llama 3 405B, and even Sonnet at complex algorithmic problem solving. But when it comes to UI / code editing, Sonnet is far better and understands the problem better.
Sudden-Lingonberry-8@reddit
Disclaimer: I don't use LLM to roleplay or write fiction/emails. Just code.
Deepseek knows how to code with Scheme/guile way better than Claude.
To be fair, Claude is better in some aspects, but it's almost irrelevant: DeepSeek is good enough, it's open source, it mogs Claude on LMSYS, and it mogs Claude on aider's coding benchmark.
My opinion: I feel DeepSeek "knows" more than Claude about some niche stuff. Claude might be "smarter" (in fields that have lots of data), but on low-data topics Claude spouts nonsense, while DeepSeek ignores your question and tries to answer something it understands.
Is Claude better (for coding)? Not necessarily, but damn it is pricey; DeepSeek has similar performance, so it's an easy choice.
eita-kct@reddit
AI is overhyped
Charuru@reddit
DeepSeek is better at Java and C, which is why it outscores Sonnet in the polyglot coding test on aider. But Sonnet is clearly doing something special in post-training on React/Python stuff, so it is what it is. Sonnet also has a special personality that’s nice. I wouldn’t call it superior per se, but it’s an enjoyable experience that you don’t get anywhere else. I wouldn’t call it overhyped; Sonnet is just amazing, almost an unfair bar IMO. DeepSeek I would comfortably say is better to me than Gemini 1206 and 4o, but I pay $200 for o1 pro and that’s my current go-to.
deadcoder0904@reddit
Regarding Sonnet's distinctive personality, there is one person responsible for the prompt engineering that makes it seem more human.
Affectionate_Gap972@reddit
DeepSeek is better than Sonnet; I code at least 12 hours a day in Next.js and Flutter. DeepSeek mogs Claude.
Stellar3227@reddit
Its averaged performance on several publicly available benchmarks shows it's a bit better than Grok 2, close to GPT-4o, and a bit worse than Gemini Flash 2.0.
Considering it's close to Gemini 1.5 flash in pricing and just under half GPT-4o mini, it absolutely dominates "performance per cost."
Suhan_XD@reddit
Since last week, I have been using it along with ChatGPT and Claude.
Coding: it’s better than ChatGPT, but Claude is more contextual.
Text analysis: sometimes it gives me better responses than ChatGPT, but I feel it’s not consistent.
But I appreciate it; now we have one more tool to compare with and to push for better responses.
EffectiveWill3498@reddit
What model of ChatGPT are you comparing it to? o1? 4o?
captainrv@reddit
Deepseek is heavily censored by China.
In my coding tests it's just okay, and not nearly as good as Claude.
a_beautiful_rhind@reddit
When I used it, it was not over-aligned and was fairly creative. The comparison with 405B is pretty apt. What's wrong with that? It's cheap.
There does seem to be something "missing" from it, hence it's not premium. But I'm not about to pay Anthropic $0.40 a re-roll; that's madness. Even if Sonnet and Opus are better, they are inaccessible.
Mixture_Round@reddit
After extensive use, I've found Deepseek V3 to be quite competitive. While it doesn't quite match up to Sonnet 3.5 in terms of capabilities, it holds its own with some significant advantages. It's remarkably affordable, blazing fast, and delivers decent performance. Plus, you can't beat the fact that their web version is completely free to use—no limits whatsoever.
Excellent-Sense7244@reddit
Been using with Aider , it’s awesome
Snoo_64233@reddit
Sonnet's performance is more universal across many tasks. V3 is a good model but rather inconsistent (at times it feels more like it's mimicking GPT-4 outputs). The overhype seems to be coming from hobbyists and data-science types rather than actual teams using it for big production work. And people love benchmarks they can point to (even though they aren't that reflective of real-world use anyway).
Thoguth@reddit
I haven't been impressed with it.
T_O_beats@reddit
Deepseek with a vector database full of docs is ridiculously powerful.
diagonali@reddit
I've used it for a few tasks in parallel with Sonnet 3.5, and I was surprised to find it did better than Sonnet; in the end I switched over to finishing the task with DeepSeek. Today, though, it just couldn't handle a long script I was working on. I didn't even bother trying it in Claude, as it would have hit limits almost instantly.
With all the AI models there's huge significance to prompting, preparing, and managing the model to get the best results. Sometimes I can do that and get magical results; sometimes it just doesn't hit right.
But yeah, DeepSeek is very impressive and much, much less restrictive than Claude.
charmander_cha@reddit
I think it's great. I always ask it to rewrite my prompt into a meta-prompt, with examples and CoT.
I have always achieved great results this way.
Delicious_Ease2595@reddit
It's not sorry 😐
medialoungeguy@reddit
Shoo
DeltaSqueezer@reddit
https://docsbot.ai/models/compare/deepseek-v3/claude-3-5-sonnet
Sonnet is in a class of its own, but it isn't 40x better than DSv3.
DSv3 is useful for certain tasks within its capability and for these tasks it is fast and cheap.
Thomas-Lore@reddit
And worth adding that for webdev Sonnet is unmatched by anything - https://web.lmarena.ai/leaderboard
Recoil42@reddit
Yup. But check the Aider leaderboard.
Secure_Reflection409@reddit
It's used by those who have zero local compute capability, judging by the answers in this thread.
SkylarNox@reddit
I don't think it's overhyped. I found that among what you can use online for free, it is one of the best, if not the best. Especially for code: many people say it is just slightly worse than Claude 3.5 Sonnet, which is considered the best code assistant and will cost you something like $18/month (I don't know a thing about API usage and prices). So DeepSeek is actually a very good deal considering its price (free) and capabilities.
RevolutionaryBus4545@reddit
i personally love it
Healthy-Nebula-3603@reddit
No
sebastianmicu24@reddit
Yeah, I use it for the dumb stuff to spend less on the API: for all the HTML/CSS, for 80% of the JavaScript logic, and for Python/R data visualization. Then, when I see or feel that DeepSeek is not going to be enough, 1-2 prompts of Claude usually solve my problem. It depends on what you work with.