DeepSeek is pushing forward with a $10.29 billion financing round, with Liang Wenfeng committing to continue developing open-source AI models rather than pursuing short-term commercialization goals
Posted by External_Mood4719@reddit | LocalLLaMA | 89 comments
Imn1che@reddit
Doesn’t deepseek or their parent company do quant trading or something? They don’t need money lol
my_name_isnt_clever@reddit
Not needing money doesn't seem to stop anyone from greedily hunting for more of it these days.
PhysicalIncrease3@reddit
If they achieve their goal by continuing to provide better goods/services then I hope they do continue hunting.
my_name_isnt_clever@reddit
They don't, ever. They maximize shareholder value and tell the rest of us to go fuck ourselves via enshittification and political lobbying. What a weird stance to take, defending the people fucking us over.
PhysicalIncrease3@reddit
And yet here we are, with ever better models to work with.
my_name_isnt_clever@reddit
Yeah, just like where we were in the early days of social media, and cell phones, and computers, and western culture in general. Then after a bit of time passes, they dig in their claws to squeeze out every cent at the expense of all their good will and loyal customers. This happens over and over and over again, how are you not seeing the patterns? It will happen again until we collectively stop tolerating this bullshit. Glazing rich fucks isn't helping you.
PhysicalIncrease3@reddit
And yet...
Get better and better
Get better and better
Difficult to be objective here but clearly people worldwide quite enjoy both.
Social media is also clearly frequently subject to innovation. TikTok, Instagram, Snapchat, Facebook, MySpace. It's not a done deal that any one company or product rules the roost.
Genuinely don't understand what pattern it is you're seeing.
my_name_isnt_clever@reddit
You and I have very different ideas of what is good and what is bad. Having big tech controlled by a couple dozen absurdly rich white men is not a good thing. All you seem to care about is money, but there's more to life than capitalism.
PhysicalIncrease3@reddit
Those couple dozen are the ones that have been able to produce the goods and services we all enjoy. Should they not be rewarded for that?
Genuinely not trying to be pedantic, but do you remember intimately the leap forward Apple took us on with the iPhone? Just as one of many examples. Don't they deserve to earn big money providing such an obvious leap?
I also find your inclusion of "white" troubling. Would it really be so much better if a dozen black women had been the innovators?
my_name_isnt_clever@reddit
Yes, and Apple treats their customers like they at least give one single shit about them. I'm talking Microsoft, Oracle, Meta, OpenAI here. Those companies don't need more money, they need to be held accountable for their awful actions so other companies with any care for their customers can take their place. That's how the free market is supposed to work, clearly it's been broken for a long time now.
And yes, I would rather live in that world because marginalized people have a wider perspective than the privileged. And they would actually treat women as people.
PhysicalIncrease3@reddit
Can you explain what exactly it is which differentiates Apple from these companies in your opinion?
If their customers cared about your perceived injustices they would not buy the products.
There are a number of successful female founders too, some black. Your argument lacks coherence and I suspect it isn't based on any real logic.
Zulfiqaar@reddit
In fact, they might even have made more money by placing strategic trades before releasing their models. DeepSeek-R1 was a shock to the stock market
Negative_Attorney448@reddit
...the entire point of being a quant trader is to get more money.
zenmagnets@reddit
Really wish Alibaba would do the same and release Qwen3.7 397b
FullstackSensei@reddit
Chinese AI labs seem to understand what Western AI labs don't: these things have a very short shelf life, and local inference isn't going to make a dent in whatever revenue you're going to make from a model. In reality, everyone is better off sharing their research, because that advances the field much faster than trying to scout individual talent and hoping they can give you some sort of competitive advantage. And you can easily just restrict commercial deployment while maintaining a very permissive license.
I know being in this sub it sounds like everyone on earth is running a fleet of local LLM rigs, but the truth is, we're the 1%. It's not much different than going to the homelab sub and seeing everyone there has a 42U rack. The vast majority of people lack the know-how and the interest in running even a 9B model locally, even when they have the hardware.
OpenAI, Anthropic, Google, Mistral, etc. can all release their models for download tomorrow and neither their revenue nor whatever perceived competitive advantage they have will change. Even whatever architectural advantages they have today will have a 1 year shelf life at best. That's not much of a moat, and the Chinese AI labs seem to get it.
NNN_Throwaway2@reddit
Except for alibaba, apparently.
-dysangel-@reddit
I dunno, Qwen are slowly stopping open weight releases
t_krett@reddit
Qwen3.6-27B was released 30 days ago..
Kitchen-Year-8434@reddit
I’m as worried as the next person about the shake up at Qwen but everything they’ve stated and signposted (with recent discussions about 3.7) indicates the exact opposite of your claim.
So: citation needed.
-dysangel-@reddit
Actions speak much louder than words.
Due-Memory-6957@reddit
And their actions have been releasing models.
-dysangel-@reddit
Let's list the models they've released since 3.5. It definitely will not take very long.
Kitchen-Year-8434@reddit
uhhhhh, 3.6? Which was a demonstrable improvement on the MoE and Dense models? In a very short period of time?
Nyghtbynger@reddit
Words from officials speak louder than assumptions from the commoners
-dysangel-@reddit
lol. No. If you've ever had the displeasure of reading a Bungie press release you'd understand the insane gaslighting that some PR depts try to get away with.
my_name_isnt_clever@reddit
They mean developers on X or whatever, not official blog posts that everyone knows are worthless. The alternative is purely vibes as there is nothing else to go on.
Nyghtbynger@reddit
I agree. The manipulation doesn't occur at the same level.
a_beautiful_rhind@reddit
Ahh yes.. releasing 27b and under vs the bigger ones. Very reassuring.
thread-e-printing@reddit
Correct me if I'm wrong, but the facts I see are that they ran a poll, the results favored 2 street food over 5 dim sum with a view, and the training farm was sown accordingly.
I propose a more sensible heuristic from the James Bond fandom: "Once is happenstance. Twice is coincidence. Three times is enemy action." As far as Qwen is concerned, we are barely at twice. No llama drama needed.
a_beautiful_rhind@reddit
Third time is gonna be the charm. Will we get a 397b at 3.7 release?
thread-e-printing@reddit
I assume that any larger 3.6 is still in the oven. Wait a few more weeks, or for news of other labs pulling back, then worry.
whitefritillary@reddit
or even a 122b-a10b
FullstackSensei@reddit
Meh?
A little over a year ago nobody cared about Qwen, or DS, or Kimi. I'm willing to bet this time next year there'll be another crop of AI labs releasing great models.
The thing most people seem to fail to grasp is that these things are greatly commoditized and the field is still moving very fast, because it's still very much in its infancy. Nobody knows who's the next researcher and which lab they'll be at who'll figure out the next thing that pushes the field forward. The only thing we know is: a few months after that happens there'll be several other labs that will replicate this advancement.
LosEagle@reddit
Excuse me? Were QwQ-32B, Qwen2.5 not revolutionary for local llms? Qwen was basically the pioneer of 32B local models. Before that, the industry seemed to be pushing for 70B models that basically died out.
whitefritillary@reddit
Deepseek-R1 wasn’t just “pretty close to GPT-4 in many areas”, it was essentially as good as o1 in the majority of areas and in some of them even better.
Cuplike@reddit
People forget, before R1, no other model showed reasoning, there wasn't a single model that did it before. It's the only reason why Gemini or Claude or ChatGPT models expose reasoning to this day
FullstackSensei@reddit
QwQ was what really put Qwen on the map. That's exactly why I said a little over a year ago. QwQ was released in March 2025. You're excused.
-dysangel-@reddit
For me it was Qwen 2.5 Coder. I tried all the small models that could fit on my laptop at the time, and that was head and shoulders above the rest. I was looking forward to the Qwen 3 series more than anything else, until GLM 4.5 came on the scene a few days later and completely outshone it.
mycall@reddit
There are many labs doing open source models that never get talked about here, e.g. OLMo
FullstackSensei@reddit
Yep, and they mostly don't get talked about because they haven't yet released something that's good at agentic coding like 3.6 has been. I'm sure they'll catch up in a few months and get a lot more attention
VoiceApprehensive893@reddit
Someone with cheaper electricity can just serve your model at a cheaper price
Releasing your proprietary model architecture to the public is also not a good idea
FullstackSensei@reddit
It takes a single line of text to stop that without hindering anyone else, which the Chinese models have been doing recently. I literally wrote that in my comment
xienze@reddit
This line? Chinese firms don't give a fuck about that, LOL.
thread-e-printing@reddit
Only paid trolls use that talking point
draconic_tongue@reddit
I have no idea what you guys are talking about, but whatever it is, if it's about non west people not being able to sell the same things for pennies you're automatically wrong. There are entire markets built on this principle, you're not going to undo it with a reddit comment. Russia does it, China does it, SEA does it, South America does it. Not only is there a market in their own countries, people in the west will also pretty much always go for cheaper labor and access to otherwise premium things that they'd have to pay a fuckload for in their own home.
FullOf_Bad_Ideas@reddit
Imagine the world where DeepSeek wouldn't release V2 and V3 that have MLA. Kimi K2.6 and GLM 5.1 wouldn't exist and would have architecture similar to Mixtral 8x7B or MiniMax. OpenAI and Anthropic models would probably be worse too, as I bet they use all of that open source research intensely.
FullstackSensei@reddit
It still amazes me how narrow minded people can be. Our entire world is built on knowledge sharing.
Every scientific advancement is built on the shoulders of numerous researchers and scientists before.
None of the LLMs would have existed without the 50+ years of published ML research. Attention, the very mechanism that makes LLMs possible, was published 3 years before the "Attention Is All You Need" paper. The very authors of that seminal paper wouldn't have been able to make their breakthrough had that research not been published.
En-tro-py@reddit
The iconic duo: Scarcity mindset -> Hoarding
FullstackSensei@reddit
Except you can't hoard people nor ideas.
Amodei himself was at OpenAI, then left when he had an idea to found Anthropic.
En-tro-py@reddit
You sure?
From what I see you can just overhire the people and privatize the ideas after academia and open source gave you them as stepping stones.
FullstackSensei@reddit
Hired people can and will leave.
Re Amodei, call it however you want. He had a different view/idea of how things should be done and left. People will leave. Hence, you can't hoard people.
En-tro-py@reddit
Agree to disagree - Companies like Meta, Google, and Amazon hiring hundreds of AI grads that are free to leave is still hoarding them...
That was their whole strategy: even with no immediate need yourself, suck up the talent so that the competition can't use them.
xienze@reddit
I don't think that's quite accurate. Chinese firms could just host Opus for a fraction of the price. Anthropic would definitely feel that.
my_name_isnt_clever@reddit
I do think you're right, but we also don't know how much secret sauce is between their weights and their API endpoints. It could be that if someone had the raw weights, they wouldn't have the rest of the infra to match SOTA.
FullstackSensei@reddit
Why do so many people have trouble reading? This thing has long been addressed by so many AI labs
touristtam@reddit
I am a lurker with FOMO :p
a_beautiful_rhind@reddit
They're not worried about individuals. Companies are who run these models instead of API. Like small and medium sized ones. They actually hardly care about our individual API usage at all. We are a drop in the bucket there too unless it's 100% a consumer facing service. Even that is often a loss.
LegacyRemaster@reddit
Deepseek 4 is the most hallucinatory model ever seen. Let's hope it improves.
-dysangel-@reddit
Nice. For me, since GLM 5.1 we're now basically at "good enough" open source models in terms of intelligence for coding assistance. If we can just continue compressing that same intelligence level down into smaller/faster/more efficient packages then I'll be very happy.
Due-Memory-6957@reddit
My man, we always reach "Good enough" and then something else breaks through and we're left behind. Development never stops
IrisColt@reddit
Same... I would like to be happy to plant the flag at "good enough" and call it a day, but...
-dysangel-@reddit
I'm not saying development will or should stop, I'm saying I'd be pretty content even if it did. AI has now completely automated all of the things I didn't like doing on my computer, while so far kind of sucking at the things I like doing. Eventually it should be able to do all the things I like doing better than I do them, but hey, this is a nice zone to be in currently.
Equal_Giraffe8866@reddit
If everything stopped today and I had to rely on my M-Discs filled with checkpoints and ggufs to rebuild civilization... I'd fail miserably. But I'd give it a good goddamned shot.
seamonn@reddit
If only it had vision
ridablellama@reddit
depending on how you access glm this might be an option: https://docs.z.ai/devpack/mcp/vision-mcp-server
mivog49274@reddit
Oddly enough some days ago I would send screenshots to V4 Pro, but the feature of uploading file for Pro was restricted to Flash.
And yeah it perfectly read what was in the picture
dark-light92@reddit
It's coming. The DS4 tech report mentioned they are experimenting with other modalities.
seamonn@reddit
I was talking about GLM 5.1
mycall@reddit
Maybe not today but tomorrow
AykutSek@reddit
yeah and this funding makes that realistic. v3 paper had them at like $5.6m for the full pretraining run iirc, so $10b is roughly 1800x of that. they're not capital-constrained for years. expect more aggressive distillation going forward, everyone's been pushing on the small end lately anyway.
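For anyone who wants to check the ratio above, here is a quick back-of-the-envelope sketch. It assumes the commenter's recalled ~$5.6M figure for the V3 pretraining run and compares it against both the rounded $10B and the $10.29B from the post title. The function and variable names are made up for illustration.

```python
# Rough sanity check of "roughly 1800x", using figures quoted in this thread.
# Assumption: ~$5.6M is the commenter's recollection of the V3 pretraining cost.

def funding_multiple(funding_usd: float, run_cost_usd: float) -> float:
    """Return how many pretraining runs of the given cost the funding would cover."""
    return funding_usd / run_cost_usd


V3_PRETRAINING_RUN_USD = 5.6e6   # recalled figure from the comment, not verified here
ROUNDED_FUNDING_USD = 10e9       # the "$10b" used in the comment
HEADLINE_FUNDING_USD = 10.29e9   # the figure in the post title

rounded_multiple = funding_multiple(ROUNDED_FUNDING_USD, V3_PRETRAINING_RUN_USD)
headline_multiple = funding_multiple(HEADLINE_FUNDING_USD, V3_PRETRAINING_RUN_USD)

print(f"$10B    / $5.6M = {rounded_multiple:.0f}x")
print(f"$10.29B / $5.6M = {headline_multiple:.0f}x")
```

Either way the multiple lands close to 1800x, so the rough estimate holds up under these assumptions.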
Django_McFly@reddit
I hate that if you're a millionaire, the investment network is a global one and you can get in pre-IPO, post-IPO, whatever, whenever.
If you aren't a millionaire, every country is totally siloed off and even if you're in the country, most financial regulators are quicker to let you gamble your life savings away than let you invest in a pre-IPO company.
PhysicalIncrease3@reddit
Being worth a couple of million doesn't buy you into anything. But at the end of the day if you're worth 100m and happy to drop in 5, you matter.
Negative_Attorney448@reddit
IPO isn't the only exit option, but you're still right that the barriers for alternative investments (privately owned companies, hedge funds, etc) are silly.
SourceCodeplz@reddit
At this point China just wants OpenAI, Anthropic to fail by releasing open-weights models so strong, that the US companies just don't have any edge anymore.
Then imagine when OpenAI, Anthropic IPO to billions if not a trillion, and then that value just starts to vanish. Could be bad for the Western economy. These IPOs will make their way into index funds, pension funds, everything.
meth_priest@reddit
always been like that. china originally released deepseek open source to disrupt the market. It was a sneak peek into a small model that could do what AI businesses were selling like happy meals. meanwhile china had already integrated variants of Deepseek in their own military and health sector. Even USA used Chinese llm for their warfare - although quickly banned after realizing how far ahead they were.
Basically every big company in silicon valley was working overtime by the time deepseek hit to figure out how it worked.
U.S is 100% dependent on certain rare minerals to make their AI infrastructure work (70% broadly speaking). China is not only disrupting U.S market - but also making them more dependent on them
I wouldnt even call this "conspiracy" - they've been doing this for centuries (opium wars, fent, etc). i say well played
DUNDER_KILL@reddit
Source for the US military using Chinese LLMs? I don't think that's true
Nyghtbynger@reddit
opium wars was a british invention
fent precursors are indeed produced in china. The smuggling has people in both countries
nullmove@reddit
DeepSeek according to OP's article: We want to research AGI, we also want financial security without compromising on open-source
Redditors: China just wants to undermine OpenAI and Anthropic!!!
Incredible narcissism and projection. Apparently nobody else is allowed to have their own ambitions or ideals outside of being obsessed with or otherwise revolving around OpenAI/Anthropic et al.
AdvantageStatus4635@reddit
good for them, and us as well
Comprehensive-Chard8@reddit
cool
ECrispy@reddit
I just saw their reduced price is permanent! This is the best news I've read, wish them all the luck. Imagine what they could do with access to gpus and without the anti competitive sanctions
mulletman9160@reddit
Sweet deal!
WebOsmotic_official@reddit
i like the “open-source over short-term commercialization” part, but the real test is whether they keep releasing the useful weights, not just papers and smaller distilled stuff.
$10b makes the AGI talk less funny, but open model people are right to judge by releases, not declarations.
mycall@reddit
Where is your moat now
a_beautiful_rhind@reddit
Yet we still have no deepseek flash support in llama.cpp.
Zeta1Reticuli@reddit
I wish I could invest.
Gailenstorm@reddit
In their last report: "We are also working on incorporating multimodal capabilities to our models." Even with all the hardware "difficulties" they have with the sanctions, they still deliver.
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf
Intelligent-Form6624@reddit
heck yeah