Dario's (stupid) take on open source

[-]

ArtisticHamster@reddit

I don't think it's that a stupid take. He basically says that models aren't open source in the sense software is open source. Which I believe is true. You could argue, that the most important part of the model is the training set, and the training techniques used to train them, which are often not described in detail, and usually not provided as code. As a result, you can't get the same benefits of diverse contributors as you do in the software open source.

Reply

[-]

int19h@reddit

There's no direct equivalent to software here. With software, free-but-closed-source means that you can use it but you can't change it (beyond intentional extensibility points), while open source means that you can use, read and validate (that source matches binaries, by building it), and change. With models, open weights ones can be fine-tuned, but without training set you don't know how it was made and what its knowledge base really is, so it's kinda in the middle between the two. The closest would be something like non-open-source app written in a language like Python.

Reply

[-]

chinese__investor@reddit

"because of the expon" the guy is incoherent, obviously on coke and retarded. Open models are open. Can be used by anyone and obviates the role of anthropic. Obviously many many people are contributing in many ways with open source models.

Reply

[-]

Decaf_GT@reddit

> Obviously many many people are contributing in many ways with open source models. Oh? Do tell. What are these contributions?

Reply

[-]

ArtisticHamster@reddit

>Open models are open. Can be used by anyone and obviates the role of anthropic. Who would train them to update to the current information? Do you have volunteers who would be happy to chip in with a couple of millions of $s to help with training runs? (I am pretty sure there're plenty of people who would contribute their coding/ML skills though) >Obviously many many people are contributing in many ways with open source models. For example?

Reply

[-]

ttkciar@reddit

> Who would train them to update to the current information? You've got me wondering what the limitations are of RAG, in this regard. It seems likely that there *are* limitations, and you couldn't rely on a 2023-cutoff model forever, but what would the limit look like? After work I'm going to try building a small "future-current" RAG database about a hypothetical 2030 social/political environment and see how Gemma3 fares answering questions about that setting.

Reply

[-]

ninecats4@reddit

As time goes on, distributed clusters are making open source weights and models bigger and bigger.

Reply

[-]

mapppo@reddit

Yeah but what he ignores is your personal data is what the consumer cares about where it wasn't as big of an issue with software (especially as these scale into full-time observers of our lives) -- having a US based closed source company, now with the NYT lawsuit forcing data to be kept, censorship laws already being put in place, and just the general level of fascism going on there -- anthropic can't compete on that front. I dont personally care that they rented a gpu, i can actually do that myself and not sell my data directly to palantir with it. And the models are better.

Reply

[-]

Pvt_Twinkietoes@reddit

With the number of daily users on chatgpt, clearly this isn't a problem for lots of users.

Reply

[-]

Pvt_Twinkietoes@reddit

Yeah I do agree with you. And what I get in this discussion is that he is talking about competition and they're not directly competing with open weight models and they're targeting a different market.

Reply

[-]

chinese__investor@reddit

He didn't say that at all

Reply

[-]

Pvt_Twinkietoes@reddit

"you know I've I've actually always seen it as a red herring when I see it when I see a new model come out I don't care 00:39:17.839 whether it's open source or not like if we talk about deepeeek I don't think it mattered that Deep Seek is open source. 00:39:23.359 I think I ask is it a good model? Is it better than us at at you know the things that that's the only thing that I care 00:39:30.320 about it. It actually it actually doesn't doesn't matter either way. Um because ultimately you have to you have 00:39:36.000 to host it on the cloud. The people who host it on the cloud do inference. These are big models. They're hard to do 00:39:41.280 inference on. And conversely, many of the things that you can do when you see the weights um uh uh you know, we're 00:39:49.200 increasingly offering on clouds where you can fine-tune the model."" I get that he isn't exactly saying that. But he don't see Open weights as a threat. 00:39:23.359 I think I ask is it a good model? Is it better than us at at you know the things that that's the only thing that I care The only thing that matters to him is whether they're better what they're doing.

Reply

[-]

chinese__investor@reddit

Once again you are claiming things he never said. Obviously he sees deepseek as a threat and that is also what he said.

Reply

[-]

Pvt_Twinkietoes@reddit

? And which part of the audio did he say that?

Reply

[-]

GortKlaatu_@reddit

I can easily make a private fine-tune without my data leaving my datacenter. The other aspect to consider is the vendor lock in. If you design a product around an open weight model, then it'll typically be more flexible when plugging in larger foundation models and being able to switch between providers. If you create a product around Anthropic and they suddenly close off access (like they did temporarily for Windsurf) then where would your company be then? Yes, you could find alternative routes for the same models, but still... Such moves should leave a sour taste in your mouth.

Reply

[-]

ArtisticHamster@reddit

>I can easily make a private fine-tune without my data leaving my datacenter. Yes, you could do it. But what if you need to update the foundational model to include the most recent facts? I believe middle sized companies, and small business won't be able to do it. >The other aspect to consider is the vendor lock in. If you design a product around an open weight model, then it'll typically be more flexible when plugging in larger foundation models and being able to switch between providers. There's an almost de facto standard interface to access any LLM, i.e. OpenAI like REST API. How could it be easier?

Reply

[-]

GortKlaatu_@reddit

I don't need generic facts though. I need business specific details which Anthropic doesn't have. I could also give it access to the internet for news and search results. Similarly, I can wait for another open weight release. As far as the API, it's not just the API. Each model has preferences of where instructions should be , where data should be, how explicit your prompt has to be etc. If you've tried the same prompt across multiple models, you've no doubt discovered very different results. When you read through the prompting guide you'll also discover that changing the prompt for the specific model will suddenly improve performance. If you solely rely on Anthropic-isms then you'll find worse performance on other models when you try to reuse the same prompts leading you to never want to switch.

Reply

[-]

ArtisticHamster@reddit

May be somebody create a better model which could update its information, but for now we have what we have (as far as I know, may be somebody have already solved this problem).

Reply

[-]

eloquentemu@reddit

Yes. People have forgotten that "open source" isn't the same as "free software". Classically, the GPL allows you to sell software, you just need to provide the source code to the customers. Open source was about hacking and ensuring software was usable even after support for it was gone. IMO, model weights are basically the compiled code, with the compiler being the training code and the source being the dataset. If I don't have access to the training code and dataset, then I can't reasonably modify the model and it's not open source. It's still free software, though, and that's cool.

Reply

[-]

Specialist-Rise1622@reddit

He's such an idiot

Reply

[-]

outdoorsgeek@reddit

I don't think the question was really answered because Dario spent most of the time basically explaining why he doesn't find it an interesting question. I disagree. My take is that foundation model company value comes down to 5 things right now: 1. Model architecture 2. Data collection 3. Training capability 4. Inference capability 5. Context (e.g. what can the model know about a user and the world at inference time). 1 is definitely sensitive to open source currently. The more state of the art architecture exists in open source, the less advantage any one company has. 2 is sensitive to open weights. The better the open weight models are, the easier it is to collect training data from the open weight models themselves. 5 is arguably already largely an open source-driven thing via MCP. That leaves 3 and 4. These are hardware problems currently, but we already have a rich history of hardware problems getting developed away into software problems. I think it's naive to think that the pathway that brought us from mainframes to personal computers isn't at least worth considering here--especially given the economic incentives. If these problems become approachable by software (e.g. distributed training, hyper efficient NPUs), enter open source again.

Reply

[-]

Previous_Fortune9600@reddit

I literally tried to listen to him for 30 secs and I came to the conclusion that he had said nothing so I stopped listening. Then I researched his background and found out that this guys has barely written any code in his life. Which is great for me to know as that helps me put a healthy discount factor to what he is always saying

Reply

[-]

SnooPaintings8639@reddit

Did he actually answer the question? I'd need an LLM to summarize and translate from CEO to English.

Reply

[-]

AndyHenr@reddit

He is of course afraid that his bs will be seen through: he is trying to talk about a technical moat as he want to raise more money. 'Red Herring' and other terms. So he means the bs he spews when he said developers will be pase by end of the year and other idiotics. He also talks very strangely, it is clear he tries to come up with some bs and lack the mental faculties for it.

Reply

[-]

Robonglious@reddit

Hopefully some discovery of methods can make training open source models more reasonable. The dude is not wrong. If I had the anthropic source code I couldn't afford to train it.

Reply

[-]

ArtisticHamster@reddit

>Hopefully some discovery of methods can make training open source models more reasonable. Even if that's true, what will we do with the datasets? My understanding there're armies of knowledge workers providing them. Could we replicate it with OSS approach?

Reply

[-]

RhubarbSimilar1683@reddit

That's what the people at outlier ai do. They are those knowledge workers.

Reply

[-]

ArtisticHamster@reddit

There're plenty of such companies. This is pretty expensive work, and it won't be easy to redo it in an OSS fashion.

Reply

[-]

Robonglious@reddit

Well, if we're open-minded enough we could speculate that training methods in the future could be much more efficient than what we're doing today. As an example check this one out: https://doi.org/10.1038/s41467-025-61475-w I don't think it's some magic solution but I believe there is some magic solution that we'll eventually find. Then the big question is, will that be open source? A lot depends on that answer.

Reply

[-]

ArtisticHamster@reddit

I very much hope it will be feasible to train a foundational LLM as a hobby at some point.

Reply

[-]

HauntingAd8395@reddit

I think the problem lies on: \- It’s hard to mobilise the mass’ capital to train a massively big open source models. \- Ideological divides between people, like, what political beliefs should our model has. \- Local LM is at most a hobby for most people. People probably will just create a very strong AGI model at the moment they see proof of AGI/ASI exist. Like a foundation would magically appear to provide exchange data for equity and centralize compute when time comes. It is just not now.

Reply

[-]

bilalazhar72@reddit

man this mf dario is absolutely losing his fuckinggg mind holy shit entire company premise \> custom instructions to model about a fake scenerio \> model trained on entire fucking internet \> model (more retarded then him) : i will blow this place up \> we need to slow down and have safety please stop all chips to china and open source too and then release sort of SOTA model you pay 20$ and you get 4 opus thinking queries before dario has to suck amazon dick again some physics people should never go out into real world and touch grass when they do this happens

Reply

[-]

Fun-Wolf-2007@reddit

Open source models push back is due to the big tech companies that want to keep models close source as APIs and subscriptions are their biggest cash stream

Reply

[-]

notdba@reddit

I would say local inference with open weight is especially important for coding agent, which does very few actual PP and TG compared to repeated cache read. This is what I got from a Claude Code session using Anthropic API: `claude-sonnet: 18.4k input, 100.5k output, 32.8m cache read, 1.1m cache write, 2 web search` Based on Anthropic API pricing, the cost distribution is: * input: $0.05 * output: $1.51 * cache read: $9.84 * cache write: $4.13 90% of the cost goes to cache read and cache write. And that's free for local inference. Just need enough VRAM to fit the context for a single user.

Reply

[-]

GortKlaatu_@reddit

He's poo pooing the implications of open weight models publicly and trying to create barriers for open weight models behind the scenes. Don't believe his lies, he's scared of losing business. The second I can reliably plug in an open weight model, I do so. Why keep paying foundation model prices when you find something cheap/free that works for a particular workflow?

Reply

[-]

ArtisticHamster@reddit

For me the main benefit is the privacy and control. I prefer not to let sensitive information leave my computers, rather than send somewhere over the internet where it could be used as training data. This is especially bad when it concerns confidential commercial information. We have breaches of sensitive personal information very often. I hope we will have no breaches of LLM user data, but I believe it's destined to happen at some moment, and it will likely be ugly.

Reply

[-]

EquivalentPie8579@reddit

$

Reply

Dario's (stupid) take on open source

Reply to Post

37 Comments

ArtisticHamster@reddit

int19h@reddit

chinese__investor@reddit

Decaf_GT@reddit

ArtisticHamster@reddit

ttkciar@reddit

ninecats4@reddit

mapppo@reddit

Pvt_Twinkietoes@reddit

Pvt_Twinkietoes@reddit

chinese__investor@reddit

Pvt_Twinkietoes@reddit

chinese__investor@reddit

Pvt_Twinkietoes@reddit

GortKlaatu_@reddit

ArtisticHamster@reddit

GortKlaatu_@reddit

ArtisticHamster@reddit

eloquentemu@reddit

Specialist-Rise1622@reddit

outdoorsgeek@reddit

Previous_Fortune9600@reddit

SnooPaintings8639@reddit

AndyHenr@reddit

Robonglious@reddit

ArtisticHamster@reddit

RhubarbSimilar1683@reddit

ArtisticHamster@reddit

Robonglious@reddit

ArtisticHamster@reddit

HauntingAd8395@reddit

bilalazhar72@reddit

Fun-Wolf-2007@reddit

notdba@reddit

GortKlaatu_@reddit

ArtisticHamster@reddit

EquivalentPie8579@reddit