Why are they releasing open source models for free?
Posted by wochiramen@reddit | LocalLLaMA | View on Reddit | 188 comments
We are getting several quite good AI models. It takes money to train them, yet they are being released for free.
Why? What’s the incentive to release a model for free?
Slaghton@reddit
I think part of it is most people don't have the machines to run llms or the knowledge to set them. Probably a tiny sliver of a fraction that use chatgpt have hosted their own llm so i guess its kinda like advertisement?
Green-Rule-1292@reddit
There could probably be a few different incentives but the strategy of "loss leader" is a very common one and it's also done by your local grocery store etc. https://en.wikipedia.org/wiki/Loss_leader
You know those food store print ads about unbelievably cheap potatos or whatever? The potato is not the product, it's the bait. They want you in the food store and for achieving that they are willing to pay a price.
For AI companies it's usually free marketing for the API side of their business as well.
tanzim31@reddit
Think about it this way: By releasing LLaMA for free, Meta gained significant goodwill, potentially even leveling the playing field in the competitive landscape
nonlinear_nyc@reddit
There a lot of openwashing out there. They’re not “releasing” it. They just have very forgiving licenses.
swaits@reddit
Watch the recent Joe Rogan with Zuckerberg. He explains it there. He believes in making sure that the most advanced AI capabilities are shared by everyone (ie governments, corporations) as this keeps things in check. He seems genuine in this and I’m buying it.
finah1995@reddit
Also plausible deniability of whatever is built with it, and having flimsy terms of use helps them
xchgreen@reddit
Value isn't always monetary and capital can exist in other forms. You have to understand that those giants are thinking decades/years ahead.
ilangge@reddit
Free release of open-source models is intended to counteract those charging for commercial models, or large corporations profiting from proprietary models. It also aims to prevent consumers from being held hostage by proprietary large models, ensuring consumer mobility and a diverse range of choices. This enables later entrants or new commercial companies that have not yet established a monopoly to have a chance.
FightingSideOfMe1@reddit
People do ablation studies for you
Pulselovve@reddit
For Meta, the situation is clear.
Meta understands that their business revolves around reselling content created by content makers.
Generative AI (GenAI) is a significant disruption to content production.
If GenAI algorithms dominate content production, the network effect value on their platform will diminish to zero.
The competitive edge will shift to those with the most advanced algorithms.
By integrating their own GenAI algorithms, Meta can control this new layer of the value chain on their platforms, without relying on third parties.
Additionally, this move allows them to preempt strong monetization opportunities for their competitors, ensuring they are not left behind by competitors with greater CAPEX capabilities.
Poromenos@reddit
This doesn't make sense. Meta is a distributor, why would they care about how the content is produced? And how does releasing the weights for free allow them to control the new layer of value more than if they kept their AI proprietary?
Pulselovve@reddit
Why did Disney care to build their own content distribution platform? Why did Nokia care about branding phones with carrier logos at the time?
Value chains are dynamic and change. The way profit pools are redistributed across different layers also changes.
Meta is valuable if they can attract all content producers through network effects (a key asset in the current value chain). If GenAI becomes the key asset, the value chain will also change.
Pawngeethree@reddit
Because there was a huge demand for it? Disney controls like 25% of media at this point, they’d be stupid not to monetize that directly.
Pulselovve@reddit
If you think hard about it. Disney did exactly what Meta did. They just integrated another step in value chain. For Meta AI is content creation, but they are a distributor.
Actually is more a first party tool for content creators, or with AI characters is first party content (like uncharted or tlou for playstation).
Poromenos@reddit
I'm not sure where the disagreement is then, because I agree with you that they're commoditizing their complements and making sure the value of content generation is zero, so they can shift that value into distribution. Maybe I misread your original comment.
synn89@reddit
Meta is more concerned about their AI "going away" if it was controlled by a third party, like OpenAI. So they need to create their own AI for their own use. But by open sourcing it, they get free work done on it by the community and the AI tooling being build gets built around Llama. So their AI becomes cheaper to create and manage than if it was closed. And since they're not selling it(that's not their business), making it cheaper is a win for them.
joninco@reddit
Greater capex capabilities than the same meta that blew 10s of billions on the metaverse? They are s-tier in terms of capex because Mark does what he wants.
Pulselovve@reddit
Meta doesn't want to invest everything in AI.
Pawngeethree@reddit
Few companies have more capex potential than Facebook, and none are direct competitors.
tgredditfc@reddit
Why do the same posts pop up every week?
Used_Conference5517@reddit
Why does every post, that shows up every week, get the same “why do the same posts pop up every week?” Comment?
Previous_Kale_4508@reddit
It's like the person who goes to church 'regularly', every Easter, and then complains that the priest only ever talks about Jesus being risen from the dead. 🤣🤣🤣
ResidentPositive4122@reddit
This has been the case ever since we've had BBSs, forums and so on. People discover a field and want to discuss certain things in waves. You got here earlier and have seen the same discussion. Some haven't and it's their first time. It will happen again.
Dark_Fire_12@reddit
Eternal September for those who were here for the previous September.
MoffKalast@reddit
They call it 'Eternal September'
PraetorianSausage@reddit
You're not obligated to read or respond to them.
KingsmanVince@reddit
Probably karam farming
throwAway9a8b7c111@reddit
If you get people to build with your models, and you charge then at the point of scaling (e.g. AWS bedrock) then you have people build tech with your stuff, and are making $$ whenever they actually need to deploy to meet any sort of real business demand.
If you get people like Grok/Cerebras.ai etc. building solutions that make inference/training etc. vastly cheaper and they do so highly optimized to your model, architecture, then you are saving a ton of money, while increasing the ecosystem of people whom build using your model, and creating a potential ecosystem of providers and customers.
Brand awareness. People aren't necessarily buying AI solutions in-droves quite yet, however the major players in the ecosystem are shaping up. There's a risk of missing out on a major business opportunity should you not get awareness of your "brand" in this space now.
Geopolitics. Chinese vs US vs Europe geopolitical issues and great power competition is a driver in this space, like it or not.
Maturity. Many of these things aren't "product-ready". As much as GPT3.5 for example was mind-blowing in it's abilities and capabilities on release. People struggled (and are still struggling) finding production use cases for the technology. This is doubly-so for models that don't have a considerable layer of "product" added on top of them. One of the "secrets" of a lot of models in the AI/ML space especially where linguistics is involved is that a huge benefit is gained not just from a few points one way or another in F1, but rather from how good the scaffolding around a model is in how it deals with outliers to input data, text processing, cleaning, managing history, routing, presentation etc. In most opensource models, none of this scaffolding is available. What this implies is that the companies who are putting them out don't really see a revenue opportunity (yet - hence the maturity aspect) in fully productizing these models, and as such they get pushed to the community in various licenses.
Investment. I've worked at two different companies doing the exact same thing, putting out the exact same product, in two different eras. In one era, we were starved, no one cared, there was no funding, no one believed anyone other than the government via contracting had any use for it or the products being created around it. in the other - with a vastly inferior product btw, money rained from trees like mana from heaven. The difference between the two was that in one temporal period the industry was dead, and in another the industry was hot. AI is in that space now, and as such getting your name into the space, putting out product (even if it's free), is leading to funding, and stock price gains etc. even for the biggest companies.
Used_Conference5517@reddit
I’m not all that happy that Qwen models are my favorite. I really don’t want to know what a company from a communist country did to get their data.
CrypticZombies@reddit
U gotta know if they advanced or not
Cover-Lanky@reddit
Consultation fees are no joke
Ok-Ship-1443@reddit
OpenAI “reasoning”
I have been thinking about the process of training and all and how some models take more time than others.
What if OpenAI has an immense vector db constantly being updated based on people search trends ?
Test time compute is really just rag/semantic search in multiple steps (the more results returned, the longer it takes to answer).
When I test it with code, theres a lot of time where dependencies are up to date…
The idea of having AGI feels like its bs because LLMs are just pattern recognition of next tokens. LLMs feel like they are not original at all.
LifeAfterRaid@reddit
It makes the public more reliant on LLMs it also makes it so if were all using them making laws to outlaw them is harder. All and all its a way for big tech to calude and pretend there not talking when they all use the same LLM to *Adjust there pricing* we didn't collude ai told us to do it but all and all they just fed all there data in and pretended they didnt collude when they all feed there data in then ai sets the price they can all use and stand by and they can claim innocence allot of other uses like this as well this is just one of many.
How ever if there all feeding there company data into the sae LLM's then the llms have proprietary data these companys normally wouldnt beable to share with one another anyway crinckle my hat is at it again but for real this is why sorry for my shitty spelling
FPham@reddit
Why is reddit for "free" ? Out of the goodness of their hearts?
Free means you are the product.
DamionDreggs@reddit
How does running llama on my own hardware make me the product?
vincentxuan@reddit
Did you know that our Chinese companies often sell goods at a loss? Like EVs, they are subsidized by the government. And their strategy is usually to squeeze out other companies at a loss to take over the market. At the same time, they often invest less in after-sales and sell user data.
On the LLM market, there are at least government subsidies, the sale of user data, and loss-making to squeeze out rivals.
ipanchev@reddit
Chinese people (the taxpayers) are being robbed bcs of the imperial ambitions of chairman Xi & Co.
NeoKabuto@reddit
At least in this case they still get the model to play with. They'll have them be open source as long as they're behind.
YearnMar10@reddit
That’s the same strategy eg amazon and tesla had. But also look up survivorship bias. In essence some big ones survive with such a strategy, thereby serving as an ideal to strive for, whereas you’ll never hear of those 1000s of other companies that fail with such a strategy. Basicsally, go big or go home.
vincentxuan@reddit
Foreign companies like meta, Mistral, I'm not sure what the reason is.
ResidentPositive4122@reddit
Mistral - advertising their capabilities, with the hope that eventually enough people will use their API instead of their direct competition. TBD if this was a realistic approach. It doesn't seem like it's working atm.
Meta - multiple reasons, including: limiting the advance of big api providers (oai, anthropic); attracting devs in their environment; creating awareness and acceptance around the field; using the community feedback and good ideas on their next iteration; meta's ultimate goal is to enable their models in a variety of roles on their other platforms. They'd invest there anyway, offering the small stuff for free adds the above benefits, without any major downsides.
Arvi89@reddit
It's open source, but it's expensive to run.
But you can pay per request to use these services, so they can make money that way, after you've tried their model for free.
Acceptable_Ad_2802@reddit
Meta in particular absolutely despises using third party anything.
Having worked there for several years, I noticed early on (and kept seeing it reinforced) that they suffer from "Not Invented Here" Syndrome.
They avoid hiring third party consultants (not individual contingent workers but companies to provide services) unless those services are far outside their core business. They'll hire a media company, or an outside security contractor, or a robotics safety consultancy, but they get weird about outside engineering - often to their own detriment (just because you have some of the best engineers in the world doesn't mean they're the best at a particular discipline - so they struggle with things that they're not *actually* the experts on).
They're fearful of any dependency on outside companies - they've tried multiple times to build game engines in-house so they could break the dependency on Unity3D in particular. Facebook Games used to be almost entirely Unity3D - the push to "Facebook Instant Games" being HTML5 was heavily motivated by reducing that dependency. Same with XR. They fear what could happen if Unity collapsed, was acquired by a competitor, or otherwise became adversarial. It's a pervasive concern.
They know how important AI is - they've done foundational work in it for years - and they've routinely open-sourced or otherwise published generously licensed code because there's no negative impact for them to do so. They need the product, they need to be able to hire people who know how to work with and develop on the product, and if that means they can hire people who already KNOW PyTorch, or llama-cpp, or have experience building with Llama-3.x, that lets them skip a difficult and time-consuming onboarding process. Nothing about that tech undermines their core business (which is leveraging the power of personal connections to place advertising.) Don't expect them to release open source or open weight models that make ad placement decisions or timeline recommendations. But Generative Music Production? LLMs?
It increases mindshare, gives them the power to shape the direction of an entire industry, AND diminishes the offerings of potential competitors.
It also has a bit of a "halo effect" and helps ensure that talented and motivated engineers are interested in working for them.
Microsoft doesn't want to be too dependent on OpenAI
I won't say as much about them, but it doesn't harm Microsoft AT ALL for everyone to have access to Phi-# or whatever, and they're so much bigger than "an AI company". They're going to continue to develop their own AI solutions because they don't yet own OpenAI outright and can't let themselves be too locked in to OpenAI solutions. Many of the arguments around attracting talent are the same for them as for Meta or elsewhere. I haven't worked for them, but they also seem to have a least a little bit of the Not Invented Here thing going on, but I don't think it's as strong as it is for Meta.
For them, as a cloud provider, they really want people to independently develop using their models, and then come straight to Azure for cloud services because they're all set up for it and ready to go.
TheInfiniteUniverse_@reddit
One thing is for sure, the moment they get super powerful, agentic powerful, they might get outlawed.
Clyde_Frog_Spawn@reddit
An edge in adoption.
Pretty_Afternoon9022@reddit
it gives attention to the companies who release them, which usually still gives them a lot of financial incentives in the long run
Everlier@reddit
This, it's "capture the flag" kind of game. Larger companies also see things strategically - having a foundation model as everyone's dependency is similar to having a browser engine as a base for everyone else's browsers, or having a protocol as a foundation for the one used by the industry, or a in internal library continued to be developed by OSS contributors for free, or having your UI framework become a de-facto standard for a specific platform.
IpppyCaccy@reddit
We can also look at the long list of cool technologies that Sony developed but ended up dying prematurely because they asserted their intellectual property rights.
Just because you have the patent, it doesn't mean it's always wise to enforce it.
_AndyJessop@reddit
Do you mean "land grab"? I think "capture the flag" is different as it implies a zero-sum game.
much_longer_username@reddit
I'm not sure how 'land grab' is better - it's not like we're making new land.
film42@reddit
Market share dominance is zero sum but the market itself is not. Intel, AMD, and ARM survived long enough to hold the rapidly expanding market. Zuck said open source brings the costs down and it’s clear he doesn’t want to pay OpenAI any time Meta wants to use this tech. I’m also pretty sure Meta is profitable with this tech so open sourcing is a win-win-win for them.
bonecows@reddit
I see your point, but I have the impression many players believe it's a winner take all situation, hence the companies trying a straight shot to ASI such as Ilya's SSI
_AndyJessop@reddit
I see that point two, but in that sense "capture the flag" would also be wrong, because it's simply about stealing and keeping something that already exists on another team.
If we're looking at game analogies I would probably go with Settlers of Catan, where players have to build and develop holdings until one of them reaches a set number of victory points first.
-p-e-w-@reddit
Soft power is incredibly valuable. Microsoft paid $8 billion for GitHub, and continues to operate it for free for the benefit of the open source world, and even invests massive amounts of money into improving it.
When the acquisition happened, lots of people were saying things like "they are going to demand money from open source projects in the future". No they won't. In fact, they keep expanding what they are offering to open source projects, for free. That's because they get mindshare and influence in return. It's the same with LLMs (and with VSCode, and with GSoC, and with many, many other things).
Responsible-Front330@reddit
I pay 10$/monthly for GitHub Copilot. They make money from me. Plus they have all the code data in the world from GitHub to train coding LLMs and sell them to us.
Admirable-Radio-2416@reddit
Github Copilot has free tier now too btw. Limited access though, it's like 2000 code completions per month and 50 chat messages per month.. But for some devs that could be enough.
Responsible-Front330@reddit
Cursor IDE also has a free tier Version. Even mit Claude Sonnet. And IMHO it is better the VSCode with Co-Pilot
Admirable-Radio-2416@reddit
Cursor is literally just a fork of VSCode though.. And you can use Claude 3.5 Sonnet in Github Copilot too.. So I just don't see the point of Cursor IDE, because why I would go for a fork of something that doesn't really actually add anything to the table?
Responsible-Front330@reddit
It is indeed different. It has a “composer” that can create a whole project with all the necessary files for you. I have not seen copilot on vscode working that “deep”. I code every day and I do feel the difference. AFAIK sonnet is not available on the GitHub free tier and it is free on Cursor (but I am not sure about that, I have copilot pro but I use cursor more often)
lessis_amess@reddit
what are you talking about? GitHub has a decent size sales team and is generating a lot of revenue from enterprise customers
aixzs@reddit
Same thing with Tailscale. Make money off businesses and let solos play for free.
skrshawk@reddit
Even better, those solo devs now have experience with your platform and with that comes more talent pool that knows your product. Those people get hired by companies and it becomes an easy sale.
emteedub@reddit
this has been a model for decades now - is the name for it "software conditioning"? anything from windows in younger years at (most) schools, office lineup, adobe... there's a lot
skrshawk@reddit
No doubt a lot of people have pirated a lot of Microsoft and Adobe products over the years and those people now are staples of the IT and design industries.
brimston3-@reddit
Considering how little it costs tailscale to broker anything but TURN-style clients, they can probably scale to some insane level of clients on not a lot of revenue.
aixzs@reddit
Exactly
ResearcherDense9962@reddit
Yeah they take a percentage from the github sponsors program.
Ambitious_Subject108@reddit
They literally pay the credit card processing fee out of their pocket for individuals. They just take 6% if you want to be sponsored as an organization.
ResearcherDense9962@reddit
You're charged some fee when it's sent to your bank account. You don't get it all
-p-e-w-@reddit
The open source users aren't (directly) generating revenue though. They are getting the service for free, even though it costs GitHub money to provide it. This is analogous to how some companies provide the weights of LLMs for free, while also offering those LLMs as a hosted service.
morfr3us@reddit
Or they wanted to train their models on most of the worlds private codebases and $8B is cheap for that
-p-e-w-@reddit
The vast majority of high-quality code is open source already. In no universe is the ability to train an LLM on a bunch of garbage corporate codebases worth $8 billion. Nobody who has actual trade secrets of any value in their code hosts it on GitHub.
morfr3us@reddit
Being a public repo on github does not make code open source.
daaain@reddit
They can not train using data from private repos, at least not from paying customers for sure. I can't be bothered to read through their current privacy policy, but would be extremely surprised if it gave Github or MS access to private code.
morfr3us@reddit
Yeah same, I can't be arsed with looking through the small print and I'm sure the deivil will be in the detail. I would be very surprised if they were not using that data to improve their models even if not directly/ straightforwardly but it's just my speculation.
brimston3-@reddit
It opens them up to so much liability if copilot regenerates some code from a private corporate repo. It doesn't make sense from a risk management perspective.
Now if they used the private repos as a validation set somehow and the public as a training set, and never the twain shall meet, then yeah, I don't think the private repo owners would ever be able to tell.
RedditDiedLongAgo@reddit
GitHub fucking rakes in money and has for a deacde.
GTHell@reddit
I see many growing startups doing all kinds of PR. I never knew why until they went public and realized that all those PR stunts attracted big investments.
Ok_Phase_8827@reddit
nice
Ardalok@reddit
This gives other people the opportunity to work on your model for free.
NickCanCode@reddit
If you can't win on direct competition and dominate the market, you destroy the user base of your opponent so they won't win either.
AnomalyNexus@reddit
That's definitely Meta's game plan. It can't really disrupt their business model...but it can sure fk with a certain competitor that gets ~90% of revenue from search
Enough-Meringue4745@reddit
Meta has a history of open source and open contributions as well. It keeps them at top of mind in the minds of engineers who ultimately decide what tech is used.
Roshlev@reddit
Didnt llama only get open sources after it leaked?
-main@reddit
It got leaked because they were distributing it pretty widely to researchers and people who called themselves researchers. It's still not open source or open data. It's 'weights available'. The software equivalent of actually giving you a binary.
emprahsFury@reddit
All the major tech companies have a history of open source contributions even Larry Ellison has found a business case for it.
Fleshybum@reddit
They are talking about React as a way of saying, "this isnt just about Open AI" or disrupting competition.
Final-Rush759@reddit
That's for open research to improve AI models faster. That was the whole idea of open ai. But companies are pulling back. Some companies still release the open model weights. Image Google didn't publish Transformers.
Melancholius__@reddit
That "Attention Is All You Need" was a eureka moment or else we'd be at square zero
JamesSmitth@reddit
They are free for personal use only.
Aggressive_Ad2457@reddit
They are letting the 'cat out of the bag' early so that later (agi?) it can't be easily curtailed by governments. Imagine if all ai was only available via five or six endpoints from a few big players, governments could easily legislate it's use, now they can't because every tom, dick and harry can run an AI. They know it's very early in the game and the big bang is coming later down the line. In my opinion...
irve@reddit
Yup. I think it's the copyright thing. You train on public stuff and give it to the public so you can sort of have allies in your fight to continue using he public stuff.
pc_g33k@reddit
And pirated stuff, too. 😉
victorc25@reddit
Why not?
ortegaalfredo@reddit
You give knifes to everyone so the guy that invented the knife don't get cocky.
kjerk@reddit
And buy up the bandage stock
waescher@reddit
You might not be able to win 1:1 against OpenAI and Google but you can be the peak of the Open Source AI community which uses your tech to advance together. You get eyeballs - maybe investors, but most important of all: talents that feel it’s the right thing making AI for the rest of us.
Feztopia@reddit
Because of all the tools and research they get for free. They make their own architecture the industry standard.
Jdonavan@reddit
Because nobody would use them otherwise
Aggressive_Chest_455@reddit
Why not?
Unnamed-3891@reddit
Because the model itself is not their product.
TheTerrasque@reddit
Bingo. Especially for meta.
They just want to run the models, and now they get free development and testing and experimenting with training and new architectures.
alby13@reddit
Why Meta is giving away its extremely powerful AI model
https://www.vox.com/technology/2023/7/28/23809028/ai-artificial-intelligence-open-closed-meta-mark-zuckerberg-sam-altman-open-ai
Thistleknot@reddit
market attention (id say share but this is an early strategy before they monetize). think netscape and Firefox.
likely will build services on top of their free offerings such as agents and hallucination detection (hypothesizing)
Slight-Ad-9029@reddit
Most of them are free to users but not to enterprises of a certain size
rzvzn@reddit
I can't speak for the big dogs, but Kokoro went Apache for a few reasons. One of them was to acquire voluntarily contributed synthetic training data for the next model, which I otherwise would not have been able to obtain.
Also, Kokoro v0.19 cost $400 to train for about 500 GPU-hours of A100 80GB. While this is a lot of money, it's lacking a number of zeros from the level of money they're setting on fire to train LLMs. I'm lining up the next training run, and my current estimate is that total cost (including the aforementioned $400) should remain three digits. And yes, that model will be Apache too.
synn89@reddit
Because Meta doesn't sell AI, they sell your data and need AI to help with that. If they use a third party AI backend(OpenAI), it could cost them billions if that goes away suddenly. By creating/releasing their own AI model they're both securing their infrastructure and making sure it's state of the art, since the open source community will improve it for them free of charge. Also their tooling(llama), will end up becoming the defacto open standard which means their AI model becomes easier to work on and manage internally.
saosebastiao@reddit
I don’t care why they do it, as long as they live rent free in Sam Fucking Altman’s shitty head.
False_Grit@reddit
Google is "free" too. Controlling people's minds (through what advertisements and web links they are shown) is real ultimate power.
KingsmanVince@reddit
Because it's a tradition in ML research. Your research was based on someone's models and data. You should return the favor by releasing yours too.
parzival-jung@reddit
why are you getting downvoted? I upvoted your response for visibility, hope other people can share their thoughts on this argument. I want to believe this is the reason but unsure what I truly believe.
bigattichouse@reddit
If a small shop can put out a good model, and raise some VC funding, they can plan to be acquired by a bigger player later. it's all about ROI for the VCs. Putting out the free model also gets you market share, as companies use your model while working out the kinks of using models in-house. So you build some clout for your team, build up a community of users who prefer your models for whatever reason, and provide a juicy exit strategy for your VCs.
theincrediblebulks@reddit
Big tech understands being beholden to another tech company is a golden handcuff. Yes they may help with distribution but it does not work for the greater good when there's a misalignment of interests. Meta had a sour relationship with Apple when they take a cut off their revenues from the app store. Further they also start being th6ere walled garden where everyone plays by apples rules when it came to piracy which affects met's bottomline. Now these foundational LLM models are going to be the primary surface of interaction with a generative AI for hundreds of developers and millions if not billions of users. By releasing it for free, meta invites a ton of developers to openly build products using that this becoming a bigger part of a product's powered by generative AI
r2994@reddit
Meta has a social media monopoly. They risk nothing doing this
unrulywind@reddit
Models are not worth a particularly lot of money as long as they are being eclipsed by better models within weeks. There will eventually come a times when they achieve models that can be used for a long time and those will be monetized differently. Right now the real value is datasets (to make ever better models) and the research to get the best model first. Meta has said publicly that they would have never caught up like they did if it wasn't for all the things they learned from all the people playing with, and even breaking, the models. In a way, the "free" models are your pay as the QC tester.
trill5556@reddit
These models train on output of one another. THey have to be open source legally
HomoNeanderTHICC@reddit
Just some guesses here (I am not at all an expert)
Releasing a model as open source can get a company thousands of free testers which could all tell the company exactly where they need to improve their model, and using that feedback the company could then improve the model up until the point they decide that feedback and improvement is less valuable than the model currently is.
It could also get in the way of any potential competition. When Meta releases an open source AI model completely free of charge, suddenly a lot of would-be competitors don't "need" to invest in the development of their own AI models. That allows Meta to develop their private AI models and get a significant advantage since the competition is using an inferior AI system since it's easier and cheaper.
human_obsolescence@reddit
this is another part of the equation that I'm honestly very surprised that more people aren't mentioning. I think more folks need to do a review of the benefits of open source (or open weights in this case) and why it's important.
a lot of the benefits are mentioned in the famous "we have no moat" memo, and are applicable to (F)OSS in general:
https://semianalysis.com/2023/05/04/google-we-have-no-moat-and-neither/#we-have-no-moat
I think the "scorched earth" idea is less of a factor than people think, and/or it's incidental -- consider the number of people able/willing to run local LLMs compared to people who just use/buy the big-name API stuff. The fact that people are basically doing free development and basement hacker-style innovation can't be ignored.
pjdonovan@reddit
The Honey documentary has really been effective!
thealphaexponent@reddit
The race for the best models is also a race for talent. Strong talents want to work with other strong talents. Releasing open source models can showcase the firm's capabilities and attract strong talent.
Areign@reddit
Most of the comments are missing the biggest reason.
https://www.reddit.com/r/MachineLearning/comments/137rxgw/d_google_we_have_no_moat_and_neither_does_openai/
if the open source community is going to do the work anyway, better if they do it on your model/ecosystem.
Johnroberts95000@reddit
China - for the same reason they aren't nerfing drones. Facebook - because Zuck is chad again.
shakespear94@reddit
The “AI” is not ready. Not even close to autonomous thinking. It is still very manual. Having free models and chatgpt chat instance/claude chat instance all feeds data back.
Meta’s approach is slightly different, I tried monitoring my traffic when using my 1B model. It wasn’t sending data back. But i noticed the amount of commits. I mean, that is true data. Everyone collectively training their version and meta releasing a 90B model.
Which is all great.
marvijo-software@reddit
We are the product. They use our data to train subsequent models. Most companies have this clause in their terms of use clause, like the free Gemini 2 Flash Exp
o5mfiHTNsH748KVq@reddit
Simple: https://en.m.wikipedia.org/wiki/Scorched_earth
No_Swimming6548@reddit
Could you please elaborate more? To me releasing an open weight model is more like planting free food rather than burning it.
MoffKalast@reddit
They're oversimplifying, this writeup goes into more detail on the strategy.
ForsookComparison@reddit
Could they really be doing all of this to take stabs at OpenAI and the sort from becoming the new mega companies?
amejin@reddit
MS releasing Phi doesn't align with this idea. It may be for some, but not all.
MrMisterShin@reddit
I was looking for someone to mention this. I’m glad you did.
Legumbrero@reddit
My speculation:
It makes quite a bit of sense for some of them, such as Meta. Meta's baseline Llama model would likely not be competitive with GPT4 or Claude if released commercially, in my opinion. At that point it would be seen as a flop, bleeding the company money with nothing to show for it.
Instead they have the de facto standard LLM for open source research, which gives them two key things: free R&D (which helps them catch up) and the ability to control a major platform. My understanding is that after butting heads with ios, control of the platform is huge for Zuckerberg. As Meta uses AI more and more in advertising, this could prove to be a useful bet.
For other LLMs, such as those coming out of China it can perhaps be seen as a state-subsidized effort to be seen as on-par with the west. This goes beyond having an LLM that gives you certain answers to Tiananmen square (or future truthfulness around Jan 6th if you want to flip it around) and in my opinion is more of a global play to frame AI development as a two-pole arms race vs US-controlled hegemony. This could be advantageous in a world where the rest of the world might be trying to decide on Chinese vs US-based solutions (for those mid-sized countries for whom investing from scratch does not make sense).
This is all speculation on my end. I have less clarity on why Salesforce, for instance, makes their finetunes available (but since they're not full training iirc, it does not cost them that much -- so maybe it's just free PR).
UniqueAttourney@reddit
it's for talent recognition (saying i am good too, without having to prep for private meetings),
land grabbing ( i was here first kind of, even if you are duplicating the work of others but in different regions or fields),
low level platforming (if people use your model and successfully create a product, they are now tied to your platform)
R8nbowhorse@reddit
The same reason they made pytorch open source, or google open sourced kubernetes: To get community buy in, become the de facto standard and capture majority market share as a result.
Ofc the details are more complicated, but that is usually the angle.
Mashic@reddit
Let's say you have a company that makes an office software like word, excel, powerpoint. If you want to make profit and sell it for $100, why would people buy it instead of Microsoft Office, wich everyone uses it, and can open most exitsting files, and you can share the files with others easily.
So what do you do? You offer a product with limited functionaliy for free, you hope that poor students use yours, and when they graduate and start working in a company, when they want an office software with more functioanly, they'd buy the one already know and comfortable with.
Same with AI know, everybody in the generic public knows and uses ChatGPT, there is little incentive to go for other AI models. So what these other companies do, they offer free local models for the more techie people, hoping that they'll use their commercial version in the future since they know how it works.
MagmaElixir@reddit
The models we think of as 'open source' are really only 'open weight', such as Llama: https://www.zdnet.com/article/meta-inches-toward-open-source-ai-with-new-llama-3-1/
In large language models, "open source" means providing full access to the model's source code, including architecture, training algorithms, and hyperparameters, allowing for complete transparency and modification. "Open weights," however, involves releasing only the model's trained parameters, enabling usage and fine-tuning without revealing the underlying code or training data.
For anyone wondering what the difference between 'open source' and 'open weight' is, I found this blog post which does a decent job explaining: https://promptengineering.org/llm-open-source-vs-open-weights-vs-restricted-weights/
NovelNo2600@reddit
They are not just open sourcing the model, along with they are providing MAAS (Model As A Service) and thats the reason every ai companies which are providing the MAAS are just shifting/shifted to openai compatible API endpoints
Admirable-Radio-2416@reddit
Lot of the models are fairly small compared to the models some of the big companies actually end up running. Think it more like a demo-version of the actual thing. Like others pointed out, it gives attention to the company.. And with attention comes possible funding, investors and so on.
The_GSingh@reddit
Look at what mistral did. Released some of the best open models of their time, became a unicorn (means they got upwards of a billion in funding) and then became a closed source business selling access to their ai models.
Had they not done the initial open sourcing, there’s no way people would’ve just handed them a billion. In the long run for startups it gets them more recognition.
For something like meta that doesn’t need recognition or funding, it gets them the goodwill of users. Even tho the meta llama and google Gemma models aren’t the best now, when they were released (and good) people were actually grateful towards zuck lmao.
Plus it helps meta get feedback easily and the open source community will continue to work on those models improving them without meta having to pay for any development unless it wants to.
Only-Letterhead-3411@reddit
To gain popularity and attention
To have people work on creating projects for their model for free
To have people find use cases for it, discover it's weak and strong points
To reduce user amount of their rivals
jman6495@reddit
Just a heads up: Llama is not Open Source
ForsookComparison@reddit
"Open Weight" feels so weird to say but ive trained myself finally.
nix_and_nux@reddit
There's a material cost advantage to being the standard, and the fastest way to becoming the standard is to be open source.
There's a cost advantage because when new infrastructure is built, it's built around the standard. The cloud platforms will implement drivers for the OSS models, and so will the hardware providers, UI frameworks, mobile frameworks, etc.
So if Llama is the standard and Meta wants to expand to a new cloud, it's already implemented there; if they want to go mobile on some new platform, it's already there; etc. All without any incremental capex by Meta. This can save a few percentage points on infra expenditures, which is worth billions of dollars at Meta's scale.
This has already happened with Cerebras, for example [link](https://cerebras.ai/blog/llama-405b-inference). They increased the inference speed on Meta's models, and Meta benefits passively...
ToHallowMySleep@reddit
This is a general FOSS question and not specific to AI.
AnnaPavlovnaScherer@reddit
What are the key really good local LLMs?
evia89@reddit
rag, summary, auto complete, tts, stt (whisper), finetune small model to do specific job like classify your data
Prashant_4200@reddit
I believe most of the companies who release their AI model are free like Meta and already reached some bigger goal like when mera releases llama they might already complete their llama 2. So there is no financial loss for them also everyone starts talking about them and starts using their model rather than building their own
nixudos@reddit
Only a small fraction of people in general are actually able to run LLMs locally, so it doesn't affect the paid services in any meaningful way.
Open eights also means that the community helps boosting innovation and new ideas, that the companies can then use or elaborate on. RAG and basic COT was first seen in the community and is now a part of the models/services of paid services.
And it is a good way to get people who are into LLMs to explore certain models and then maybe commit to those models as a paid service in their professional life. I use paid APIs exclusively at work, s there is less hassle and prices are so low on decent model. But a Gemma 27b might be able to do 75% of the workload. Just not worth it with setting up hardware, balancers and so on.
acc_agg@reddit
Facebook doesn't want another iPhone moment where their most valuable customers are in a walled garden they don't control.
a_beautiful_rhind@reddit
People will try and use their models. Then they will pay for the ones they can't run or other services. It's money.
Curious-Yam-9685@reddit
you dont like monopolies right? you dont want one company to have the only super smart AI platform right? you want it decentralized right? you want these super smart models to become cheaper and more efficient so you dont have to be filthy rich to afford to run one?
LostMitosis@reddit
Who would have known about Qwen or DeepSeek? In 2027 when Qwen or DeepSeek launch some paid service they will have a significant number of users (who now know their capability)ready to open their wallets. Its the oldest trick in the book.
dsadggggjh453ew@reddit
Spyware
Beneficial-Ear8565@reddit
Just a way to chip away at your competition’s margins
inagy@reddit
They are not open source though, just open weights. It would be open source if we get all the training data, and configuration, so we can re-train them ourselves if we really want to. But we cannot do that.
mandle420@reddit
they also benefit from community contributions this way. less work for their devs who they have to pay...
Poromenos@reddit
Because it raises the cost for any competitor. As soon as you want to create your own LLM, you now need to compete against the (very good) Llama to even enter the game. This disincentivises new players and concentrates power to the few existing companies.
Plus, all the other benefits people here mention, it attracts great ML people, advertises your company, etc.
Inevitable_Fan8194@reddit
Same as with Open Source and Free Software: the first ones do it because it's the right thing to do (you know, science is supposed to be open?), the following do it for the street cred because it allows them to join a prestigious club.
SixZer0@reddit
Yeah, in my opinion the ONLY 2 thing contributed to world developments are open source software(OSS) and big companies OSS-ing stuff.
I think if we think about most startups and entrepreneurs they would say most of their codebase are snippets or ideas from opensource codebases or modifications of those.
Own-Potential-2308@reddit
To manipulate facts and steer the narrative
Guinness@reddit
I disagree that companies are releasing them for free for what effectively amounts to PR. While there is a minor benefit to this PR, what is of greater value to them is developing an industry they can then exploit.
For example, Llama is not actually free nor open. Facebook basically allows all but the top major corporations to use it for free. I forget specifically which, but I think it’s Fortune 100 companies are not allowed to commercialize products around their models.
By releasing Llama, they’re creating a Linux “like” industry. They’re hoping that their models become the defacto open standard and thus companies are forced to use them, or become large enough to be forced to pay them.
Suckerberg for example created Facebook on a LAMP stack. Now imagine if the LAMP stack required licensing once you hit a certain size. Now Facebook, which is worth billions of dollars, now has to pay $2 billion per year to Linus.
It’s actually rather smart because it’s almost a way of getting in on the ground floor of every AI startup as an equity owner. And then once that company hits a certain size. Well, you COULD sink billions of dollars into recreating Llama, or you could just pay Facebook.
zhdc@reddit
Strong signal to venture capitalists that they're not producing vapor-ware.
For established companies like Meta, they're a way of preventing OpenAI and Microsoft from building a competitive ecosystem/moat around ChatGPT.
Don't forget about talent acquisition. AI and other fields (robotics) are moving - very - fast.
lapups@reddit
open source models or actually any other products allow regular people to improve those for free
in terms of monetisation there are many indirect options
Bio_Code@reddit
It also helps getting the cost down. Smaller businesses are getting their hands on these models and build their own products based on them and doing their own research. That results in tools like unsloth which makes model finetuning as cheap and as fast as possible. Meta and others can learn about their techniques for nearly nothing and adapt that for other projects. But that is just a small reason.
DarKresnik@reddit
They are releasing very good free models for free. Imagine which models they have for they own!
tekonen@reddit
You could watch the explanation to this from this YouTube video talking about strategy, value stream mapping and different evolutions of technology.
https://youtu.be/L3wgzl2iUR4?si=h2xV20HFS8jc6Ks_
JoJoeyJoJo@reddit
Undercuts competitors because it’s the early ‘territory grab‘ period of thrips new market and the fewer people dividing it up the better. It’s hard to compete with free.
jonastullus@reddit
I think it's this. Similar to Google making Android free-ish. It diminishes the market share that commercial companies can grab, and leaves the door open to bring out a commercial product later.
Also, it gives them eyeballs and feedback ln their system. I am sure that Meta has received a lot of value from people interacting/ building on Llama models, which they wouldnt have if the model was inhouse-only.
Also, it might attract talent. Promising Ai developers may be more inclined to work on something visible, than a secret inhouse-project.
allegedrc4@reddit
Considering about 2-3 years ago Meta looked like it could go under and now people talk about them constantly, I would say it turned out pretty well for them
Better-Struggle9958@reddit
1) Competition, yes, the paid models market is already occupied. 2) Big models don't work on most users machines, so these companies will earn either by selling capacity for big models or on user data
KnownPride@reddit
To push adaptation and improvement.
HedgehogGlad9505@reddit
When you have the best open source model, people are going to do research based on your model. Then you get their results for free, and you can catch up with the best model with less R&D cost. Otherwise there'll be a lot more try and error.
PermaMatt@reddit
Undercuts the competition and gives them leverage/narrative to push back on regulation.
With that, and this can sounds pedantic but it's an important point, they aren't doing it in the spirit of Open Source let alone Free Software.
If you want to know more about that, let's chat but know it's a bit deep (political) and opinionated (specifically not capitalism). :)
naaste@reddit
Do you think part of the incentive might be to drive adoption and innovation in the community? For example, frameworks like KaibanJS leverage open-source models to help developers build multi-agent AI systems. Open-sourcing models could also attract contributions and expand their use cases
tomekrs@reddit
In case of Meta: they don't know how to monetize and it deflects any accusations of profitting on the non-public data of their platforms' users. Also Zuck really believes in open source.
Orolol@reddit
Because there's no point in using a model that is not SOTA or cost efficient SOTA. For example, there's no point in using Qwen Coder when there is Sonnet 3.5 available. BUT, by making Qwen open weight, suddenly the model become fare more useful, you can run it locally, everybody can host it so the price of the API are crazy low, etc.
For people that are willing to use API and pay for a model, they mosly want THE BEST model for their bucks.
AllHailMackius@reddit
I read that from Facebook Llama at least, it is partially to stay relevant and to stop competitors gaining an advantage and then creating walled gardens that FB must comply with if they want to be included in the AI game.
Arcade_Gamer21@reddit
Because open source allows others to basically train and fine tune their Ai for them for free,which then they use on their own products and not to mention open source models bring in more investor cash then proprierty
jp_digital_2@reddit
What do people think about llm as a tool to spread your ideology / propaganda / cause (good / bad doesn't matter for logical purposes).
All you need is to tweak the "weights" and "biases".
newreddit0r@reddit
Sometimes you can win by making everyone else lose.
name_it_goku@reddit
That's how open source works brother
Minute_Attempt3063@reddit
Because it makes OpenAi have less control.
And there are other ways they have made their money out of it
satoshibitchcoin@reddit
Zuckerberg wants to replace his main cost center (devs) with AI models.
Comprehensive-Log804@reddit
Free version today, paid version tomorrow.
Thomas-Lore@reddit
If your model is not SOTA then it is already outdated anyway, and if it is SOTA, it will be outdated in a few months. So why not rleease it?
elchurnerista@reddit
your margin is my opportunity. which makes costs lower overall 😉
FullstackSensei@reddit
They need to build and maintain competency in building LLMs because they're the next trillion dollar market, yet nobody is making money selling them. Meta learned the hard way that keeping the weights from the public while seeding the models to get feedback is futile. So, might as well make them available for download and focus on maintaining competency.
While probably a smaller factor: there's also the need to maintain research into the field open, because even if you have the brightest researchers, you never know where the next evolutionary step in the technology will come from. So, it's in the interest of almost everyone to keep the research open while the field is still rapidly evolving. Everyone is better off having access to everyone's research until the tech plateaus.
No-Refrigerator-1672@reddit
In science, it's a typical situation that if you acquired govermental financing for your research (even partial one), then your results must be public. I'm sure that this accounts for at least a portion of free models out there.
badabimbadabum2@reddit
They wanna make chatgpt less relevant, and they want to have their own models in use and not stay out of competition. In the future it will change, so lets enjoy current free models.