TheaterFire

Llama may release new reasoning model and other features with llama 4.1 models tomorrow

Posted by Independent-Wind4462@reddit | LocalLLaMA | View on Reddit | 71 comments

no_witty_username@reddit

My guess is that the meta team is currently playing around with qwen trying to figure out if their llama model can compare. If not, they might postpone the release...
View on Reddit #54958922

carnyzzle@reddit

I'll only care if they release models that can run on a single 24GB card lol
View on Reddit #54902723

Accomplished_Stay337@reddit

Amen. Make that 16gb card brother.
View on Reddit #54950360

jacek2023@reddit

I don't agree with people hating Llama 4. It's very useful as an MoE: you can build a computer with low VRAM and still get some t/s. I am waiting for Llama 4.1 and expect much improved models!
View on Reddit #54902066

Expensive-Apricot-25@reddit

It's also designed more for industrial use cases, not so much for hobbyists. High memory usage, but very low compute + high parameter count = very good for industrial uses.
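A rough way to put numbers on the "high memory, low compute" trade-off. This is a back-of-the-envelope sketch, not a benchmark: it assumes weight memory is about total parameters times bits per weight, and forward-pass cost is about 2 FLOPs per active parameter per token. The function names are made up for illustration, and the 109B total / 17B active figures for Scout come from Meta's public announcement, not from this thread.

```python
def weight_memory_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations and runtime overhead.
    """
    return total_params_b * bits_per_weight / 8


def flops_per_token_g(active_params_b: float) -> float:
    """Rough forward-pass cost per token in GFLOPs (~2 FLOPs per active parameter)."""
    return 2 * active_params_b


# Assumed figures: Scout as a 109B-total / 17B-active MoE vs. a dense 70B model.
for name, total_b, active_b in [("MoE 109B/17B", 109, 17), ("dense 70B", 70, 70)]:
    print(
        f"{name}: ~{weight_memory_gb(total_b, 4):.1f} GB at 4-bit, "
        f"~{flops_per_token_g(active_b):.0f} GFLOPs per token"
    )
```

Under these assumptions the MoE needs more memory than the dense 70B but does roughly a quarter of the compute per token, which is the shape of the trade-off described above.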
View on Reddit #54903720

StyMaar@reddit

The only “industrial use” they care about is their own, though.
View on Reddit #54911269

TheRealGentlefox@reddit

Why do their intentions matter here? They open-weighted a model that works well for industrial use.
View on Reddit #54943266

Expensive-Apricot-25@reddit

Sure, it's also useful for other companies. There was no obligation for them to release something for hobbyists.
View on Reddit #54915458

anilozlu@reddit

low vram? why?
View on Reddit #54939083

Soft-Ad4690@reddit

I remember the original Llama 3 also having issues, particularly with non-English prompts, but that was completely fixed in 3.1. I hope the same is true for Llama 4.
View on Reddit #54933753

Mobile_Tart_1016@reddit

Qwen 3 will outperform them so thoroughly that I think within a week or two, everyone will have forgotten about Llama 4.
View on Reddit #54928289

lily_34@reddit

Indeed. On live-bench, amongst non-reasoning open-weights models, Maverick is second after Deepseek v3. But it's smaller and faster, so it's kind of expected to be slightly worse.
View on Reddit #54917428

ThenExtension9196@reddit

Nah. For a company as big and resource-laden as Meta, this was a weak offering which clearly shows a breakdown in their management or strategic focus.
View on Reddit #54907529

Serprotease@reddit

It’s mostly because the release was done very poorly. Trust matters. Scout doesn’t seem to have captured a lot of interest, mostly because of Gemma 27b, which is easier to run and better than or equal to it. But Maverick did. It seems to be quite good for older/DDR4 server builds. It’s roughly similar to a dense 70b, but faster. (And we did not have any good 70-120b models for quite some time. Command-a did not really push the boundaries.)
View on Reddit #54903779

kweglinski@reddit

There are people in limbo like me: 96GB of VRAM but not super performant (M2 Max), where gemma3 doesn't quite cut it on speed. The t/s difference isn't a big deal, 20 vs 30 t/s (although noticeable), but the PP (prompt processing) difference is drastic; sadly I don't have numbers at hand. I can't run Maverick, so llama4 is the best for my use cases currently. It's also actually pretty good; in some cases I'd say it's better than gemma or mistral small (which, btw, I find better than gemma, except for pure language skills).
View on Reddit #54905218

pseudonerv@reddit

We hope. We cope.
View on Reddit #54903536

DarKresnik@reddit

Llama is not Open Source.
View on Reddit #54901146

ColorlessCrowfeet@reddit

It's [clopen](https://en.wikipedia.org/wiki/Clopen_set)? (Wikipedia, "both [open](https://en.wikipedia.org/wiki/Open_set) and [closed](https://en.wikipedia.org/wiki/Closed_set)")
View on Reddit #54901415

Calcidiol@reddit

Just wait until we have quantum computing models. Then there'll be Schrödinger's clopen weights, and we'll never have certainty whether they're open and local, closed and local, open and cloud, or closed and cloud, since the closer you look for locality the less you know about open.
View on Reddit #54940482

eras@reddit

I think [open weights](https://www.agora.software/en/llm-open-source-open-weight-or-proprietary/) is a good term.
View on Reddit #54901897

StyMaar@reddit

> because there are limitations on how you can use those weights.

There is a piece of text that claims they have ownership of the weights, that they are giving you a license, and that you have to adhere to it. There's no legal basis for that at the moment, as model weights aren't copyrighted material under any jurisdiction now. This is just an attempt to claim a new kind of IP right, and it shall be disregarded (and I mean that not only because you shouldn't care, but because you shall refuse to care, to stop them from being able to convert that attempt into an actual IP law in the coming years).
View on Reddit #54911822

eras@reddit

Yet there really is legal basis that it is not copyrighted material under any jurisdiction? In any case, I don't think I would characterize it very open when using it against their terms could result in a long legal battle—if Meta cares enough about it. Certainly it could be a dangerous endeavour for small businesses.
View on Reddit #54912352

StyMaar@reddit

> Yet there really is legal basis that it is not copyrighted material under any jurisdiction?

Copyrighted material is a term that is legally defined, and the current definition excludes model weights (as much as it excludes compiled artifacts: you cannot take someone else's code and claim intellectual property over the compiled binary just because you compiled it yourself).

> In any case, I don't think I would characterize it very open when using it against their terms could result in a long legal battle, if Meta cares enough about it. Certainly it could be a dangerous endeavour for small businesses.

I really doubt so. This “license” only works because there's ambiguity, and losing a trial would end up destroying the ambiguity they are building upon. If you are in the US, you'd have very good reasons not to do this, since thanks to the broken legal system they can sue you into bankruptcy, but if you are in any country with a sane justice system, you'd be fine. They aren't gonna sue anyone in the EU who uses their model in complete violation of their license, even if it's publicly advertised.
View on Reddit #54922318

eras@reddit

One could argue that it is a compilation, though? And then also argue that _creative imagination_ was used when building and configuring the system that converted that material into the resulting LLM; after all, we can see that the capabilities of these systems vary a lot, while we can hypothesize that it is not only their training material or parameter count that makes the difference. So there's something else in play as well. Truth is nobody has tried these in court yet (all the way through). We'll see what the NY Times lawsuit against OpenAI brings: if OpenAI loses, then it could mean a lot of these models would become legally undistributable.
View on Reddit #54928241

StyMaar@reddit

> One could argue that it is a compilation, though?

Good luck claiming IP on a compilation from pirated material. It has to do *something* else.

> And then also argue that creative imagination was used when building and configuring the system that converted that material into the resulting LLM

You can try compiling GNU software with a handmade compiler of your own (surely writing a C compiler requires creative imagination too), then release it with a proprietary licence and see how it goes. I'm not going to bet on your side.

> Truth is nobody has tried these in court yet (all the way through). We'll see what the NY Times lawsuit against OpenAI brings: if OpenAI loses, then it could mean a lot of these models would become legally undistributable.

This is the other side of the equation though. Meta/OpenAI could win their trial with their “it's fair use” argument and it still wouldn't make the model itself copyrighted material. These trials annoy them very much, because they are going to remove the ambiguity and they have a lot to lose, but they didn't choose to start them. No way they *initiate* a trial on the other side. They are just betting on a *“fait accompli”* with their licensing claims: after a long enough shared industry custom of adhering to model licences, those licences would end up having a *de facto* legal value that a judge will abide by (unless the legislator itself codified it literally in the law). That's why we collectively **must not** show any regard to these claims.
View on Reddit #54933745

Former-Ad-5757@reddit

The big companies almost can't start a legal case, as they would almost certainly be asked to show their training data. And there are some very good reasons that the big companies won't be able to show their training data for the next couple of years. It starts to get very interesting if somebody has a big GPLv3 code base which is part of the training data, and they ask a model about their own code base, but the in-between model is not open source...
View on Reddit #54931821

ColorlessCrowfeet@reddit

Clearly that means "clopen weights". (I'm only half serious.)
View on Reddit #54906009

silenceimpaired@reddit

Doesn't stop them from saying it is.
View on Reddit #54901394

DinoAmino@reddit

DeepSeek also makes this false claim on their web site. Truth is none of them are open source - not any from the big players. None of the datasets and training recipes for these models are released.
View on Reddit #54934746

silenceimpaired@reddit

And I see so few on localllama who care. As long as they have Apache 2 or MIT I'm good. I don't have the compute to repeat what they did or the money. As long as I have the freedom to use it without restriction and modify it I'm happy, but I sympathize with those who want and can do more.
View on Reddit #54935439

qnixsynapse@reddit

Weren't they advocating for [open source](https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/) last year?
View on Reddit #54906334

DarKresnik@reddit

There are other limitations too, such as regional ones: the EU is excluded.
View on Reddit #54902087

ApprehensiveAd3629@reddit

8b and 13b models again please!
View on Reddit #54902636

MoffKalast@reddit

What's that? 80B and 130B? Sure thing.
View on Reddit #54937843

pmttyji@reddit

Hope they release something for 8GB VRAM too - Poor GPU Club
View on Reddit #54929582

Naurim@reddit

https://preview.redd.it/vna7kgrlcmxe1.png?width=1122&format=png&auto=webp&s=2ee6720d9c97ada65314c7dac06f9d9d99422a97 In light of today's events, this joke is getting funnier and funnier
View on Reddit #54925112

silenceimpaired@reddit

Isn't all news from Llama 4 BREAKING news? Or should I say, broke? I hope this new news is that they fixed Llama 4 Scout to clearly outperform Llama 3.3 70b.
View on Reddit #54901247

kweglinski@reddit

why would it do that? if it matches 70b that's a celebration. I know, i know 100+b param size, on the other hand it's 50% faster than gemma3.
View on Reddit #54905430

silenceimpaired@reddit

Not sure what your question is. Why would it do what? At present, in my experience, Llama 4 Scout acts like roughly a 40b model with occasional jumps above Llama 3.3 70b, but it is not enough for me to toss aside 70b. Why would it do that? If they continued to train Behemoth and did further model distillation on Llama 4 Scout off of it, it has the potential to increase in performance. As I recall, and perhaps faultily, Llama 3.3 was distilled from a 400b model, with similar performance as a result. In theory, I would say Scout could be trained for about the same amount of time as it has been... with the finished Behemoth model, and easily outshine 70b... but I'm no researcher, I just have "more than a feeling" when I see quantization doing no harm to the model above 4 bit.
View on Reddit #54905781

kweglinski@reddit

Why would it outperform 70b? The rule of thumb for MoEs is usually sqrt(params*active params), so Scout was aiming at a very fast 30b, and it delivered, as you've said. I doubt it would change as drastically as to 70b a week or so later. And your comment says "fix"; that would be a major breakthrough, not a fix.
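For anyone who wants to plug numbers into that rule of thumb, here is a minimal sketch. It only encodes the heuristic as stated in this comment (the geometric mean of total and active parameters); it is a community rule of thumb, not a measured result. The function name is made up, and the parameter counts (109B total / 17B active for Scout, 400B total / 17B active for Maverick) are assumptions taken from Meta's public announcement rather than from this thread.

```python
import math


def moe_dense_equivalent_b(total_params_b: float, active_params_b: float) -> float:
    """Rule-of-thumb 'dense-equivalent' size of an MoE model, in billions.

    Geometric mean of total and active parameter counts:
    sqrt(params * active params).
    """
    return math.sqrt(total_params_b * active_params_b)


# Assumed parameter counts (billions), not stated in the thread.
models = {"Scout": (109, 17), "Maverick": (400, 17)}
for name, (total_b, active_b) in models.items():
    print(f"{name}: ~{moe_dense_equivalent_b(total_b, active_b):.0f}B dense-equivalent")
```

With those assumed counts the heuristic puts Scout at roughly 43B and Maverick at roughly 82B, so still well short of a 70b for Scout.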
View on Reddit #54922883

__JockY__@reddit

BREAKING!! Meta "may potentially release" a flock of birds into the auditorium. Meta "may potentially release" a 1B model that beats Gemini Pro. Meta "may potentially release" warez of that one Wu-Tang album. The feck outta here with "may potentially". Fekkin influencer nonsense 🙄.
View on Reddit #54920361

jnk_str@reddit

Hopefully they open source the UI too 🫠
View on Reddit #54916060

jnk_str@reddit

As an excuse for llama 4
View on Reddit #54916083

Worldly_Expression43@reddit

Great, except it uses Llama 4.
View on Reddit #54913519

OkActive3404@reddit

this week is the week for opensource models, qwen 3 today, llama 4.1 tomorrow, and deepseek r2 most likely later this week too
View on Reddit #54900815

2TierKeir@reddit

> deepseek r2 most likely later this week too

I've heard this every week for the last like 4-5 weeks now...
View on Reddit #54912137

silenceimpaired@reddit

I hope Llama 4.1 - right now Scout is very underwhelming. It isn't even Whelmed. ;(
View on Reddit #54901363

Content-Degree-9477@reddit

Qwen 3 today, Llama 4.1 tomorrow and Deepseek R2 probably in a couple of days. What a week we're living in!
View on Reddit #54900900

Remote_Cap_@reddit

What a time to be alive!
View on Reddit #54901197

nullmove@reddit

And then people here will start crying for new model 7 days later, tops
View on Reddit #54901953

Remote_Cap_@reddit

How lucky we are to be spoilt
View on Reddit #54910985

Trysem@reddit

Why aren't they making an omni model?
View on Reddit #54908927

Luston03@reddit

I hope it won't be a disappointment
View on Reddit #54908763

Ok-Recognition-3177@reddit

I have more hope for Deepseek and Qwen right now. Llama 4 lost my trust and interest with the way they tried to manipulate benchmarks.
View on Reddit #54907576

power97992@reddit

it will be eclipsed by r2 and probably qwen 3... If u are using API or a webapp, u might as well just use gemini 2.5 flash or pro.
View on Reddit #54906382

jakegh@reddit

Considering R2 is also likely releasing this week, Zuck has a very small window to get any traction at all. R2 sounds like an absolute *monster* in cost savings alone.
View on Reddit #54906179

swagonflyyyy@reddit

Too little, too late, Meta. Better luck next year.
View on Reddit #54904439

Barubiri@reddit

Memory would be awesome
View on Reddit #54901029

nullmove@reddit

*Open-Sourcing* memory would be awesome
View on Reddit #54903065

Thomas-Lore@reddit

Memory like in chatgpt? Isn't that just a short txt file that gets attached to the thread? And the new one seems to be RAG on previous threads. Does not seem hard to implement if you really need it.
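A toy sketch of the "RAG on previous threads" idea, to show how little is needed for a basic version. This is not how ChatGPT implements memory (that is not public); it is a stand-in that assumes plain word-count vectors and cosine similarity instead of a real embedding model. `ThreadMemory` and its methods are hypothetical names, and the sample notes are made up.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ThreadMemory:
    """Stores short notes from earlier threads and recalls the most similar ones."""

    def __init__(self) -> None:
        self.notes: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.notes.append((text, Counter(tokenize(text))))

    def recall(self, query: str, k: int = 2) -> list[str]:
        """Return up to k notes that share vocabulary with the query, best first."""
        q = Counter(tokenize(query))
        scored = [(cosine(q, vec), text) for text, vec in self.notes]
        scored = [(s, t) for s, t in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]

    def build_prompt(self, query: str, k: int = 2) -> str:
        """Prepend the recalled notes to the user's message."""
        notes = "\n".join(f"- {t}" for t in self.recall(query, k))
        return f"Relevant notes from earlier threads:\n{notes}\n\nUser: {query}"


memory = ThreadMemory()
memory.add("User runs models on an M2 Max with 96GB of memory")
memory.add("User prefers answers in French")
memory.add("User is building a DDR4 server for MoE models")
print(memory.build_prompt("which MoE models fit my server", k=1))
```

Swapping the word-count vectors for a proper embedding model is where retrieval quality would actually come from.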
View on Reddit #54903849

nullmove@reddit

Yeah but it's likely they have a better than current open-source embedding model powering it.
View on Reddit #54904282

ZABKA_TM@reddit

What’s the point of posting something that “might” happen? Are you a Meta stock bagholder?
View on Reddit #54903788

Illustrious-Lake2603@reddit

Still waiting for CodeLlama2 😢
View on Reddit #54902544

wonderfulnonsense@reddit

On the flip side, they may not release a model. My guess is there's a 50/50 chance.
View on Reddit #54900611

Independent-Wind4462@reddit (OP)

But this is a major event, so it's possible they release new features and maybe a smaller version of the reasoning models? We don't know, but I hope they release something good, not like the previous release.
View on Reddit #54902195

a_beautiful_rhind@reddit

Here is your behemoth (no relation). Sorry, API only, too unsafe.
View on Reddit #54900808

Few_Painter_5588@reddit

Llama 4.1 checks out if they release behemoth. They did that with Llama 3.1 when they released the 405B dense model
View on Reddit #54901613

sunomonodekani@reddit

If they don't have good models that fit a cheap GPU, they won't have done much.
View on Reddit #54901506

sunshinecheung@reddit

Llama 4.1 vs Qwen3
View on Reddit #54900378

Namra_7@reddit

Qwen 3
View on Reddit #54900447