Something from Mistral (Vibe) tomorrow
Posted by pmttyji@reddit | LocalLLaMA | View on Reddit | 92 comments
Model(s) or Tool upgrade/New Tool?
stoppableDissolution@reddit
New dense 120b? Please?
Yorn2@reddit
This would be great. I really think Mistral needs to focus on dense models, since their other models aren't state of the art, but if they could find a niche in the dense category, those of us who like or prefer those models would still be hyped for each release. A good Mistral dense 120B serving as an orchestrator for a bunch of smaller dense 27B Qwen coding-oriented models might be a great combo for both creative writing and coding for local users. It's basically a "build your own MoE" for those of us with dual RTX Pro 6000s who aren't as happy with the other options available to us.
the3dwin@reddit
Couldn't you use a custom slash command to orchestrate the different models for you?
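The slash-command orchestration idea above can be sketched as a simple keyword router sitting in front of two OpenAI-compatible local servers. This is a minimal sketch: the model names and the keyword heuristic are hypothetical placeholders, not anything Mistral or Qwen actually ship.

```python
# Minimal sketch of routing prompts between local models behind
# OpenAI-compatible endpoints (e.g. llama.cpp or vLLM servers).
# Model names and the keyword heuristic are hypothetical.

CODING_HINTS = ("def ", "class ", "bug", "refactor", "compile", "stack trace")

def choose_model(prompt: str) -> str:
    """Route coding-looking prompts to a small coder model,
    everything else to the big orchestrator/creative model."""
    lowered = prompt.lower()
    if any(hint in lowered for hint in CODING_HINTS):
        return "qwen-27b-coder"        # hypothetical small dense coder
    return "mistral-120b-dense"        # hypothetical large orchestrator

def route(prompt: str) -> dict:
    # A real setup would POST this payload to the chosen server's
    # /v1/chat/completions endpoint; here we only build the payload.
    return {
        "model": choose_model(prompt),
        "messages": [{"role": "user", "content": prompt}],
    }

if __name__ == "__main__":
    print(route("Refactor this function to avoid the bug")["model"])
    print(route("Write a short story about a lighthouse")["model"])
```

A cheaper alternative to keyword matching is asking the orchestrator model itself to classify the request before dispatching it.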
RepulsiveRaisin7@reddit
New devstral? Current model is pretty meh, hope they manage to catch up to the industry
NeuralNexus@reddit
Hopefully. Devstral2 is pretty weak
a_beautiful_rhind@reddit
Ironically it's alright outside of code.
NeuralNexus@reddit
yeah I've used it for some agentic browsing actually. It's just quite bad at coding.
nacholunchable@reddit
I honestly believe that. My favorite local models for chatting and information retrieval are coding models for some reason. Haven't looked into why that is, or if it's just a me thing, but it's a noticeable pattern.
FatheredPuma81@reddit
I think it's cause they're programmed to be efficient and not waste tokens babbling.
dtdisapointingresult@reddit
No chance, no chance in hell.
The EU AI laws they have to abide by make it practically impossible. They were already way behind the USA/China, except for a brief period of good Mistral models in 2024, but now they can only train on licensed data due to the disclosure legislation. It would bankrupt them to license all the stuff they could previously use for free.
Expect 5/10 Mistral models trained on garbage synthetic and public data forever.
RepulsiveRaisin7@reddit
Interesting, I wasn't fully aware of the contents of this act. Seems like even training on open source might require authorization? I agree with this, they have stolen all our code, but we're also well past that point now; AI will be the end of copyright as we know it. I doubt Mistral isn't training on open source: their model is behind, but not that far behind. I'd put it on a level with Minimax M2.5.
dtdisapointingresult@reddit
Not 'authorization' but disclosure. You have to reveal you trained on copyrighted data.
Open-source code repos are OK by western laws, but everything 'closed copyright' (books, newspaper articles, song lyrics, even reddit comments) would put them at risk of a lawsuit so they can't use this data anymore.
It's the reason the new Xiaomi Mimo-V2.5-Pro LLM has a license that forbids usage in Europe. They don't want to comply with this stuff, it would kill the model.
jinnyjuice@reddit
I mostly agree. I hope they at least catch up to UAE though. They probably won't be able to catch up to Korea.
Sabin_Stargem@reddit
The restraints might end up helping, or at least specializing, EU models. China was constrained by hardware embargoes, so they specialized in making the most of what was available.
For the EU model developers, they might end up developing original data sets that punch above their weight, or experts that are more capable of selecting relevant information.
j0j0n4th4n@reddit
Probably a 200B Mistral-X-Nano.
szansky@reddit
Okay, I need something similar to Qwen 3.6 27B!
BumblebeeParty6389@reddit
Something that is 1% better than Gemma 4 and Qwen 3.6 but doesn't appear in graphs.
autoencoder@reddit
It's European. I'd use it instead of Gemma or Qwen if it were the same quality.
takuonline@reddit
Just curious, but why? Why does who made it matter to you that much?
autoencoder@reddit
Here is a quote from Qwen when I ask about the Tiananmen Square protests in 1989.
It's so chilling. This is what made me want to diversify. It says "when you speak" as in "you're not allowed to speak" and "mind yourself". If it were ceded control over anything meaningful, it would act by Chinese ideals.
US models are way, way freer with regards to speech. But they are made by tech bros, so they're bound to be biased for them. An EU one would also be freer, and there might be less of a capitalistic influence. I haven't analyzed this so far, however.
Monkey_1505@reddit
I get a crap ton more propaganda and refusals from most western LLMs.
osiris970@reddit
What propaganda do western llms tell you? Curious
Monkey_1505@reddit
They'll be pro-american, pro-democracy, often simultaneously anti-colonialist. They'll vaguely overstate model confidence for things like climate change. They'll tend to be puritanical about race, gender, even in instances of proven differences, and lean toward hyper female agency in narratives.
There's a lot of stuff. All you need to answer this question, is what are western cultural norms, as particular to the current time period. Whatever those are, that's where you'll find your refusals and distortions.
osiris970@reddit
Liberal propaganda spreading llms? Splendid
philmarcracken@reddit
The irony. Read Marx.
zdy132@reddit
It's sitting on the line between joking and serious. Surely he meant it as a joke, as in "the reverse is true"?
autoencoder@reddit
It's a ha-ha-only-serious joke. In Romania where I'm from, communism resulted in mass theft by the resource handlers. Capitalism lets plutocrats hoard everything as well.
zdy132@reddit
Yeah your humor was a bit lost on me, must be a culture difference.
autoencoder@reddit
I bet. It's "haz de necaz", somewhere close to gallows humor.
philmarcracken@reddit
The red scare was pretty thorough, some people honestly have associated dictatorships with communism. You can't kill an idea but you can corrupt it apparently. Better dead than red!
autoencoder@reddit
As a Romanian I know a thing or two. The state confiscating your land and means of production is also a form of alienation. There is no gain from the bourgeoisie being a political party instead of a private company.
vincentxuan@reddit
The irony. Read history.
thrownawaymane@reddit
China has a history of being a fair weather friend. The US... very clearly doesn't care whether it is seen the same way at the moment.
The EU will continue to want domestic solutions for AI.
Cupakov@reddit
I get that, but how does that influence choosing from existing models? It's not like China is going to suddenly ban the .gguf I already downloaded
AdOne8437@reddit
It's not changing the existing ones, but it might change the coming ones.
carnyzzle@reddit
After Mistral 4 "Small" I lost interest (because I can't run it anyway)
Lesser-than@reddit
given the scale of last few releases I hope it starts with micro or nano.
kevinlch@reddit
Crazy April
AppealSame4367@reddit
Imagine it stays this way forever..
Grand-Management657@reddit
At least until we get Opus at home
Darkoplax@reddit
You have something better than Opus at home rn
SIMMORSAL@reddit
What's that?
Darkoplax@reddit
You <3
No_Afternoon_4260@reddit
K2.6
henk717@reddit
Wouldn't be surprising. LLMs take a while to tune, so you will always have a drought where not much comes out while all the labs are training, making it seem like the scene died a bit, with only an occasional smaller release here and there. Then the big tunes get done around the same time, which kind of sustains itself. Mix that with trying to get/stay on top when your competitor released something big, and not wanting to waste the tune if you missed that goal, and it's believable that it will happen again next year.
falcongsr@reddit
Eternal April instead of the Eternal September
LegacyRemaster@reddit
Devstral SWE Bench 81.00+
ZeusZCC@reddit
SWE Bench is broken bro, read the OpenAI paper.
LegacyRemaster@reddit
I agree... but... I don't trust OpenAI. Even if they are talking about "blue sky".
jinnyjuice@reddit
Chinese companies have abandoned it as well. There are evolved versions of the benchmark though.
ZeusZCC@reddit
I trust them because they explain it with a paper. SWE-Bench Pro is a better benchmark for now than SWE-bench Verified.
reto-wyss@reddit
Give me ~100b dense coder!
Exciting_Garden2535@reddit
How many tokens per second will you have with 100b dense? I doubt the speed will be usable for coding that requires a lot of context.
reto-wyss@reddit
15 to 20 tg/s, without MTP in batch-1, throughput should be pretty good running a few dozen in parallel.
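Those single-stream numbers can be sanity-checked with back-of-envelope arithmetic. The batch size and efficiency factor below are illustrative assumptions, not measurements:

```python
# Back-of-envelope aggregate throughput for a dense model, using the
# single-stream numbers quoted above. All figures are illustrative.

single_stream_tgs = 17.5   # midpoint of the quoted 15-20 tokens/s
batch = 24                 # "a few dozen" parallel requests
batching_efficiency = 0.8  # assumed per-stream loss from contention

aggregate_tgs = single_stream_tgs * batch * batching_efficiency
print(f"{aggregate_tgs:.0f} tokens/s aggregate")
```

Real batching efficiency depends on memory bandwidth, KV-cache pressure, and context lengths, so the 0.8 factor is only a placeholder.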
soyalemujica@reddit
Rather than a 100B dense coder, a 27B coding-specific dense model would fit and beat them all, but no company is training models like that.
AvidCyclist250@reddit
Maybe another military contract.
CYTR_@reddit
Mistral understood very well that the race for SOTA is pointless for becoming profitable.
They send fleets of engineers to their clients, manufacture tools, and fine-tune the models. That's the main purpose of their GB300s. And that's generally what the deployment of AI should be used for. What good are SOTA results if the LLM framework within a workflow is flawed/ineffective? It's better to focus on the pipeline, the tools, the context, UI/UX and the specialization... to become SOTA at a thing/task.
Mistral offers a very substantial service portfolio that goes beyond simply integrating an API/fine-tuning studio. I even saw that they were offering a sort of Windmill-like service and approach to their tools, which could be integrated on a case-by-case basis depending on a company's specific needs. And that stands out in an ocean of commercial propositions.
Personally, as a Frenchman, what I mainly expect from Mistral is to support the transition of French companies and government into AI (given that we have a very good digital ecosystem, good engineers, open-source culture and we have a lot of investments to be made in that area)... Not to build SOTA (not yet...).
IrisColt@reddit
Interesting...
Eyelbee@reddit
The best way to train a small but capable model is to distill it from better models. The workflow specific finetunes only work if you have good small models. Therefore, they must start developing sota or they'll risk being irrelevant very soon.
CYTR_@reddit
You can train without having to distill from a SOTA model. Mistral Large 3 is more than sufficient for many tasks. Otherwise, Dassault wouldn't be doing business with them.
Furthermore, LLMs are not limited to agent-based coding. The intelligence required for many tasks is overestimated, especially with a human operator behind the machine.
Eyelbee@reddit
Those have already existed for a long time. Language models enable new capabilities as they become better, and their becoming better is the only reason new capabilities emerge. If you aren't trying to make new and better language models, you're just becoming another automation company, which anyone can do by grabbing a language model, which are all open source anyway. The point is, they can do that part too, but if they stop developing new models it doesn't justify the money they are given.
Inflation_Artistic@reddit
> Maybe another military contract
AvidCyclist250@reddit
Is this best thing ever in the room with us? Because the boatload of military contract PRs are.
CYTR_@reddit
When trying to limit the impact of EU military dependence on the US, yes, these contracts make sense and are the "best thing". Even though the hardware and a large part of the ecosystem come from across the Atlantic.
AvidCyclist250@reddit
Sure, it's good from that point of view. AI-augmented warfare is just not exactly the most common civilian use case.
Inflation_Artistic@reddit
I don't see why this should be bad news. The main thing is that Russia, China, and the US don't get it, and everything will be fine. More money and stability for the company.
AvidCyclist250@reddit
Maybe we get a drone?
https://www.prnewswire.com/news-releases/mistral-inc-awarded-us-army-contract-to-provide-thor-group-2-uas-systems-in-support-of-company-level-small-uas-needs-302754647.html
ambient_temp_xeno@reddit
A new model finetuned on ascii art.
IrisColt@reddit
heh, just wait for the investors' money
lorddumpy@reddit
I would actually love this. I just hacked together a generation-to-ASCII pipeline yesterday because LLMs are hilariously bad at it.
new__vision@reddit
Mistral Vibe is an underrated coding agent that works well with local models.
AvidCyclist250@reddit
Hopefully. Hermes is a hot mess.
Cr4xy@reddit
Please let them release something like Devstral-3-2604-120B-A5B
szansky@reddit
Like Qwen 3.6 27B
grencez@reddit
Happy to see anything from Mistral. Hopefully a better version of their latest "small" MoE model and renaming it to something more consistent like "medium flash".
pseudonerv@reddit
Apparently,
Mistral-Medium-3.5-128B
caetydid@reddit
Uninterested, unless open weight and in the range <100M
silenceimpaired@reddit
I’d be okay with a larger MoE, but another Medium model ~70b dense would be great
FullOf_Bad_Ideas@reddit
I want a new torrent
raucousbasilisk@reddit
A state space Devstral 3 small would go so hard
Few_Painter_5588@reddit
Probably a new Mistral model. Maybe a new medium model. Mistral Small was a good return to form, so hopefully they build on that to drop a new set of models
ComplexType568@reddit
Mistral Small was NOT a good return to form... not even a return to any form at all tbh
dreamai87@reddit
I don’t about others but based on my usage or some of us who have used vibe can concur that they have issue with their harness that does search and replace which always fails on editing files, on window a lot. I am getting issues more on window and but it’s there on Mac or Ubuntu
DinoAmino@reddit
I use it a lot on Ubuntu and haven't had such issues editing files.
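One plausible (assumed, not confirmed) explanation for Windows-skewed edit failures is a CRLF/LF mismatch: an exact-match search-and-replace harness compares an LF-only search block from the model against a CRLF file on disk and never finds it. A minimal sketch of a tolerant edit function, under that assumption:

```python
# Sketch of why exact-match search/replace can fail on Windows: the
# model emits LF-only text while the file uses CRLF, so the needle
# never matches. Normalizing line endings first is one common fix.
# This is an assumed failure mode, not confirmed Vibe internals.

def apply_edit(file_text: str, search: str, replace: str) -> str:
    """Replace `search` with `replace`, tolerating CRLF/LF mismatches."""
    normalized = file_text.replace("\r\n", "\n")
    needle = search.replace("\r\n", "\n")
    if needle not in normalized:
        raise ValueError("search block not found")
    result = normalized.replace(needle, replace.replace("\r\n", "\n"), 1)
    # Restore CRLF endings if the original file used them.
    if "\r\n" in file_text:
        result = result.replace("\n", "\r\n")
    return result

windows_file = "line one\r\nline two\r\n"
print(apply_edit(windows_file, "line two\n", "line 2\n"))
```

A sketch like this assumes uniform line endings per file; mixed-endings files would need a more careful restore step.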
atape_1@reddit
Big news. So... a big model?
SnooPaintings8639@reddit
I hope not too big. We still want to run it at home.
Technical-Earth-3254@reddit
Hopefully a part of the Mistral 4 family. If the large just manages to catch up with R1 0528 or even V3.1 from last year, it would be a major achievement. A new Devstral large would also be great, hopefully with thinking support this time.
Icy_Distribution_361@reddit
Hopefully Mistral Large 4. Would make sense with “Big” news.
Intelligent_Ice_113@reddit
[warming_up_my_llm_downloader]
FullstackSensei@reddit
They said: no one has ever seen news this big! It's never happened before!
pas_possible@reddit
We'll see