Meta Releases Muse Spark - A Natively Multimodal Reasoning model
Posted by RickyRickC137@reddit | LocalLLaMA | View on Reddit | 46 comments
Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration.
darkpowerxo@reddit
Huggingface link? Or do we have to pay per month?
Cool-Chemical-5629@reddit
Looks like it's very bad at abstract reasoning puzzles, but other than that it's a frontier model. This is definitely not a small model. It's most likely the size of Kimi K2.5 if not even bigger, so if you can't run Kimi K2.5, you're not really missing out if this model never gets released on Huggingface.
bwjxjelsbd@reddit
Is it? I mean, the speed at which it generates responses gives me medium-size model vibes.
Not huge Opus or Gemini 3.1 Pro size, but it's insanely smart for the speed.
Zanion@reddit
Ah of course, exactly the axis I most want for my reasoning models to be underperforming.
ortegaalfredo@reddit
Elon just posted they are training a 10T model.
Real_Ebb_7417@reddit
I wouldn’t trust what he says until I see it. He likes to talk.
Ok_Technology_5962@reddit
I would also say that the model becomes lazier and doesn't want to do any work.
Cool-Chemical-5629@reddit
I just tried this model through their official chat website and I'm starting to believe they aren't kidding about its capabilities... If you ask it to create a single HTML page game, you will probably be surprised, because this AI creates its own graphics assets like textures and characters. I was like, What?! This is insane...
Well, there were a couple of issues. The NPC enemy it created had a static background, but when I asked it to fix that, it actually regenerated the NPC sprite with proper transparency, so the result was really just the character itself without a background, and it fit perfectly into the game world created using ThreeJS. Fully textured 3D dungeon with interesting spot lights here and there to simulate torches, a skeleton enemy, a simple but pretty game user interface, an overall retro look just like I love it. I really recommend trying this thing out.
Unfortunately, I don't think the model itself handles the entire thing alone; it's probably a set of agents working autonomously to piece the project together. I've never seen a single model that works as both an LLM and an image generator, but who knows what they cooked behind the scenes...
bwjxjelsbd@reddit
Are you running 'contemplating mode'?
Budget-Juggernaut-68@reddit
Those values look mid?
__JockY__@reddit
Released? I don’t think that word means what you think it means.
SporksInjected@reddit
They released benchmarks? 🤷
__JockY__@reddit
I can’t figure out how to load the benchmarks into ollama on windows with my 1080 Ti.
SporksInjected@reddit
Benchmarks_q4.GGUF
chettykulkarni@reddit
They need data to advertise more stuff 🤦
Klutzy-Pace-9945@reddit
This seems like an interesting update to me, but I'm still curious: is it available to the public now?
lemon07r@reddit
So it's a closed model that's about as smart as GLM/Kimi, from the looks of it. That makes it kind of like the Qwen Plus models that don't get their weights shared. Decent, but who's it for? If it's a closed model, why use it over better closed models, or over cheaper open-weight models (since those always get more provider choice)?
Few_Painter_5588@reddit
Well, it's unfortunate that they're not making any open-weight releases, though rumours suggested they were working on some open-weight models. One thing that's very apparent here, though: xAI has fallen behind significantly.
GoranjeWasHere@reddit
It didn't fall. You now get agentic reasoning by default with Grok, and outputs got a lot better.
Also, it's still Grok 4, which was released last year. Grok 5 is supposed to be their next frontier model, released in a month or two with mega improvements.
Secondly, Grok is by far the least censored model out of all the frontier models. I have no doubt that in those benchmarks they remove outputs where the AI refused to generate an answer, while Grok just trailed along no problem.
Plabbi@reddit
Grok has a huge 2,000,000 token context window, so at least they have that going for them.
Real_Ebb_7417@reddit
Well, they can add a huge context because xAI is the only lab at the moment that has a real AI datacenter (500k Nvidia GPUs, if I recall). Other labs are still building theirs.
But it doesn't matter much, because there's no use for such a big context if the model hallucinates like crazy and is just dumber than other models with smaller contexts xd
lambdawaves@reddit
OpenAI and Anthropic don’t have AI data centers? How do you know this?
Real_Ebb_7417@reddit
They definitely do. Just not as big as the ones that are being built now (incomparably smaller than the xAI one)
Adventurous_Pin6281@reddit
Might as well be infinite context.
Spara-Extreme@reddit
Alphabet has a lot of AI compute available.
MerePotato@reddit
Just because they claim a 2 mil context window doesn't mean that's anywhere near the effective context limit
Thedudely1@reddit
I had a long-running conversation with Grok spanning multiple weeks of following the stock market, and after about a month it just completely hallucinates the date and the data, and it can't be corrected once you try. I had to abandon that conversation. It was definitely less than 1 million tokens, as I was only sending about one message per day for about 30 days. And this was using "expert".
Sir-Draco@reddit
It’s not really a plus. The only models that have been proven to actually do anything with a larger (1M) context window are Opus 4.6 and Sonnet 4.6, with GPT 5.4 coming in closely behind.
Go use Grok's 2M context window for anything other than just messing around and that will become clear.
agentcubed@reddit
Insane that they're back in the AI race. It's hilarious looking at the charts and seeing them jump from last place to 4th. SOTA is now back to the original 4.
Nonetheless, dumb plan. They're so far behind in the AI race that nobody will actually try their models. The only reason they were in the AI race at all is that they had open-weight models.
What they should've done is release a smaller open-weight model that ties Gemma or Qwen, then, once they're back in everyone's good graces, release a bigger model.
fastcrw@reddit
Where can we try it? Or is there an API?
Linkpharm2@reddit
Oh hey, llama 9.
We do not talk about llama 5-8
lordchickenburger@reddit
Meh
MrMisterShin@reddit
I wonder how it compares to Qwen3.5
gizcard@reddit
Meta releases a blogpost about the model
KeikakuAccelerator@reddit
You can use it in the Meta AI app, I think. No open weights, and the API is private. Though I saw reporting that they're going to have some open-source releases in the future.
Appropriate_Car_5599@reddit
well, I simply can't trust them 😁 so no hope for this release
Cool-Chemical-5629@reddit
I think the model has a good sense of humor!
In the game it created for me, there was an NPC named Elder Mara. She wanted me to bring some artifact to her or destroy it, and the choice would have some consequences (can't recall what exactly), but what really caught my eye was that there was an option for me to ask "Why me?". I couldn't help but click it, and she said "Because you're still asking why. Others stopped long time ago." 😂
BagComprehensive79@reddit
Is there any news about whether it will be open weight, or about a smaller open-weight version?
No-Manufacturer-3315@reddit
Not local
ortegaalfredo@reddit
After the latest Llama flops, they quite incredibly managed to make a competitive model. I mean, it's even better than Opus. Imagine if they had released it as Llama 5, it would have destroyed everything else.
Ly-sAn@reddit
Better than Opus is a big flex, let’s see how it behaves outside of benchmarks.
RickyRickC137@reddit (OP)
The company also said that it has larger models in development and hopes to open-source future versions.
Source
EmPips@reddit
Return of the King
MikeLPU@reddit
Tried it. It's unusable: very censored, biased trash. If this is the result they canceled the Llama team for, it's a shame.
Snoo_64233@reddit
Well well well
silenceimpaired@reddit
It’s not released in the context of LOCAL llama.