Meta Releases Muse Spark - A Natively Multimodal Reasoning model
Posted by RickyRickC137@reddit | LocalLLaMA | View on Reddit | 46 comments
Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration.
darkpowerxo@reddit
Huggingface link? Or do we have to pay per month?
Cool-Chemical-5629@reddit
Looks like it's very bad at abstract reasoning puzzles, but other than that it's a frontier model. This is definitely not a small model. It's most likely the size of Kimi K2.5 if not even bigger, so if you can't run Kimi K2.5, you're not really missing out if this model never gets released on Huggingface.
bwjxjelsbd@reddit
Is it? I mean, the speed at which it generates responses gives me medium-size model vibes.
Not huge Opus or Gemini 3.1 Pro size, but it's insanely smart for the speed.
Zanion@reddit
Ah of course, exactly the axis I most want for my reasoning models to be underperforming.
ortegaalfredo@reddit
Elon just posted they are training a 10T model.
Real_Ebb_7417@reddit
I wouldn’t trust what he says until I see it. He likes to talk.
Ok_Technology_5962@reddit
I would also say that the model becomes lazier and doesn't want to do any work.
Cool-Chemical-5629@reddit
I just tried this model through their official chat website and I'm starting to believe they aren't kidding about its capabilities... If you ask it to create a single HTML page game, you will probably be surprised, because this AI creates its own graphics assets like textures and characters. I was like, What?! This is insane...
Well, there were a couple of issues. The NPC enemy it created had a static background, but when I asked it to fix that, it actually regenerated the NPC sprite with proper transparency, so the result was really just the character itself without a background, and it fit perfectly into the game world created using ThreeJS. Fully textured 3D dungeon with interesting spot lights here and there to simulate torches, a skeleton enemy, a simple but pretty game user interface, an overall retro look just like I love it. I really recommend trying this thing out.
Unfortunately, I don't think the model itself handles the entire thing alone; it's probably a set of agents working autonomously to piece the project together. I've never seen a single model that works as both an LLM and an image generator, but who knows what they cooked behind the scenes...
bwjxjelsbd@reddit
Are you running 'contemplating mode'?
Budget-Juggernaut-68@reddit
Those values look mid?
__JockY__@reddit
Released? I don’t think that word means what you think it means.
SporksInjected@reddit
They released benchmarks? 🤷
__JockY__@reddit
I can’t figure out how to load the benchmarks into ollama on windows with my 1080 Ti.
SporksInjected@reddit
Benchmarks_q4.GGUF
chettykulkarni@reddit
They need data to advertise more stuff 🤦
Klutzy-Pace-9945@reddit
This seems like an interesting update to me, but I'm still curious: is it available to the public now?
lemon07r@reddit
So it's a closed model that's about as smart as GLM/Kimi, from the looks of it. That makes it kind of like the Qwen Plus models that don't get their weights shared. Decent, but who's it for? If it's a closed model, why use it over better closed models, or over cheaper open-weight models (since those always get more provider choice)?
Few_Painter_5588@reddit
Well, it's unfortunate that they're not making any open-weight releases, though rumours suggested they were working on some open-weight models. One thing that's very apparent here, though: xAI has fallen behind significantly.
GoranjeWasHere@reddit
It didn't fall. You now get agentic reasoning by default with Grok, and outputs got a lot better.
Also, it's still Grok 4, which was released last year. Grok 5 is supposed to be their next frontier model, released in a month or two with mega improvements.
Secondly, Grok is by far the least censored model out of all the frontier models. I have no doubt that in those benchmarks they remove outputs where the AI refused to generate an answer, while Grok just trailed along no problem.
Plabbi@reddit
Grok has a huge 2,000,000 token context window, so at least they have that going for them.
Real_Ebb_7417@reddit
Well, they can add a huge context because xAI is the only lab at the moment that has a real AI datacenter (500k Nvidia GPUs, if I recall). Other labs are still building theirs.
But it doesn't matter much, because there's no use for such a big context if the model hallucinates like crazy and is just dumber than other models with smaller contexts xd
lambdawaves@reddit
OpenAI and Anthropic don’t have AI data centers? How do you know this?
Real_Ebb_7417@reddit
They definitely do. Just not as big as the ones that are being built now (incomparably smaller than the xAI one)
Adventurous_Pin6281@reddit
Might as well be infinite context.
Spara-Extreme@reddit
Alphabet has a lot of AI compute available.
MerePotato@reddit
Just because they claim a 2 mil context window doesn't mean that's anywhere near the effective context limit
Thedudely1@reddit
I had a long-running conversation with Grok spanning multiple weeks of following the stock market, and after about a month it just completely hallucinates the date and the data, and it can't be corrected once you try. I had to abandon that conversation. It was definitely less than 1 million tokens, as I was only sending about one message per day for about 30 days. And this was using "expert".
Sir-Draco@reddit
It’s not really a plus. The only models that have been proven to actually do anything with a larger (1M) context window are Opus 4.6 and Sonnet 4.6, with GPT 5.4 coming in closely behind.
Go use Grok's 2M context window for anything other than just messing around and that will become clear.
agentcubed@reddit
Insane that they're back in the AI race. It's hilarious looking at the charts and seeing them jump from last place to 4th. SOTA is now back to the original 4.
Nonetheless, dumb plan. They're so far behind in the AI race that nobody will actually try their models. The only reason they were in the AI race at all is that they had open-weight models.
What they should've done is release a smaller open-weight model that ties Gemma or Qwen, then, once they're back in everyone's good graces, release a bigger model.
fastcrw@reddit
Where can we try it? Or is there an API?
Linkpharm2@reddit
Oh hey, llama 9.
We do not talk about llama 5-8
lordchickenburger@reddit
Meh
MrMisterShin@reddit
I wonder how it compares to Qwen3.5
gizcard@reddit
Meta releases a blogpost about the model
KeikakuAccelerator@reddit
You can use it in the Meta AI app, I think. No open weights, and the API is private. Though I saw reporting that they're going to have some open-source releases in the future.
Appropriate_Car_5599@reddit
well, I simply can't trust them 😁 so no hope for this release
Cool-Chemical-5629@reddit
I think the model has a good sense of humor!
In the game it created for me, there was an NPC named Elder Mara. She wanted me to bring some artifact to her or destroy it, and the choice would have some consequences (can't recall what exactly), but what really caught my eye was that there was an option for me to ask "Why me?". I couldn't help but click it, and she said "Because you're still asking why. Others stopped long time ago." 😂
BagComprehensive79@reddit
Is there any news about whether it will be open weight, or about a smaller open-weight version?
No-Manufacturer-3315@reddit
Not local
ortegaalfredo@reddit
After the latest Llama flops, they quite incredibly managed to make a competitive model. I mean, it's even better than Opus. Imagine if they had released it as Llama 5, it would have destroyed everything else.
Ly-sAn@reddit
Better than Opus is a big flex, let’s see how it behaves outside of benchmarks.
RickyRickC137@reddit (OP)
The company also said that it has larger models in development and hopes to open-source future versions.
Source
EmPips@reddit
Return of the King
MikeLPU@reddit
Tried it. It's unusable: very censored, biased trash. If this is the result they canceled the Llama team for, it's a shame.
Snoo_64233@reddit
Well well well
silenceimpaired@reddit
It’s not released in the context of LOCAL llama.