Peanut - Text to Image Model (Open Weights coming soon)

Posted by pmttyji@reddit | LocalLLaMA | View on Reddit | 14 comments

A new anonymous model debuts at #8 in the Artificial Analysis Text to Image Arena! Peanut’s weights are expected to be released soon, which would make it the leading Text to Image Open Weights Model.

Peanut is positioned to be the new leading open weights Text to Image model, surpassing Z-Image Turbo, Qwen-Image, and FLUX.2 [dev].

Further details (and weights) coming soon.

Source Tweet: https://xcancel.com/ArtificialAnlys/status/2051376297163854019#m

[-]

GreenGreasyGreasels@reddit

How can you even properly judge an image model without a good ~~model card~~ 1girl collection?

[-]

Synor@reddit

Well it better be fast or precise because it sure isn't looking great.

[-]

StartupTim@reddit

Is there a way to do OpenAI type of queries and endpoints to interface with this or other image models? All I've seen so far is comfyui, but can these be fired up with llama.cpp / vllm etc?

[-]

TheSlateGray@reddit

Look at the surgeon. Can you honestly say it's better?

What happened to his neck?

[-]

StrikeOner@reddit

i hope that it has at least 1t parameters so i can run it on my 17x6000 pro rig here.

[-]

pmttyji@reddit (OP)

It's a Text to Image model, so probably 20-40B in size.

[-]

StrikeOner@reddit

that would make it only 2-3x bigger than flux-1. sounds just about right for my rig i guess.

[-]

rdsf138@reddit

Better than FLUX-2, while Grok isn't even in the same league as the others.

[-]

It might be even better. Looking at the examples, it seems like the API models they compared it to have a prompt enhancement pass - for the psychedelic rock prompt, Peanut literally had that text, while the others look like there was some reasoning done before, to plan the text. If this comparison is non-prompt-enhanced output vs prompt-enhanced output, that reflects positively on this new model.

[-]

Cautious_Assistant_4@reddit

Also, it seems to be more creative, given how the anime dude image compares to the other 3. And the signs on the taxi image weren't gibberish despite not being prompted. The coloring on the world map is nice too.