1-bit Bonsai Image 4B and Ternary Bonsai Image 4B: image generation for local devices with diffusion transformer footprints of just 0.93 GB and 1.21 GB, respectively. So tiny!

Posted by Addyad@reddit | LocalLLaMA | 11 comments

https://prismml.com/news/bonsai-image-4b

dtdisapointingresult@reddit

The 1 GB is just for the diffusion model. The combined weights, including the VAE and text encoder, actually come to 3.5-4 GB, so don't expect to trivially use this on a phone unless it's top of the line.

ANR2ME@reddit

But they don't all need to be loaded at the same time, do they? 🤔 The text encoder can be unloaded before the diffusion model is loaded.
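
To make the point concrete, here is a rough sketch of the arithmetic. Only the 0.93 GB diffusion transformer figure comes from the post title; the text encoder and VAE sizes are made-up placeholders picked so the total lands near the 3.5 GB mentioned above, and the function names are hypothetical. It also ignores activations and any cached prompt embeddings, so treat it as an estimate of weight memory only.

```python
# Weight-only memory estimate for loading pipeline components together
# versus one stage at a time. Sizes are in GB.

def peak_all_at_once(sizes_gb):
    """Peak weight memory when every component is resident together."""
    return sum(sizes_gb.values())


def peak_sequential(stages, sizes_gb):
    """Peak weight memory when each stage is unloaded before the next loads."""
    return max(sum(sizes_gb[name] for name in stage) for stage in stages)


sizes_gb = {
    "text_encoder": 2.4,            # ASSUMED placeholder, not a published figure
    "diffusion_transformer": 0.93,  # 1-bit model size from the post title
    "vae": 0.2,                     # ASSUMED placeholder, not a published figure
}

# One component per stage: encode the prompt, denoise, then decode.
stages = [["text_encoder"], ["diffusion_transformer"], ["vae"]]

together = peak_all_at_once(sizes_gb)
one_by_one = peak_sequential(stages, sizes_gb)
print(f"all at once: {together:.2f} GB, sequential: {one_by_one:.2f} GB")
```

With sequential loading the peak is set by the largest single stage rather than the sum, at the cost of reloading the text encoder for every new prompt.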

Addyad@reddit (OP)

Hahaha, you wish. I am testing it. It certainly has guardrails.

Jester14@reddit

Is this an ad? Couldn't be any more low effort. And this shit was released last week and had a bunch of posts about it then.

Addyad@reddit (OP)

Are you going to pay me, bruv? If so, I can write you a detailed report. The image summarizes the image generation capabilities, and I have given the source if anyone is interested in checking it out. Yes, it got released a week ago, but no one was talking about it, so I thought I could share. When I searched "bonsai image" in this subreddit, there was 1 post. Sorry if you are having a bad day. You don't have to be an asshat.

Nnyan@reddit

If you need to be paid to put in the minimal effort, then why post at all?

JavierJV@reddit

Very long title, just one promotional image taken from the website itself, and only a URL in the post body. No experience, you didn't try it, all you do is share a news item in an uninteresting way. You seem like a bot.

At least the post from 7 days ago (https://www.reddit.com/r/LocalLLaMA/comments/1togflk/prismml_just_released_binary_and_ternary_bonsai/) provided links not only to try it but also to Hugging Face.

No-Marionberry-772@reddit

Thanks for the info, I hadn't heard about this one. They don't make it clear if it's usable on other platforms.

Unfortunately, you can't use the ternary LLM outside of Mac devices.

Addyad@reddit (OP)

They provide an option for Windows and Linux (https://huggingface.co/prism-ml/bonsai-image-ternary-4B-gemlite-2bit), which should work. But I'm new to diffusion models, so I don't know much about how to run it. I have tried their Bonsai 1-bit text model. It was okay-ish but not intelligent enough for basic tasks, so I switched to Gemma4 E2B. I am currently running my own local LLM assistant based on that Gemma4. He is good for text, audio, and image-to-text workloads. I have connected him to Discord, a Word plugin using VBA, and so on. But Gemma cannot generate images yet, so I am thinking of adding this image generation model to generate images alongside Gemma4. Both should fit in my 8 GB of VRAM.

M4GMaR@reddit

Low effort post. If you tried it then you should at least post an image of your own generations... The image you posted here comes directly from their website.