TransPixar: a new generative model that preserves transparency

[-]

justalittletest123@reddit

No way, this is absolutely amazing!

Reply

[-]

jiahaooo@reddit

Impressive, it’s perfect for generating game assets

Reply

[-]

Lost_Cyborg@reddit

seems to be too early for that. Resolution is too low.

Reply

[-]

UnkarsThug@reddit

Alternatively, we need to lower the resolution even further, so it can do pixel art.

Reply

[-]

NotRandomseer@reddit

Lowering the resolution gives shit pixel art, you need something trained on pixel art

Reply

[-]

You CAN do some cool stuff by trying different scaling methods and dithering in post! Especially by going far smaller than you need, then re-expanding with any scaling algorithms or optimization turned off, so it purely scales up big square pixels. I’m describing it badly, but it’s a fun technique for pixelating stuff.
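A minimal sketch of the shrink-then-re-expand idea, assuming the image is an H x W x C NumPy array. The function name `pixelate` and the `factor` parameter are made up for illustration, block averaging is just one possible way to downscale, and dithering is left out:

```python
import numpy as np


def pixelate(image: np.ndarray, factor: int) -> np.ndarray:
    """Shrink an H x W x C image by `factor`, then blow it back up as square pixels."""
    h, w, c = image.shape
    # Crop so both dimensions divide evenly by the factor.
    h2, w2 = h - h % factor, w - w % factor
    cropped = image[:h2, :w2].astype(np.float64)
    # Downscale: average each factor x factor block into one "big pixel".
    small = cropped.reshape(h2 // factor, factor, w2 // factor, factor, c).mean(axis=(1, 3))
    # Upscale: repeat each pixel with no smoothing (nearest-neighbor).
    big = small.repeat(factor, axis=0).repeat(factor, axis=1)
    return big.round().astype(image.dtype)
```

The upscale step is the "turn off any scaling algos" part: every output pixel is a plain copy of one downscaled pixel, so the blocks keep hard edges.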

Reply

[-]

UnkarsThug@reddit

Yes. Trained on low resolution pixel art.

Reply

[-]

fullouterjoin@reddit

Real artists ship

Reply

[-]

MoffKalast@reddit

*salesman slaps roof of half unfinished game* Ship it!

Reply

[-]

Colecoman1982@reddit

Does TransPixar not already let you set the final resolution for the content it creates?

Reply

[-]

syrupsweety@reddit

the duality of man: https://preview.redd.it/g2n8aheh05ce1.png?width=1152&format=png&auto=webp&s=ef6a17066829b5e9758df74ada83dba129afe219

Reply

[-]

umarmnaq@reddit (OP)

Github: [https://github.com/wileewang/TransPixar](https://github.com/wileewang/TransPixar) Arxiv: [https://arxiv.org/abs/2501.03006](https://arxiv.org/abs/2501.03006) Demo: [https://huggingface.co/spaces/wileewang/TransPixar](https://huggingface.co/spaces/wileewang/TransPixar) Model: [https://huggingface.co/wileewang/TransPixar](https://huggingface.co/wileewang/TransPixar)

Reply

[-]

troop99@reddit

the demo only says "The requested GPU duration (300s) is larger than the maximum allowed"

Reply

[-]

vTuanpham@reddit

He has an HF subscription

Reply

[-]

umarmnaq@reddit (OP)

Strange... it's working for me

Reply

[-]

troop99@reddit

Try it on another device or in a private tab. It's still the same for me, unfortunately.

Reply

[-]

Journeyj012@reddit

lmao the username is wilee wang

Reply

[-]

big_ass_grey_car@reddit

Strange they chose to include a billion-dollar animation studio’s trademark in their name

Reply

[-]

auradragon1@reddit

Developers are not good at naming things.

Reply

[-]

YearnMar10@reddit

But they could have asked an LLM for a good name :)

Reply

[-]

FaceDeer@reddit

There are only two hard things in Computer Science: cache invalidation and naming things.

Reply

[-]

Soft_Importance_8613@reddit

There are only two hard things in Computer Science: cache invalidation, naming things, and off by one errors

Reply

[-]

llamabott@reddit

Actually, there are four hard things in-- never mind.

Reply

[-]

Colecoman1982@reddit

Gotta hit that lawsuit quota...

Reply

[-]

10minOfNamingMyAcc@reddit

And Disney's for some reason... /j

Reply

[-]

big_ass_grey_car@reddit

what?

Reply

[-]

10minOfNamingMyAcc@reddit

"Trans" but eh, it's a shitty joke.

Reply

[-]

big_ass_grey_car@reddit

So you’re transphobic and an asshole, got it. You knew it wasn’t funny, but the 14 year old edgelord in you just couldn’t resist.

Reply

[-]

10minOfNamingMyAcc@reddit

👍

Reply

[-]

pooppooppoopie@reddit

Has a bunch of uses immediately

Reply

[-]

AssistBorn4589@reddit

That's quite cool. It can do images, right?

Reply

[-]

searcher1k@reddit

LayerDiffuse has been able to do transparent images for at least a year: [GitHub - lllyasviel/LayerDiffuse\_DiffusersCLI: LayerDiffuse in pure diffusers without any GUI](https://github.com/lllyasviel/LayerDiffuse_DiffusersCLI)

Reply

[-]

TheDailySpank@reddit

Not sure about this model yet, but BEN (background eraser network) is really good at masking backgrounds away from images.

Reply

[-]

AssistBorn4589@reddit

BEN is something else: it takes an existing image and attempts to detect what is background. It often erases a bit too much or needs to be fixed manually, and it's not much better than the tool already integrated into Krita.

Reply

[-]

TheDailySpank@reddit

True. Different tools have different uses. Where I think BEN excels is in wispy gradient shit like hair (I've been working with a lot of hair lately). It's the only one that gets it to the quality I need consistently. I also use traditional segmentation pipelines when working on more complex masking setups or just plain ol' REMBG (when I need something fast). I do a lot of photogrammetry and 3DGS and these segmentation/masking tools have saved me countless hours of manual labor even compared to the initial learning curve.

Reply

[-]

Eralyon@reddit

If you need still images with transparency, SD Forge does it with a plugin. (I forgot the name of it) But I remember installing it through the interface using the github link, and it worked as soon as I understood how to use it... It was with SDXL models.

Reply

[-]

Any-Conference1005@reddit

[https://github.com/lllyasviel/LayerDiffuse](https://github.com/lllyasviel/LayerDiffuse)

Reply

[-]

Eralyon@reddit

^^ yes this one.

Reply

[-]

ThatInternetGuy@reddit

RemindMe! 3 months

Reply

[-]

bot_exe@reddit

now this is the kind of thing that can be used for specialized creative tools that artists will come to appreciate, at least those who have not been infected by the anti-ai mind virus.

Reply

[-]

maddogawl@reddit

I know it's early, but dang, this is showing real promise. Nice work on this!

Reply

[-]

Former-Ad-5757@reddit

Am I wrong or is it just randomly ignoring the prompt in the demo video? If the prompt is "A forest floor being consumed by spreading magical fire" Then I would expect a forest floor somewhere. If the prompt is "Water splattering in mid-air" Then I would expect some air.

Reply

[-]

procraftermc@reddit

mid-air probably just means floating in the middle. it can't exactly portray an invisible gas after all.

Reply

[-]

Former-Ad-5757@reddit

Ask any other image or video model to portray air and it will portray something; this model (from the demo video at least) seems to just make the largest object transparent. It is impressive, but it also seems difficult to get the video you want. Perhaps on the next run it makes the water transparent and shows the air.

Reply

[-]

Zealousideal-Cut590@reddit

This is rad. Can't wait for it to appear in a video editing software near you.

Reply

[-]

madaradess007@reddit

now this could be useful

Reply

[-]

YRUTROLLINGURSELF@reddit

Could? Can we drop the bullshit, we *all* know this stands to become the manhattan project of generative cum technology

Reply

[-]

mutes-bits@reddit

RemindMe! 2 months

Reply

[-]

RemindMeBot@reddit

I will be messaging you in 2 months on [**2025-03-09 15:37:47 UTC**](http://www.wolframalpha.com/input/?i=2025-03-09%2015:37:47%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/LocalLLaMA/comments/1hx7421/transpixar_a_new_generative_model_that_preserves/m68lfoh/?context=3)

Reply

[-]

parzival-jung@reddit

can’t this be used to make textures and stuff like that? can it handle layers of transparency / opacity settings?

Reply

[-]

Fun_Yam_6721@reddit

this seems like it will help physics modeling

Reply

[-]

GammaScorpii@reddit

What res?

Reply

[-]

SgathTriallair@reddit

Was this something that was difficult for AI before? I haven't played enough with AI video to know what it is and isn't good at.

Reply

[-]

mikael110@reddit

Most AI models that process video and photo can only produce RGB output. To produce/maintain transparency they have to output RGBA. In simplified terms the reason for this is that adding an additional image channel that has to be processed adds complexity and processing work, regardless of whether the thing you are processing really needs transparency or not. And given that over 90% of images and video don't contain transparency, it makes sense that people training models would choose to exclude it.
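To make the RGB vs. RGBA point concrete, here is a rough sketch in plain NumPy. It is not TransPixar's code; the function name `composite_over` is made up for illustration, and it assumes float images in [0, 1] with straight (non-premultiplied) alpha. The fourth channel is the only place transparency lives, and it only takes effect once the frame is blended over a background:

```python
import numpy as np


def composite_over(rgba: np.ndarray, background_rgb: np.ndarray) -> np.ndarray:
    """Alpha-blend an H x W x 4 RGBA image over an H x W x 3 RGB background."""
    rgb, alpha = rgba[..., :3], rgba[..., 3:4]
    # alpha = 1 keeps the foreground, alpha = 0 shows the background.
    return alpha * rgb + (1.0 - alpha) * background_rgb


# One row of three red pixels: opaque, fully transparent, half transparent.
foreground = np.array([[[1.0, 0.0, 0.0, 1.0],
                        [1.0, 0.0, 0.0, 0.0],
                        [1.0, 0.0, 0.0, 0.5]]])
# Solid blue background of the same height and width.
background = np.zeros((1, 3, 3))
background[..., 2] = 1.0

blended = composite_over(foreground, background)
```

An RGB-only model has no way to express the `alpha` array at all, and carrying it means every frame has four channels to predict instead of three.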

Reply

[-]

ApplePenguinBaguette@reddit

It couldn't do it at all as far as I'm aware

Reply

[-]

sunshinecheung@reddit

wow

Reply

[-]

Roth_Skyfire@reddit

That's actually super useful.

Reply

58 Comments