TheaterFire

TransPixar: a new generative model that preserves transparency

Posted by umarmnaq@reddit | LocalLLaMA | View on Reddit | 58 comments

58 Comments

justalittletest123@reddit

No way, this is absolutely amazing!
View on Reddit #45358255

jiahaooo@reddit

Impressive, it’s perfect for generating game assets
View on Reddit #45178605

Lost_Cyborg@reddit

seems to be too early for that. Resolution is too low.
View on Reddit #45186064

UnkarsThug@reddit

Alternatively, we need to lower the resolution even further, so it can do pixel art. 
View on Reddit #45192730

NotRandomseer@reddit

Lowering the resolution gives shit pixel art, you need something trained on pixel art
View on Reddit #45286539

Wickedinteresting@reddit

You CAN do some cool stuff by trying different scaling methods and dithering in post! Esp by going far smaller than you need, then re-expanding. Turning off any scaling algos or optimization so it just purely scales up big square pixels. I’m describing it badly but it’s a fun technique for pixelating stuff
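A minimal sketch of the shrink-then-re-expand idea described above, in plain Python on a nested list of pixel values. The function names are hypothetical, the "image" is a toy grayscale grid, and dithering is left out; a real workflow would more likely use an image editor or imaging library with nearest-neighbor scaling.

```python
def downscale_nearest(pixels, factor):
    """Keep one sample per factor x factor block, with no smoothing."""
    return [row[::factor] for row in pixels[::factor]]


def upscale_nearest(pixels, factor):
    """Repeat every pixel factor times in both directions (big square pixels)."""
    out = []
    for row in pixels:
        wide = [p for p in row for _ in range(factor)]
        # Emit `factor` independent copies of the widened row.
        out.extend(list(wide) for _ in range(factor))
    return out


def pixelate(pixels, factor):
    """Shrink far below the target size, then blow back up without interpolation."""
    return upscale_nearest(downscale_nearest(pixels, factor), factor)


# Toy 4x4 grayscale gradient: values 0..15.
image = [[x + 4 * y for x in range(4)] for y in range(4)]
blocky = pixelate(image, 2)
for row in blocky:
    print(row)
```

The key point is that the upscale step only copies pixels, so no scaling filter blurs the block edges.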
View on Reddit #45335039

UnkarsThug@reddit

Yes. Trained on low resolution pixel art.
View on Reddit #45327090

fullouterjoin@reddit

Real artists ship
View on Reddit #45198023

MoffKalast@reddit

*salesman slaps roof of half unfinished game* Ship it!
View on Reddit #45220897

Colecoman1982@reddit

Does TransPixar not already let you set the final resolution for the content it creates?
View on Reddit #45201274

syrupsweety@reddit

the duality of man: https://preview.redd.it/g2n8aheh05ce1.png?width=1152&format=png&auto=webp&s=ef6a17066829b5e9758df74ada83dba129afe219
View on Reddit #45282456

umarmnaq@reddit (OP)

Github: [https://github.com/wileewang/TransPixar](https://github.com/wileewang/TransPixar)
Arxiv: [https://arxiv.org/abs/2501.03006](https://arxiv.org/abs/2501.03006)
Demo: [https://huggingface.co/spaces/wileewang/TransPixar](https://huggingface.co/spaces/wileewang/TransPixar)
Model: [https://huggingface.co/wileewang/TransPixar](https://huggingface.co/wileewang/TransPixar)
View on Reddit #45173378

troop99@reddit

the demo only says "The requested GPU duration (300s) is larger than the maximum allowed"
View on Reddit #45185186

vTuanpham@reddit

He has a hf subscription
View on Reddit #45277591

umarmnaq@reddit (OP)

Strange... it's working for me
View on Reddit #45185573

troop99@reddit

Try it on another device or in a private tab; it's still the same for me, unfortunately
View on Reddit #45222491

Journeyj012@reddit

lmao the username is wilee wang
View on Reddit #45210168

big_ass_grey_car@reddit

Strange they chose to include a billion-dollar animation studio’s trademark in their name
View on Reddit #45184345

auradragon1@reddit

Developers are not good at naming things.
View on Reddit #45184533

YearnMar10@reddit

But they could have asked an LLM for a good name :)
View on Reddit #45276504

FaceDeer@reddit

There are only two hard things in Computer Science: cache invalidation and naming things.
View on Reddit #45207381

Soft_Importance_8613@reddit

There are only two hard things in Computer Science: cache invalidation, naming things, and off by one errors
View on Reddit #45219384

llamabott@reddit

Actually, there are four hard things in-- never mind.
View on Reddit #45222872

Colecoman1982@reddit

Gotta hit that lawsuit quota...
View on Reddit #45201403

10minOfNamingMyAcc@reddit

And Disney's for some reason... /j
View on Reddit #45184510

big_ass_grey_car@reddit

what?
View on Reddit #45184828

10minOfNamingMyAcc@reddit

"Trans" but eh, it's a shitty joke.
View on Reddit #45185157

big_ass_grey_car@reddit

So you’re transphobic and an asshole, got it. You knew it wasn’t funny, but the 14 year old edgelord in you just couldn’t resist.
View on Reddit #45186352

10minOfNamingMyAcc@reddit

👍
View on Reddit #45186575

pooppooppoopie@reddit

Has a bunch of uses immediately
View on Reddit #45275975

AssistBorn4589@reddit

That's quite cool. It can do images, right?
View on Reddit #45181282

searcher1k@reddit

LayerDiffuse has been able to do transparent images for at least a year: [GitHub - lllyasviel/LayerDiffuse\_DiffusersCLI: LayerDiffuse in pure diffusers without any GUI](https://github.com/lllyasviel/LayerDiffuse_DiffusersCLI)
View on Reddit #45272377

TheDailySpank@reddit

Not sure about this model yet, but BEN (background eraser network) is really good at masking backgrounds away from images.
View on Reddit #45197192

AssistBorn4589@reddit

BEN is something else: it takes an existing image and attempts to detect what is background. It often erases a bit too much or needs to be fixed manually, and it's not much better than the tool already integrated into Krita.
View on Reddit #45212994

TheDailySpank@reddit

True. Different tools have different uses. Where I think BEN excels is in wispy gradient shit like hair (I've been working with a lot of hair lately). It's the only one that gets it to the quality I need consistently. I also use traditional segmentation pipelines when working on more complex masking setups or just plain ol' REMBG (when I need something fast). I do a lot of photogrammetry and 3DGS and these segmentation/masking tools have saved me countless hours of manual labor even compared to the initial learning curve.
View on Reddit #45219304

Eralyon@reddit

If you need still images with transparency, SD Forge does it with a plugin. (I forgot the name of it) But I remember installing it through the interface using the github link, and it worked as soon as I understood how to use it... It was with SDXL models.
View on Reddit #45201316

Any-Conference1005@reddit

[https://github.com/lllyasviel/LayerDiffuse](https://github.com/lllyasviel/LayerDiffuse)
View on Reddit #45211081

Eralyon@reddit

^^ yes, this one.
View on Reddit #45218413

ThatInternetGuy@reddit

RemindMe! 3 months
View on Reddit #45270845

bot_exe@reddit

now this is the kind of thing that can be used for specialized creative tools that artists will come to appreciate, at least those who have not been infected by the anti-ai mind virus.
View on Reddit #45248784

maddogawl@reddit

I know it's early, but dang, this is showing real promise. Nice work on this!
View on Reddit #45232933

Former-Ad-5757@reddit

Am I wrong or is it just randomly ignoring the prompt in the demo video? If the prompt is "A forest floor being consumed by spreading magical fire" Then I would expect a forest floor somewhere. If the prompt is "Water splattering in mid-air" Then I would expect some air.
View on Reddit #45186383

procraftermc@reddit

mid-air probably just means floating in the middle. it can't exactly portray an invisible gas after all.
View on Reddit #45221242

Former-Ad-5757@reddit

Ask any other image or video model to portray air and it will portray something. This model (from the demo video at least) seems to just make the largest object transparent. It is impressive, but it also seems difficult to get the video you want; perhaps in the next run it makes the water transparent and shows the air.
View on Reddit #45227719

Zealousideal-Cut590@reddit

This is rad. Can't wait for it to appear in a video editing software near you.
View on Reddit #45221941

madaradess007@reddit

now this could be useful
View on Reddit #45178793

YRUTROLLINGURSELF@reddit

Could? Can we drop the bullshit, we *all* know this stands to become the manhattan project of generative cum technology
View on Reddit #45204091

mutes-bits@reddit

RemindMe! 2 months
View on Reddit #45203080

RemindMeBot@reddit

I will be messaging you in 2 months on [**2025-03-09 15:37:47 UTC**](http://www.wolframalpha.com/input/?i=2025-03-09%2015:37:47%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/LocalLLaMA/comments/1hx7421/transpixar_a_new_generative_model_that_preserves/m68lfoh/?context=3).
View on Reddit #45203145

parzival-jung@reddit

can’t this be used to make textures and stuff like that? can it handle layers of transparency / opacity settings?
View on Reddit #45201329

Fun_Yam_6721@reddit

this seems like it will help physics modeling
View on Reddit #45199582

GammaScorpii@reddit

What res?
View on Reddit #45190409

SgathTriallair@reddit

Was this something that was difficult for AI before? I haven't played enough with AI video to know what it is and isn't good at.
View on Reddit #45184154

mikael110@reddit

Most AI models that process video and photo can only produce RGB output. To produce/maintain transparency they have to output RGBA. In simplified terms the reason for this is that adding an additional image channel that has to be processed adds complexity and processing work, regardless of whether the thing you are processing really needs transparency or not. And given that over 90% of images and video don't contain transparency, it makes sense that people training models would choose to exclude it.
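To make the RGB vs RGBA distinction concrete, here is a minimal sketch of what the extra alpha channel buys you: the same RGBA pixel can be blended over any background after generation. It assumes 8-bit channels and straight (non-premultiplied) alpha; `composite_over` is a hypothetical helper name, and this illustrates standard "over" compositing, not how TransPixar works internally.

```python
def composite_over(fg_rgba, bg_rgb):
    """Blend one straight-alpha RGBA pixel over an opaque RGB background pixel.

    Uses integer math with rounding: out = (fg * a + bg * (255 - a)) / 255.
    """
    *fg_rgb, alpha = fg_rgba
    return tuple(
        (fg * alpha + bg * (255 - alpha) + 127) // 255
        for fg, bg in zip(fg_rgb, bg_rgb)
    )


# An RGB pixel has 3 channels; an RGBA pixel carries a 4th one for opacity.
opaque_red = (255, 0, 0, 255)
half_red = (255, 0, 0, 128)
clear_red = (255, 0, 0, 0)
blue_background = (0, 0, 255)

for pixel in (opaque_red, half_red, clear_red):
    print(pixel, "->", composite_over(pixel, blue_background))
```

An RGB-only model has no equivalent of the fourth value, so the background is baked into the output and has to be masked out afterwards.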
View on Reddit #45186651

ApplePenguinBaguette@reddit

It couldn't do it at all as far as I'm aware
View on Reddit #45184897

sunshinecheung@reddit

wow
View on Reddit #45186648

Roth_Skyfire@reddit

That's actually super useful.
View on Reddit #45185674