NVIDIA shows Neural Texture Compression cutting VRAM from 6.5GB to 970MB
Posted by No-Improvement-8316@reddit | hardware | View on Reddit | 317 comments
Seanspeed@reddit
It's definitely promising tech in the long run.
We better hope that AMD's next-gen hardware, which will be used for the new consoles, is capable of it as well so it can actually be adopted by devs properly, since these technologies don't seem like the sort of thing you can relatively easily just inject into the pipeline à la regular DLSS. I know AMD is also working on most of this stuff; it's just a matter of having it ready for RDNA 5 and the PlayStation 6 and whatnot.
Because if this technology isn't standardized, we're all still gonna need ever more RAM in general.
MrMPFR@reddit
100% RDNA 5 will have this.
I also saw AMD using block compression to encode neural weights. That also benefits tiny ML models like MLPs: https://patentscope.wipo.int/search/en/detail.jsf?docId=US476835527
This might be one aspect of universal compression.
Ebih@reddit
Does RDNA 4 support cooperative vectors? I'm wondering what sort of shelf life that architecture will have on Linux
MrMPFR@reddit
IIRC they have support for all the new SM 6.10 stuff except OMM and SER.
Subpar. RDNA 5 is the fine wine long term µarch.
Ebih@reddit
Do you know how SER support differs between Direct X and Vulkan? I'm wondering how much the Steam Machine will be able to benefit from being Vulkan based?! Cooperative Matrix etc...
Boosting Ray Tracing Performance with Shader Execution Reordering: Introducing VK_EXT_ray_tracing_invocation_reorder
MrMPFR@reddit
Rn Vulkan is just NVIDIA extensions except the one you linked to, IIRC. Always behind IHVs and MS. AMD isn't bothering because RDNA 4 doesn't support OMM or SER.
Also the Steam Machine is RDNA 3, so very, very weak matmul. It's not going to be able to do anything beyond simple ML; NTC and the other stuff are too demanding.
I doubt it; you need RDNA 5 or next-gen NVIDIA to do this stuff easily.
Ebih@reddit
AMD support is mentioned in that article, so I'm not sure how the two differ? Can "out of order" memory access be leveraged to do something similar on the hardware front? Would they both offer software SER if not? I'm also not sure how much neural texture block compression differs from NTC?!
MrMPFR@reddit
They technically support it but don't reorder threads, so there's no perf gain. No, OoO memory access is useless here, and SW can't do it properly. You need HW, which only Intel and Nvidia have rn.
NTBC uses ML to compress BCn further. NTC, whether Nvidia's or Intel's, requires matmul logic. Very inference-heavy.
Ebih@reddit
Interestingly, it seems like VK_EXT_ray_tracing_invocation_reorder was added to the 26.2.1 Vulkan drivers in February.
"The ray tracing pipeline API provides some ability to reorder for locality, but it is useful to have more control over how the reordering happens and what information is included in the reordering. The shader API provides a hit object to contain result information from the hit which can be used as part of the explicit sorting plus options that contain an integer for hint bits to use to add more coherency."
I'm not sure how this differs from the "Limitation: “MaybeReorderThreads” does not move threads"?!
GARGEAN@reddit
>We better hope that AMD's next gen hardware that will be used for new consoles will be capable of it as well
All that tech, alongside most of the other cool stuff NVidia is working on, is part of Cooperative Vectors: basically a universal integration API. So as long as new AMD hardware accepts Cooperative Vectors (and it 100% will), that stuff will work on AMD.
Seanspeed@reddit
Yea, I'm cautiously optimistic.
But still cautious. Cuz it's really an important part of all this if we want it to be used in a significant way.
MrMPFR@reddit
Yeah. SM 6.10 is confirmed, which also covers the pure matmul stuff beyond Cooperative Vectors that I can't recall the name of. Preview late summer, shipping EoY 2026. It's all in the recent NVIDIA GDC uploads on the YT dev channel.
Also RTX MG is getting standardized along with other stuff. SM 6.10 will be a huge deal.
harkat82@reddit
Pretty sure the next gen AMD hardware should be capable of something similar. I feel like I've heard something about NTC tech being used with the next Xbox which is RDNA 5 but I can't remember where I heard that.
MrMPFR@reddit
Xbox's Jason Ronald's presentation at GDC heavily hinted at this. It was literally on one of the slides.
capybooya@reddit
Since graphics is changing rapidly with ML/AI, can we be sure there would be no other use for plenty of VRAM than textures? I mean, even if this tech was adopted today I would not really expect VRAM to go down, just stagnate, but looking far ahead into the future there sure could be other reasons to have plenty of VRAM?
rain3h@reddit
Frame gen uses vram, DLSS5 will use a truck load of vram.
While the consensus is that these are bad, they are the future and NTC leaves much more overhead available for them.
GARGEAN@reddit
Disregarding DLSS 5, the consensus that framegen is "bad" only exists within small die-hard corners. On the whole it's good tech with clear use cases.
MrMPFR@reddit
An interesting tech held back by fundamental issues like the latency aspect. Still in the beta stage.
By the time they do anything like what's proposed here with reprojection that will completely change the game and should end all arguments against framegen:
https://patentscope.wipo.int/search/en/detail.jsf?docId=US476835821
Based on what I've seen indicated with RDNA 5 now I think we're getting that breakthrough nextgen. That's the killer feature of 60 series. Framewarp + framegen.
MrMPFR@reddit
Procedural assets, work graphs, ML asset compression all reduce VRAM at iso-fidelity. Like u/Seanspeed said devs will have many levers to pull.
It will be interesting to see how the nextgen consoles spend their VRAM budget.
Seanspeed@reddit
Well it's not just textures, it's draw distances and general environmental density and all this stuff. Virtual geometry ala Nanite is also mildly heavy on VRAM. These things aren't purely VRAM-related the same way textures are, but making more room to push other aspects of graphics would definitely be a way to take advantage of this technology without necessarily reducing VRAM requirements outright. It's been pretty standard for a long time in development that if you give developers greater capabilities and better optimizations, they're often gonna find ways to use that headroom to push ambitions rather than just reduce hardware requirements.
titanking4@reddit
These “Neural techniques” are all within the research, and each happens to have a different structure in execution and resources.
You have the ML "post-processing" effects like FSR or DLSS. And then you have the ML "inline", where you run weight training and inference in the actual rendering path. Ray tracing and path tracing are of course another characterization of workload.
NTC still early as it needs to show clear advantages over the current BC7 algorithm in terms of compression ratio, information preservation, and execution efficiency.
But we are slowly getting there as “execution” becomes cheaper every generation, while “memory” capacity and BW utilization become more important.
Being able to cut memory BW utilization saves huge amount of costs for products and enables games to have super high fidelity textures become usable on even low vram products.
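For context on the compression-ratio comparison, here's a rough back-of-envelope sketch (a minimal Python illustration, not from the thread: BC7's 8 bits/texel is from the format spec, while the ~1 bit/texel neural rate is an assumed illustrative figure):

```python
# BC7 is fixed-rate: every 4x4 texel block is 128 bits (16 bytes),
# i.e. 8 bits per texel. Neural compression papers target sub-1-bpt
# rates; 1.0 bpt here is an assumed illustrative number.

def bc7_bits_per_texel() -> float:
    block_bytes = 16          # one BC7 block is always 16 bytes
    texels_per_block = 4 * 4  # covering a 4x4 tile
    return block_bytes * 8 / texels_per_block

def texture_size_mb(width: int, height: int, bits_per_texel: float) -> float:
    return width * height * bits_per_texel / 8 / 1024 / 1024

bc7 = texture_size_mb(4096, 4096, bc7_bits_per_texel())  # 16.0 MB
ntc = texture_size_mb(4096, 4096, 1.0)                   # 2.0 MB at ~1 bpt
print(f"4K texture: BC7 ~{bc7:.1f} MB, ~1 bpt neural ~{ntc:.1f} MB")
```

The point being that BCn's fixed rate is the baseline any neural codec has to beat, not raw uncompressed texels.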
MrChocodemon@reddit
Cute, how's it look in motion and will it be vendor agnostic?
Sopel97@reddit
like any other texture?
MrChocodemon@reddit
Because it is neural encoded on the fly. Seeing how neural processes flicker with ray reconstruction and image reconstruction I am not confident that on the fly image compression will be super stable. I expect a lot of artifacting that they conveniently didn't talk about.
Sopel97@reddit
There's nothing "on the fly" about this. It's deterministic compression.
ShadyMagician@reddit
Stop bro, don't ask actual questions lest you be deemed a hater. Vendor agnostic in the big 2026? Where's my DLSS 6 neural automatic game player? I just wanna look at my games, not play em.
ResponsibleJudge3172@reddit
It's not temporal. Motion has 0 effect on it. And no, Nvidia didn't give AMD and Intel free research value
dampflokfreund@reddit
It's interesting how they show these technologies off with an RTX 5090. Something tells me that current GPUs will have trouble running these AI technologies in real time while rendering the game at the same time. My feeling is it might be an RTX 60 series exclusive feature, or just run very slowly on Blackwell and lower.
But man, NTC would be a killer feature for the RTX 60 series, a feature people would actually care about. Under the condition of course, they aren't going to skimp on VRAM because of this tech lol
Fox_Soul@reddit
The 6090 will probably have the same VRAM as the 5090. The other 60 series models will probably have the same, or lower since... well you dont need it anymore! Also it only works on new releases. There will only be 3 releases that year that support it and then you'll have to wait 8 years for the majority of games to support it.
You will own nothing and will be happy about it.
Nicholas-Steel@reddit
I think what they're saying is the 6000 series will feature a notable upgrade to the Tensor Cores to properly facilitate the AI features.
Seanspeed@reddit
The tensor cores have been the one aspect of Nvidia's architectural generations that have improved a fair bit, but the problem is that they've done so heavily based on increasing support of lower level precision acceleration that AI can subsist on. Which is ultimately just low hanging fruit.
But once that low hanging fruit is picked, which I think we're getting very much towards, it's much harder to make the same kind of gains.
ResponsibleJudge3172@reddit
30 series and 40 series doubled individual tensor core performance at the same FP16 vs previous gens
MrMPFR@reddit
They better do. FP16 dense matmul per SM has been stuck at Turing levels since Turing. They've used tricks like lower precision to drive gains. Time to start redesigning the ML pipeline + beef it up.
They need to because RDNA 5 is likely using a cut down version of CDNA 5 with full feature set.
GARGEAN@reddit
>The other 60 series models will probably have the same, or lower since... well you dont need it anymore!
No. This tech is not a universal post-process API, it requires per-game integration. Old memory hogs will stay the same.
At worst 60 series will have same VRAM as 50 series. No way it will drop below.
DerpSenpai@reddit
They might do more 8GB cards though
MrMPFR@reddit
No, the worst you're getting is 9GB for an anemic 6060 config. They can amputate the mem shoreline with new ultrafast GDDR7 at 36Gbps and 24Gb densities.
It's gonna be 12GB-48GB with 6090 being ludicrously overpriced. $3K prob.
DerpSenpai@reddit
Only tech illiterate people care about GDDR width. The only thing that matters is bandwidth and now we have super fast memory that mid range GPUs don't need.
Reducing bus width actually makes it cheaper. GDDR7 is only expensive because it's new; in 1-2 years it will end up the same price as GDDR6. Right now the difference for 8GB is $10 lmao
MrMPFR@reddit
8GB isn't happening with post 2GB unless it's some crazy +40gbps design based on 4GB densities and 64bit bus. Might happen with 5050 successor xD (7050).
It still proves that core hasn't scaled like it used to, but I know why. Back in the day more BW = more compute.
But 12GB 6060 using 36gbps 32Gb chips over 96bit is totally doable.
Should end up cheaper TBH. New chips are getting much much higher densities.
It was before the entire market went crazy. Hope we see normalization by nextgen.
GARGEAN@reddit
More how? There is 5050, 5060 and 5060Ti 8gb. That's 3 SKUs. Nowhere to squeeze any more for 60 series, unless we will start imagining things like 6050Ti.
BighatNucase@reddit
I don't know why this is said as if it would be some terrible thing. The 3090's VRAM isn't even an issue in any real world gaming use case (no Skyrim with 11 billion mods isn't real world).
sylfy@reddit
The good thing about deep learning models is that they can quantise the models and run them with a lower compute budget, with some tradeoffs of quality for performance. So yes, they’ll obviously show them off on their top end cards for the best results, but there’s no reason they won’t work on previous generations or lower end models.
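The quantization trade-off described above can be sketched concretely. This is a minimal illustration (my own example, not NVIDIA's actual decoder, which uses vendor-tuned FP8/INT8 paths) of symmetric per-tensor INT8 quantization of one small MLP layer:

```python
import numpy as np

# Symmetric per-tensor INT8 quantization of a 64x64 weight matrix:
# 4x smaller in memory, at the cost of bounded rounding error.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)

scale = np.abs(w).max() / 127.0            # map the max magnitude to int8 range
w_q = np.round(w / scale).astype(np.int8)  # quantized weights, 1 byte each
w_dq = w_q.astype(np.float32) * scale      # dequantized reference copy

err = np.abs(w - w_dq).max()               # worst-case rounding error <= scale/2
print(f"{w.nbytes} -> {w_q.nbytes} bytes, max abs error {err:.4f}")
```

The "tradeoffs of quality for performance" show up exactly as that bounded rounding error; whether the lower precision actually runs faster depends on the hardware having native INT8/FP8 paths, which is the point made further down the thread about Ampere.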
nanonan@reddit
Not really. Real time support on 4000 series and up. No support at all below 2000 series.
StickiStickman@reddit
That is literally wrong:
sylfy@reddit
At this point, you’re talking about an 8 year old card.
jocnews@reddit
The problem is requiring compute budget for such a basic level operation as texture sampling, at all.
Regular compression formats get sampled with zero performance hit. Which means this thing will cut into framerate while the GPU vendor pockets the money saved on VRAM.
StickiStickman@reddit
You know what also cuts into framerate? Running out of VRAM.
jocnews@reddit
Yeah but that's irrelevant here.
The issue is that Nvidia kind of has a neural network acceleration hammer in their hands and started to see everything as a "this could use neural networks too" nail. Many things may be (neural materials seem to make sense to me), but IMHO, texture sampling is not.
Vushivushi@reddit
It is absolutely a problem of VRAM capacity.
Memory has become the largest single item in a device's BoM. In a graphics card, it can be as much as half of the total cost. Though we may not always be starved on VRAM within games, the GPU vendors are starved on VRAM as a matter of cost.
In the example they showed, they saved ~5.5GB using NTC. DRAM ASPs are rising to $15/GB. That is >$80 of savings. The additional cost in compute silicon is likely much lower than $80. $80 could get you 40% more area on a 9070XT/5070 Ti.
Reducing the memory dependency also reduces costs on the GPU silicon as they can cut memory bus again. Sound familiar? The GPU vendors have been very prudent in the way they've been cutting the memory bus for low to mid-range GPUs over the years.
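The BoM arithmetic in the comment above checks out; reproducing it (using the comment's own assumed $15/GB DRAM ASP):

```python
# Back-of-envelope savings from the headline figures: 6.5 GB -> 0.97 GB,
# priced at an assumed DRAM ASP of $15/GB (figure from the comment).
vram_saved_gb = 6.5 - 0.97      # ~5.5 GB saved
dram_asp_per_gb = 15.0          # assumed spot price
savings = vram_saved_gb * dram_asp_per_gb
print(f"~${savings:.0f} of DRAM cost saved per card")
```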
StickiStickman@reddit
Do I really need to explain to you how a software solution that reduces texture VRAM 10-20 fold is better than just adding a couple more GB of VRAM on?
dustarma@reddit
Extra VRAM benefits everything, NTC only benefits the particular games it's running in.
StickiStickman@reddit
So? Have fun buying a GPU with 240GB of VRAM I guess if you want 10x gains everywhere?
Vushivushi@reddit
Reducing memory cost is the single most critical thing they can do right now.
elkond@reddit
there's absolutely a reason, it's called quantization lmao
ML models are not recommended across the board not because k is better, but because Ampere cards don't have hardware FP8 support; if you quantize a model to a precision that requires hardware emulation you get fuck-all improvement.
99% chance they are using a 5090 not (well, not entirely) because the models are heavy, but because Blackwell has native FP4 support.
Kryohi@reddit
I highly doubt this is using FP4
MrMPFR@reddit
FP8 and INT8.
94746382926@reddit
Well obviously it's going to be a blackwell and newer feature, but theres no reason a 5060 for example couldn't run it. Is that not a low end card?
elkond@reddit
no but why on earth would you showcase a feature not on a flagship that is driving your highest margins?
Plank_With_A_Nail_In@reddit
It's that small quantized models have a huge decrease in quality, not just "some".
asfsdgwe35r3asfdas23@reddit
Even if it can run, Nvidia will never support old hardware. In the same way, they did not support frame gen on older GPUs when those were perfectly capable of running it. They do the same every generation; there is zero chance they will release this for current GPUs.
StickiStickman@reddit
DLSS 4.5 literally just released on older generations, what are you smoking?
No, they literally aren't. You're just spreading blatant lies.
Seanspeed@reddit
To be clear, we dont know either way. We assume they aren't purely cuz Nvidia has said so.
From the same company who tried to say that a 5070 was as powerful as a 4090.
StickiStickman@reddit
We absolutely do know that older cards don't have the hardware to run it. We know for a fact that the hardware on 2000 and 3000 cards is not fast enough for it to be a net gain.
doscomputer@reddit
the examples in the paper are also from absurdly high detailed models/textures
This is a neat tech but I think actual use cases are limited, seems more a tool for devs who don't want to fine tune any meshes or assets.
IIlIIlIIlIlIIlIIlIIl@reddit
Is it, though? It's an under the hood feature with no real impact to the end user. VRAM usage being the bottleneck in games is an extremely rare situation that only a subset of 4K gamers run into.
Seanspeed@reddit
I mean, if it only works well on 60 series parts and isn't relatively simple to implement, it won't be adopted by devs all that widely. Similarly, if similar tech isn't usable on RDNA5 and the new consoles, devs will be more hesitant to spend the resources to implement it.
I think the benefits here are more long-term, once standardization is achieved. Then it opens up a lot of doors, to make game development a bit easier, to push graphics quite a bit harder in terms of memory footprint, and of course to enable ~~us to not need to buy increasingly higher amounts of VRAM with our GPU's~~ Nvidia to stop giving us more VRAM while still increasing prices and profit margins.
StickiStickman@reddit
People said the exact same about DLSS, yet here we are.
You're forgetting that Nvidia has a 95% market share.
Seanspeed@reddit
I'm so fucking tired of these ultra-generalized 'but people said' stuff to try and dismiss concerns.
*I* never said anything like that. I'm not most people.
MrMPFR@reddit
RDNA5 ML HW is superior to 50 series. Supposedly derived from CDNA5, obviously cut down matmul, VGPR and prob TMEM to avoid exploding area budget. Prob some novel new stuff too.
NVIDIA has been feeding gamers ML scraps since Turing. FP16 dense hasn't gone up per SM basis. Only tricks such as quantization.
Expect RDNA 5 and 60 series to annihilate existing offerings.
100% and while SM 6.10 standardization is great, I'm more interested in DirectX next and co-design with Helix/RDNA 5.
All this stuff they've mentioned so far lowers VRAM footprint. Same with work graphs and procedural assets. I wonder what they'll spend the freed and additional VRAM budget on for nextgen consoles. Gonna be tons of gigabytes to play around with.
Only happening if 6060 is 9GB 96bit design. Nextgen GDDR7 is 3GB density. I hope AMD can force them to stop selling us anemic configs + their offerings are more viable than rn.
witheringsyncopation@reddit
Fucking of course they’re going to skimp on VRAM. They have with every generation to date, and this is even more of an excuse to do so, especially with the insane prices of memory.
capybooya@reddit
Even if everyone started developing with this technology today, there'd still be coming out regular games in 5+ years that need traditional amounts of VRAM. Nvidia is greedy, but not stupid so the worst case is them not increasing VRAM with the 6000 series.
Seanspeed@reddit
I think most people would say that's the same thing as 'skimping' on VRAM.
Outside of flagship GPU's, they've always been bad about this.
abrahamlincoln20@reddit
The leaked specs show they aren't going to skimp on VRAM. Of course, they're just leaks...
GARGEAN@reddit
They are not even leaks. They are poke in the sky based on nothing but vibes. There are no chips taped out to leak them.
ResponsibleJudge3172@reddit
No, they are leaks.
GARGEAN@reddit
Lol. No.
Ok-Parfait-9856@reddit
Sorry to ruin your doomer jerk but no, it will likely work on 4000 series and definitely 5000 series. There’s even a dp4a fallback, suggesting 3000 series support
dampflokfreund@reddit
You can also run ray tracing on a 1080, it just won't be very fast. I assume this will be a similar situation once it gets used in games.
StickiStickman@reddit
Nvidia literally says the minimum is a 1000 series card, but the recommendation is a 4000:
AsrielPlay52@reddit
These are taken from Nvidia NTC SDK itself.
dampflokfreund@reddit
I know that. As I said, it will likely run but it might degrade performance too much on older architectures.
AsrielPlay52@reddit
They wouldn't have said "Recommended: 40 series" if that were the case.
They would have just listed 50 series and newer instead.
Jumpy-Dinner-5001@reddit
Why? That's just normal for tech demos.
Loeki2018@reddit
No, you take the card that would not be able to do it because it's bottlenecked by VRAM and showcase it actually works. Everything runs on a 5090 lol
Adonwen@reddit
That doesn't sell 50 series cards though, that just says your old card still has life. They don't make money on things already paid for.
reallynotnick@reddit
There’s plenty of 50 series cards that don’t have 32GB of VRAM. I mean if the tech demo showed off something that would only run with like 100GB of VRAM on 32GB that could be interesting, otherwise the demo is only academic with no visible benefit on the 5090.
nittanyofthings@reddit
It's probably better to assume existing cards won't really be able to do the real version of this. Like expecting a 1080 to do ray tracing.
dampflokfreund@reddit
Yeah, it will definitely run but be very slow. Similar to how DLSS4.5 runs on Turing and Ampere cards: just too much of a performance hit to be worth it. Although it will still be faster than running out of VRAM on such cards, so there's still a use case for it.
CarsonWentzGOAT1@reddit
Tell me a single tech company that produces their own hardware that does this
Jumpy-Dinner-5001@reddit
No, that's nonsense.
yamidevil@reddit
Yep. Even earlier they said it'll require strength. So the 5060 will benefit from this much more than the 5050, since that's a weaker card.
cultoftheilluminati@reddit
inb4 a 8gb or a 4gb 6090 because "the more you spend, the more you save" in vram. /s
zushiba@reddit
Oh good. NVidia is going to start selling video cards like how all toilet paper is sold now with “4gb = 12gb” plastered on the box.
Aggravating-Dot132@reddit
That feature would require devs to make 2 versions of the same game. One is for normal GPUs and consoles, and another for that feature.
This is a no go, unless that type of tech is wide spread.
SovietMacguyver@reddit
Is this simply discarding detail and then recovering it, lossy like, through an AI model?
ResponsibleJudge3172@reddit
No, the textures are already always compressed. They are now compressed better
SovietMacguyver@reddit
I get that, I'm asking how.
jocnews@reddit
Not mentioned: FPS drop from replacing efficient hardware sampling with invocation of neural networks for every texture.
StickiStickman@reddit
How does it not make sense if it can reduce VRAM requirements 10-fold? What?
Darrelc@reddit
You ever heard the phrase "There's no such thing as a free lunch?" there's definitely not ten free lunches
StickiStickman@reddit
DLSS exists.
Darrelc@reddit
Yep and I use it to upscale 480p to 8k with zero image loss.
Hell yeah free lunches
jocnews@reddit
Performance is always the harder issue.
And no, it won't cut VRAM requirements in actual games anywhere near 10x; that's just in contrived demos made for showcase purposes. (Note that some of the older papers and demos claimed their gains by comparing to uncompressed textures instead of the state-of-the-art compressed textures used in games now, to look better, which is of course completely bogus.)
StickiStickman@reddit
Dude, stop making shit up.
I literally messed around with the SDK - anyone can. It's all public on Github.
A 10x is absolutely doable. In best case scenarios it's much higher than that even.
Also, they always compared to raw texture AND BCn.
Sopel97@reddit
because we all know this cannot be implemented in hardware
jocnews@reddit
It's not, that's why you need cooperative vectors.
In theory, *everything* can be implemented in hardware. In practice, you find out you would have to have every texture sampler have something like a tensor core... and memory to hold the not so small neural network it uses to inference... which it has to swap out often as textures change. Unlikely to be very viable.
Sopel97@reddit
the biggest layer is a linear 64->64, that's 4096 operations. Blackwell tensor cores can do 16384 f8 FLOP per cycle. It's not that outlandish. https://newsletter.semianalysis.com/p/nvidia-tensor-core-evolution-from-volta-to-blackwell -> Tensor Core Size Increases
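Following the comment's own framing (4096 ops for a 64x64 layer, and the 16384 FP8 ops/cycle figure taken from the linked article), the per-layer cost works out to a fraction of a tensor-core cycle:

```python
# Sanity-checking the cited numbers: a dense 64->64 linear layer,
# counted as 64*64 = 4096 operations (the comment's framing), against
# an assumed Blackwell tensor-core rate of 16384 FP8 ops per cycle.
ops_per_layer = 64 * 64          # 4096
fp8_ops_per_cycle = 16384        # figure quoted from the linked article
cycles_per_layer = ops_per_layer / fp8_ops_per_cycle
print(f"{cycles_per_layer:.2f} tensor-core cycles per 64x64 layer")
```

Even granting that a full decode runs several such layers per texel, the arithmetic supports the point that it's not outlandish on recent hardware.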
Psychological-Name-5@reddit
So are we still hating the ai, or now that it gives more performance it's good? Genuinely asking.
S48GS@reddit
"neural compression" exist... since 80s on paper... and since 00 on realtime demos
it just form of compression
modern "AI" has nothing to do with it
and there were always huge limitations and insane "cost" to compress these
today to compress single 512x512 texture (simple lightmap) it need hours on 5090
compressing some more complex detailed like "leaf" or "grass" textures - is days-weeks of time for single texture
to achieve quality - it need insane time
then there limitation of "size" - on modern gpus it is literally few kb - if more than 4kb then gpu out of fast cache and fps will drop to 30 or lower
Nviida own technology they showed in HL2 RTX - streaming of 16k textures - it also fit to small vram and just stream - this is more interesting to me.... if gamedevs actually start dynamic memory management
bubblesort33@reddit
They are going to need to free up every MB of RAM possible to support DLSS5 on 12 GB cards.
StanGoodspeed618@reddit
6.5GB to 970MB is a ~6.7x compression ratio, which is insane for texture data without visible artifacts. The real impact isn't just VRAM savings, it's memory bandwidth. Texture fetches are one of the biggest bottlenecks in rendering pipelines. If this lands in mainstream engines it could fundamentally change what midrange GPUs can deliver.
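The ~6.7x figure quoted here uses decimal gigabytes; a quick check (my own arithmetic, not from the article):

```python
# Headline figures: 6.5 GB down to 970 MB.
# Using decimal GB (1 GB = 1000 MB) gives the ~6.7x quoted in the thread;
# binary GiB would give ~6.9x instead.
before_mb = 6.5 * 1000   # 6500 MB, decimal convention
after_mb = 970
ratio = before_mb / after_mb
print(f"~{ratio:.1f}x compression vs the BC-compressed baseline")
```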
FitCress7497@reddit
All those require implementation from the start of the development right? Not something you can just add like DLSS.
If a game is designed with this, how will it run on older hardware?
hak8or@reddit
This is Nvidia, they couldn't care less about that right now. They will want you to buy a new card, even better if cards are extremely expensive still because then more people will be forced to use their Nvidia cloud subscription instead which is higher margin and more stable of a cash flow for them.
shadowtheimpure@reddit
Nvidia won't care, but developers will as it would severely limit their potential buyers to only people with compatible hardware.
StickiStickman@reddit
Oh no! Only people with a ... Radeon 500 or GTX 1000 GPU can play their game now? Which is everyone?
xHakua@reddit
Sounds like Radeon
r_z_n@reddit
Given that they just released DLSS 4.5 for 20 and 30 series cards, this is objectively untrue.
Demented_CEO@reddit
It's almost like hell has frozen over when even Nvidia is less hostile towards its users than AMD...
Seanspeed@reddit
I mean, Nvidia hasn't even tried to get MFG working on older GPU's. They say it 'requires' newer hardware, but we dont really know that. We only know different for AMD's situation because FSR4 is semi-open source and people used a workaround to get it work for RDNA2/3, with slightly worse quality and worse performance. It's also still entirely possible AMD is still working on getting it released for RDNA2/3 officially, using work done with Sony for FSR4.1 which also would have required a different calculation method than RDNA4.
Offering DLSS4.5 to 20 and 30 series GPUs costs Nvidia nothing, with no additional work required, but it's all the same completely unusable, because the performance hit is too significant to justify using it over even DLSS2, let alone DLSS4 (which also has significantly reduced usefulness).
SecureNet5333@reddit
what? everyone is using dlss 4
Seanspeed@reddit
If you're on a 20-30 series GPU, DLSS4 isn't an inherent win over DLSS2 in any situation. It is sometimes, but not always.
IIlIIlIIlIlIIlIIlIIl@reddit
It's always an inherent win; DLSS4 is miles ahead of 2 (and even 3, and 3 was already noticeably ahead of 2).
4.5 vs. 4 is the only time there has been considerations about which to use.
The performance overhead of 4.5 over 4 on a 30-series is also not that high. Combined with the quality improvements, I personally always force Profile M even if I have to go down a quality level; On my 4K screen with a 3080 I almost always prefer 4.5 (M) on Performance than 4.0 (K) on Balanced.
SecureNet5333@reddit
It's always an inherent win, because you can simply go down to DLSS Balanced and have better image quality with the same performance.
Demented_CEO@reddit
You're conflating so many things here. There's absolutely no such thing as "semi-open source" and if AMD users (e.g. me included as an RX 6900 XT user) always have to wait longer for features that end up being worse with worse performing hardware, then something is absolutely wrong on AMD's side and they haven't projected the most trustworthy image in these times. Even "Ngreedia" seems to accommodate better.
Seanspeed@reddit
The whole reason we got to see how FSR4 worked on older RDNA GPUs is cuz AMD released FSR4's open-source code by mistake. And then people used it to make a DP4a implementation. AMD had always planned to make FSR4 open source in time, but they did so prematurely.
This sub has completely lost all legitimacy. It's just r/nvidia2 at this point.
I'm literally just stating basic facts and y'all are mass downvoting me for it cuz it goes against the narrative y'all want to believe.
Sevastous-of-Caria@reddit
Well, by not renovating its CUDA + tensor core architecture, it has the trade-off of performance stagnation, compared to RDNA 4 playing catch-up.
EdliA@reddit
That has always been the case. People just hype AMD because they want competition, their support however for older GPUs has never been that great.
Jumpy-Dinner-5001@reddit
Always has been
LeadIVTriNitride@reddit
Let’s see the single digits percent of people using 4.5 on Turing and Ampere cards, because the performance is bad relative to output.
GARGEAN@reddit
Meanwhile most of those use 4, which is still better than FSR 4 but has little to no performance overhead.
Creepy_Accountant946@reddit
The point is they gave the option, while AMD's older cards can run the latest FSR but they chose to be greedy and not support them.
airfryerfuntime@reddit
People still wanted it on old cards. Nvidia was like "sure, ok, knock yourselves out", and now they complain that performance sucks. Like, what do you expect? The 20 series is ancient and just doesn't have the hardware to handle DLSS 4.5 very well.
r_z_n@reddit
They’re really old and lack the hardware accelerators, what do you expect? They still gave the user the option.
VaultBoy636@reddit
They released it to avoid bad PR (see fsr 4 around amd). dlss 4.5 tanks performance on 20 and 30 series. The only cards that can reasonably run it are the 3080ti, 3090, 3090ti, but with the performance gain and image quality drop (yes, it's still worse than native), you might as well use dlss 4.0 or play at native
r_z_n@reddit
Yes, because it requires hardware to accelerate it that the cards back then didn’t have. But they still give you the option.
Seanspeed@reddit
I mean, AMD could do the same for FSR4, it would just be pointless. Nvidia aren't even bothering to try and rework DLSS4 or 4.5 into running better on older GPU's. Which is exactly what AMD has to do to make FSR4 useful on RDNA2 or 3.
r_z_n@reddit
From what I have seen and understand (I don't have an RX7000 series card to test personally), the impact from running FSR4 on 6000/7000 series cards is not as significant as running DLSS 4.5 on RTX 20/30 series.
The big difference though is that DLSS 4 is quite good, while FSR1/2/3 are awful, so the value proposition is different.
Creepy_Accountant946@reddit
Nvidia is not AMD though, they do actually support older hardware
Nicholas-Steel@reddit
It's basically using AI as a lossy compression algorithm. It shouldn't need to be implemented early in a project: you take copies of the assets from before they were compressed with the traditional method, re-compress them using this new method, and send 'em on down to gamers' PCs as a patch.
bogglingsnog@reddit
Don't modders already often create compressed textures to reduce vram? I remember doing this for Skyrim and New Vegas back in the day.
philoidiot@reddit
They do, virtually all textures in pc games are in the BCn format. NTC has much better compression ratio at the same quality but it requires more expensive computation at runtime, that's the trade-off.
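To put that trade-off in rough numbers, here's a back-of-envelope sketch. BC7's 1 byte per texel is the standard rate; the NTC ratio used here is only an extrapolation from the article's 6.5GB-to-970MB headline figure, not a per-texture guarantee:

```python
# Back-of-envelope texture sizes for a 4K (4096x4096) RGBA material.
# The NTC figure is illustrative, derived from the quoted 6.5GB -> 970MB demo.

TEXELS = 4096 * 4096

raw_bytes = TEXELS * 4             # RGBA8: 4 bytes per texel, uncompressed
bc7_bytes = TEXELS * 1             # BC7: fixed 1 byte per texel (16 bytes / 4x4 block)
ntc_ratio = 970 / (6.5 * 1024)     # ~0.146, from the demo's 6.5GB -> 970MB
ntc_bytes = bc7_bytes * ntc_ratio  # assuming the ratio is quoted vs. the BCn set

for name, b in [("raw RGBA8", raw_bytes), ("BC7", bc7_bytes), ("NTC (est.)", ntc_bytes)]:
    print(f"{name:>10}: {b / 2**20:7.1f} MiB")
```

The savings multiply across a game's whole texture set, which is how a ~6.5GB budget collapses to under 1GB in the demo; the cost is that BC7's fixed-function decode is replaced by network evaluation.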
zopiac@reddit
This is just computation to load the textures into the scene? Or as a constant draw when NTC is being used?
philoidiot@reddit
Constant draw on each access if they're kept in the small NTC format. You can also decompress them to a more usual format once in vram I believe, but they'll be bigger.
jocnews@reddit
If you decompress them to regular format, you don't save any VRAM during runtime, only disk space.
If you only decompress them on every sampling, you save VRAM footprint of the game, but you lose overall performance because do you really think using neural networks for texture sampling will be free?
StickiStickman@reddit
That is not true at all. You can have much higher detail in NTC textures since they're so much smaller in fact, so quite the opposite.
philoidiot@reddit
I think you meant to answer another post.
jocnews@reddit
nope
philoidiot@reddit
Then why are you explaining things to me that I literally wrote myself lol
Plank_With_A_Nail_In@reddit
The whole point is they stay compressed all of the time, uncompressing them uses VRAM which is the thing we are trying to conserve.
zopiac@reddit
Honestly I skimmed it but nothing stuck. I can read, but apparently comprehension is beyond me. Couldn't watch the video at the time though, I'll own up to that!
AsrielPlay52@reddit
Really depends on the card.
For 20 and 30, it's decompress on load. The benefit is just smaller file sizes.
For 40 and 50, it's real time.
StickiStickman@reddit
Not quite right. NTC can also decompress to BCn on load. It doesn't have to be real-time.
f3n2x@reddit
Virtually every texture in the last 25 years or so has been compressed. S3TC is from 1998.
GARGEAN@reddit
And all those textures in those last 25 years were uncompressed during loading into VRAM and occupied full uncompressed size in VRAM.
NTC doesn't. That's the whole point.
f3n2x@reddit
No they were not. The whole point of those specific formats is to store them compressed in VRAM and have the texture units decompress individual samples without having to decompress the entire buffer. If you just want to save space on disk you might as well use jpg.
GARGEAN@reddit
Hmmm, yeah, it seems it was a big dum-dum on my end. So main point of NTC then is in much superior compression ratios, but without fundamental change in VRAM occupancy logic.
MrMPFR@reddit
Sacrificing ms for VRAM reduction.
But TBH I'm more interested in Neural materials.
f3n2x@reddit
Exactly.
BinaryJay@reddit
Difference is one increases compression ratio at the cost of lowering the quality big time, while the other gives you better ratios without affecting the quality much at all (according to the demos, of course; we don't have any software to see for ourselves yet).
bogglingsnog@reddit
But a key feature of these optimized texture packs is that there is virtually no visual difference but a huge vram reduction. The modder can just add compression up to the point where it starts to visually impact the texture, which at a cursory glance seems to be the same thing the AI tool is doing.
StickiStickman@reddit
Since you got a bunch of answers from people who don't know what they're talking about (including OP):
Yes, it works on older hardware. Kind of.
There are two modes: inference on sample, where there's never an actual texture in VRAM, just an ML model that gets sampled instead; and NTC to BCn on load, which converts the models into normal block-compressed textures in VRAM.
With both you save huge amounts of disk space, but only with the real-time sampling do you also save VRAM. But still: developers could bundle only NTC with their games, and old GPUs can just convert the textures to normal ones on game launch / level loading.
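The first mode can be pictured with a toy numpy sketch. All the shapes, the nearest-neighbor fetch, and the layout are invented for illustration; the real NTC network, feature pyramid, and cooperative-vector shader path differ:

```python
import numpy as np

# Toy illustration of "inference on sample": the texture never exists in
# memory as texels. We keep only a small latent feature grid plus tiny MLP
# weights, and every texture sample runs the network.

rng = np.random.default_rng(0)

GRID, FEAT, HIDDEN = 64, 8, 16  # latent grid res, feature dim, hidden width (made up)
features = rng.standard_normal((GRID, GRID, FEAT)).astype(np.float32)
w1 = rng.standard_normal((FEAT, HIDDEN)).astype(np.float32) * 0.1
w2 = rng.standard_normal((HIDDEN, 4)).astype(np.float32) * 0.1  # RGBA out

def sample_texture(u, v):
    """One texture sample = one tiny forward pass (the matmul tensor cores run)."""
    x = features[int(v * (GRID - 1)), int(u * (GRID - 1))]  # latent fetch
    h = np.maximum(x @ w1, 0.0)                             # ReLU hidden layer
    return h @ w2                                           # RGBA value

rgba = sample_texture(0.25, 0.75)
print(rgba.shape)

# "NTC to BCn on load" would instead run sample_texture once per texel up
# front, then re-encode the result to BC7: disk savings, but no VRAM savings.
```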
For what cards can run what, Nvidia has a section on their GitHub page:
nanonan@reddit
Nobody is going to solely distribute these textures. Developers aren't going to support only nvidia and only 2000 series or later.
StickiStickman@reddit
They already are, nothing would change.
Not to mention this is already hardware agnostic since it just uses a Vulkan and DX12 API.
FitCress7497@reddit
Ty that's a very clear answer
No-Improvement-8316@reddit (OP)
Yep. NTC and NM replace core pipeline components.
Hard to tell. They don't talk about it in the video.
BlobTheOriginal@reddit
That's the fun part: requires modern Nvidia gpu
AsrielPlay52@reddit
for RTX series cards
the 20 and 30 get Decompression on Load
basically, smaller game size and that's it
For 40 and 50, you get real time decomp
I found that info in Nvidia's GitHub repo on NTC
SJGucky@reddit
That's not the problem. They would have to use several different core pipelines for different GPUs.
AsrielPlay52@reddit
That's...also not a problem.
Thanks to cooperation between Intel, Nvidia and AMD, Shader Model 6.9 has introduced Cooperative Vectors: a way for DX12 and Vulkan to run neural/AI workloads cross-vendor.
Nvidia's RTX is designed with Coop Vectors in mind, and each vendor's drivers handle how Coop Vectors map onto their AI cores.
WTFAnimations@reddit
Still impressive tech. I just hope it isn't an excuse for Nvidia to keep the 60 series cards at 8 GB of VRAM.
AsrielPlay52@reddit
It wouldn't, it wouldn't make sense. The NTC repo relies on Coop Vectors, something new in Shader Model 6.9 that every GPU vendor can use
GARGEAN@reddit
It is supported by 20 and 30 too, just not recommended to use due to performance overhead.
AmeriBeanur@reddit
So then… useless? Because wtf, we’ve been on this AI, Ray Tracing, DLSS bullshit for like 6 years and modern games still run like shit on the latest card.
TheHeatMaNNN@reddit
I still can't see your reply :))) I have a notification, but can't access it, seems deleted X_X ayway, saw the "whole page to agree" part, cheers, appreciate the effort <3
TheHeatMaNNN@reddit
I think it's because game development has fallen behind the hardware innovations; I would think it's natural to have a 3-year lag or more between releases. I saw a recent Steam hardware stat thing they do, and it was something like ~27% have 8GB VRAM and 50%+ play on 1920×1080. It's not an economically good choice to implement the latest software while the market is still using old tech. Raytracing is still an option and not the "default" for games... my two cents :)
AmeriBeanur@reddit
Jesus Christ bro I wrote a whole ass page agreeing with you and stating why I do and it all went to shit as soon as I accidentally swiped left.
Long story short (because I want to express this), there hasn’t been a time in which games ran so shit on the latest hardware running it than now. These software fixes are a joke.
You mean to tell me you’re implementing software fixes onto your physical hardware because the cards can’t run the software adequately on raw power and therefore have to introduce latency and fake frames??? What a fucking joke. And then to say that only the latest model gets these software patches?
If we are at the physical limit of computation in order to get a steady 60FPS at 1440p with the latest engines and development tools, then burn those and go back to earlier engines. Those games ran at higher fps on the hardware of the time, were snappier, and even look better. Develop on those, work with the hardware you’ve got, not the hardware you hope to have one day. These experimental fixes mean nothing. Such a shitty situation.
mujhe-sona-hai@reddit
You say that like it's a bad thing to develop new technologies
Due_Teaching_6974@reddit
That's fine but if all the other vendors (AMD and intel) don't make their own version it will fail like PhysX
Nexus_of_Fate87@reddit
Not alike at all.
PhysX was a third party tech developed outside Nvidia they later acquired. AMD (then ATI) also had a much larger portion of the market back then.
Nvidia comprises 95%+ of GPU sales now.
Also, one tech that absolutely disproves your claim is DLSS. That has been going strong for over a decade now, and it too requires explicit implementation by developers.
EmergencyCucumber905@reddit
What year is it???
TheMegaMario1@reddit
Yep, devs won't go out of their way to implement it outside of being sponsored if it can't run on all consoles and requires ground-up implementation from the start. No one is going to specifically say "oh you should just be playing on PC on specifically an Nvidia GPU". Maybe it'll have some legs if the Switch 2 can run it, but that doesn't exactly have a boatload of tensor cores
GARGEAN@reddit
How many vendors can run DLSS? How many vendors can adequately run path tracing?
TheMegaMario1@reddit
But those technologies don't require ground-up implementation; in fact DLSS is mostly a drop-in solution, to the point that people have been able to mod FSR in over top of it using a similar framework. To the second point, that's goalpost moving, because path tracing is a more generalized tech. Just because the other vendors can't run it well doesn't mean that they can't run it.
We're currently talking about something that would require from the start dedication that would require still doing it the traditional way to make it work elsewhere since there's no equivalent and it's not drop in.
sabrathos@reddit
Guys... have you actually looked at what neural texture compression is?
It's not a proprietary API. It's literally just running tensor operations from within a shader, using the card's AI hardware acceleration that all modern cards have built in.
It's done via cooperative matrix/vector operations, which are a standard that's been added to D3D12. AMD and Intel support it.
Same with shader execution reordering in Shader Model 6.9.
Even "RTX Mega Geometry" is being standardized in DirectX, with it arriving in preview in a few months. That's just the branding for streaming small virtual geometry cluster-level changes to the raytracing BVH rather than doing full BVH rebuilds.
The modern cycle has been that Nvidia starts with conceptualizing something, adds custom support for it in NVAPI/Vulkan, and works with Microsoft/Khronos to standardize it within a ~year.
DLSS is the only "Hairworks"-like functionality at the moment, and even it is in a good spot right now with drop-in compatibility with FSR via things like Optiscaler (and they did try to make Streamline, just the industry rejected it).
The only real problem seems to be that AMD and Intel are in the passenger's seat and not the driver's seat with advancing hardware standards, which is completely on them. But everything is a standard.
MrMPFR@reddit
I hope the shipped version is flexible enough to encompass the foliage RTX MG improvements for TW4.
That'll change with RDNA 5, but until then NVIDIA are pushing full steam ahead.
FierceDeityKong@reddit
Switch 3 games will use it probably.
MrMPFR@reddit
NVIDIA is working with MS towards standardization in SM 6.10. Same applies to RTX Mega Geometry.
You can do inline stuff as an exclusive feature so it has to be vendor agnostic.
TurtleCrusher@reddit
It'll needlessly be "proprietary" too. Turns out PhysX ran best on AMD's VLIW4 architecture, years after Nvidia acquired PhysX.
sabrathos@reddit
Have you looked into what neural texture compression is? It's just running tensor operations from a shader. Pre-bake a small NN using Slang for your texture, and then evaluate it using hardware-accelerated FMAs at runtime.
There's no proprietary API. DirectX 12 added support for cooperative matrix/vector operations from within shaders. AMD and Intel both support it.
Nvidia incubates things in NVAPI to start, sure, but then has been consistently working with Microsoft and Khronos to standardize the APIs. Same with shader execution reordering, which is standardized now. Same with "RTX Mega Geometry", which is just granular cluster-level BVH update streaming for virtual geometry, which is coming to D3D12 this summer.
I'm not one to glaze Nvidia, but there's no proprietary black-box tech here. That's currently only with DLSS (which luckily can just be drop-in replaced with FSR). Everything else is hardware-accelerated and driver-supported extensions that are all generally useful and upstreamed.
trashk@reddit
Physx was an independent company that was bought by NVIDIA, not a core invention.
NapsterKnowHow@reddit
Basically Direct Storage in a nutshell. Sure it can run on PC but it's nowhere near as well optimized as it is on PS5
Greedom619@reddit
Of course it will. How will Nvidia make money if they allow it on older GPUs? I bet they are focusing on this to lower the overhead costs of next-gen GPUs in order to use less RAM in the cards and data centers.
MrMPFR@reddit
All of it is getting standardized in SM 6.10 shipping EoY 2026.
This stuff won't be NVIDIA exclusive.
AsrielPlay52@reddit
for RTX series cards
the 20 and 30 get Decompression on Load
basically, smaller game size and that's it
For 40 and 50, you get real time decomp
Due_Teaching_6974@reddit
Also that AMD and intel will have to develop their own version of this tech, otherwise it will go the way of PhysX
random352486@reddit
Other way around, AMD and Intel will have to develop their own version or they will go the way of PhysX given Nvidia's current marketshare.
Seanspeed@reddit
AMD has been working on this for years.
Due_Teaching_6974@reddit
Doesn't matter, as long as PlayStation and Xbox use AMD components this tech will be implemented, which may take 7 or 8 years minimum
Kryohi@reddit
The good news is that RDNA5 in the PS6 will fully support NTC
GARGEAN@reddit
> this tech will never be implemented
Just like DLSS wasn't because it is NVidia-locked?
spazturtle@reddit
Implementing DLSS doesn't stop the game running on other brands GPUs, this does.
GARGEAN@reddit
>this does.
How?..
random352486@reddit
Oh right, forgot about consoles. My bad.
MrMPFR@reddit
No this is vendor agnostic similar to RTRT through DXR. MS is standardizing it in SM 6.10.
kinkycarbon@reddit
What I'm getting is Nvidia refining their work. This stuff was published in their paper in 2023.
MrMPFR@reddit
Yeah some old stuff. It's currently in 0.9.2 beta so that's why we haven't seen any game adoption yet.
Neural Materials are unfortunately still MIA, and zero games have NRC outside of RTX Remix projects.
binosin@reddit
NTC adds its own compression scheme so yes, it would need deep integration during development to get maximum returns. There isn't baked hardware decompression like most compressed formats (BCn), every time a texture is needed you'll either need to fully decompress it in memory (for slower GPUs) or run inference per sample. Both stuff that could be abstracted away but decisions that would need to be made early on, NTC is not free.
It's hard to know the performance profile of this technique. On older hardware, you probably won't be using it at all. The NTC SDK recommends older hardware use BCn conversion (so you only get disk space savings, still valid). There's nothing stopping a game just decompressing all textures at first boot and running like normal - if NTC can reach real time speeds, this wouldn't be that slow even on older hardware. A well designed streaming solution would retain NTC, slowly decode higher mips over time as new textures are loaded and you'd be none the wiser other than a few less frames and blurriness, hopefully. They've validated it functioning on a good array of older hardware.
The full inference on sample method is recommended starting RTX4000+ and even then you'll be needing to use TAA and stochastic sampling (so probably DLSS) because it's expensive to sample. But with the memory savings you could probably do some virtual texturing to cache the texture over time, reducing cost. The challenge is keeping sample count low - it would get expensive fast if you were trying to overlay detail maps, etc. It's early days but the groundwork is there.
A big question is how this runs on other vendors. It can use the new cooperative vector extensions so should be fully acceleratable on Intel (and AMD, someday). But there's only recommendations for NVIDIA right now and a DP4a fallback.
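The deployment choice described above can be sketched as a simple tiering policy. The function and tier names are hypothetical, not the NTC SDK's actual API, but the three outcomes match the modes the SDK documents (inference on sample, transcode to BCn, DP4a fallback):

```python
# Hypothetical sketch of the NTC deployment decision. decide_ntc_mode() and
# its inputs are invented names; only the three modes come from the SDK docs.

def decide_ntc_mode(has_coop_vector: bool, fast_matmul: bool) -> str:
    if has_coop_vector and fast_matmul:
        return "inference_on_sample"   # keep NTC resident in VRAM, decode per sample
    if has_coop_vector:
        return "transcode_on_load"     # decode once to BCn: disk savings only
    return "transcode_dp4a_fallback"   # DP4a path for GPUs without coop vectors

print(decide_ntc_mode(True, True))    # e.g. RTX 40/50 class
print(decide_ntc_mode(True, False))   # e.g. RTX 20/30 class
print(decide_ntc_mode(False, False))  # older hardware
```

A real streaming system would also weigh per-frame budget, as binosin notes, keeping NTC resident and decoding higher mips over time rather than making a single boot-time choice.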
MrMPFR@reddit
This got me thinking Sony could technically offer NTC textures for the PS6 and PS6 handheld versions. Just augment the existing Kraken pipeline and decode to BCn (on load) when textures are needed. Otherwise I can't see how they'll be able to sell a shitty 1TB PS6, but this should be an effective storage multiplier.
hodor137@reddit
Nvidias innovations are certainly great, but the endless vendor specific stuff is really unfortunate
CheesyCaption@reddit
If they were trying to innovate and make things industry standard at the same time, their idea would die by committee.
It's much better, even for open standards, for Nvidia to show a new feature to consumers and then for the AMD gpu owners to ask for that feature. If Nvidia hadn't made gsync, freesync or hdmi vrr would have never happened.
Look how long it took for Freesync and vrr to happen with an existing and proven technology to use as an example and imagine what a shitshow it would have been if Nvidia tried to develop those standards as open without a proven example to work from.
spazturtle@reddit
DisplayPort Adaptive Sync was already in development. Nvidia just took the draft, added DRM, and called that G-Sync. They didn't develop it themselves.
GARGEAN@reddit
NVidia literally pushes most of those things into coop vectors by working with Microsoft. OMM and SER? Basically made by NVidia, included into SM 6.9. Megageometry? Made by NVidia, included into Coop Vectors. NTC? Made by NVidia, included into Coop Vectors. Neural shaders/materials? You got the idea.
nittanyofthings@reddit
It won't. The only compatibility story is to have two rendering engines. Maybe two entirely separate asset downloads. At least path tracing was working on the same material assets from disk.
StickiStickman@reddit
Why are you just making shit up if you have absolutely no idea what you're talking about?
Reading this thread as a game dev who's doing a bunch of graphics programming is so painful.
CMDR_kamikazze@reddit
That's the best part, it won't. None of this will work on older hardware.
azn_dude1@reddit
Think even longer term though. The benefit is that when today's hardware is 10 years old, they might not need as much VRAM to run newer games.
evangelism2@reddit
good question, but even if the answer is 'like shit' its not worth holding up progress
shing3232@reddit
They can't.
Jeep-Eep@reddit
Yeah but how would that perform in real world use cases? That's what comes to mind when I see this - is it gonna be cannibalizing card resources for render for decompression?
There's also a vaguely noticeable artifact to it, although not nearly as gross (in either sense) as the risible DLSS 5. I'd have to see it under a more realistic real world use case to be impressed tbh.
Sopel97@reddit
what artifact are you talking about?
doscomputer@reddit
in every example from the paper the compression method is lossy and loses sharp detail compared to native
it's very obvious
Jeep-Eep@reddit
yeah, both in texture detail and the color.
Sopel97@reddit
yes, that's why it's being compared to BCn
Jeep-Eep@reddit
There's something off about the colour balance of the decompressed pictures but I can't put my finger on what.
Sopel97@reddit
if you mean the example with the table then it's because there's more detail
Sevastous-of-Caria@reddit
Gimping consumer cards even more so that one test render at davinci resolve lags even more on the timeline now I can record at 4k.
Why not use the spare tensors to accelerate dlss performance presets with vram on hand. Or just put more cuda cores giving us pure performance figures more rather than stagnating since ada lovelace. Oh yea AIAIAI... 4x higher resolution output of the texture stream better not yassify like dlss5 witch I have a suspicion of it doing.
tapper82@reddit
Hi I think this is an AI tripping again!
Sopel97@reddit
what the fuck are you babbling about
frazorblade@reddit
He’s referring to scary yass witches obviously.. keep up man
kuddlesworth9419@reddit
I just wish we moved towards vendor agnostic features instead of proprietary features. Having everything split off and everyone doing their own stuff isn't terribly helpful for anyone.
Loose_Skill6641@reddit
As far as NVIDIA is concerned everything they invent is the new industry standard, and Microsoft seems to agree because every time NVIDIA releases new features, a few months later Microsoft integrates those same features into official DirectX releases
kuddlesworth9419@reddit
Isn't DirectX years behind Nvidia, AMD and Intel at this point with some of the features they offer? It would be nice if they could all just settle on one standard instead of running off to do their own shit. It would be even better if they could do it with Vulkan instead of DirectX. It won't happen though.
Nuck_Chorris_Stache@reddit
You wouldn't download more RAM
StanGoodspeed618@reddit
6.5GB to 970MB is an 85 percent reduction in VRAM for textures alone. If this ships widely it means 8GB cards stay viable for years longer and devs can push texture quality way higher without the usual memory budget tradeoffs. The real question is how much tensor core overhead it costs at runtime.
Loose_Skill6641@reddit
my concern is what the textures look like in games and in motion.
When NVIDIA says "neural", what they mean is AI generated, so neural compression is using AI generation. That could mean there is room for errors that result in textures being changed when they are decompressed, same as we see in DLSS 5 footage
Nuck_Chorris_Stache@reddit
"Neural" doesn't always mean generative AI. It often just means algorithms that are a result of a machine learning process, as opposed to manually programmed algorithms.
Mageborn23@reddit
Everyone talking shit about nvidia Dlss when they actually cooked with this shit. I am all in.
doscomputer@reddit
the compression is still very noticeable and IMO not that good. The paper uses an example that is very small in size but detailed, like a house. Seems like this tech is only good for improperly created assets
Reporting4Booty@reddit
The DLSS5 example in the actual article still looks like shit. The woman's face looks like it was pasted on from an overphotoshopped Instagram photo.
hepcecob@reddit
Who is this "everyone" you talk of? The only complaints I saw were about DLSS 1 and 5
Qsand0@reddit
Redditors have a hard on for strawmen. Pathetic karma farmers
mecha-verdant@reddit
Now, here's the real kicker: what then will be the minimum specs to run GTA6 at ~60fps on 1080p?
f1rstx@reddit
4060
ProZoid_10@reddit
Ps5 pc equivalent
Reaper_1492@reddit
In any case, this is pretty wild.
StanGoodspeed618@reddit
The 6.5GB to 970MB compression is impressive but the real story is what this does for the hardware design constraints. Smaller VRAM footprint means GPU makers can either cut costs on memory chips or use the freed bandwidth for other workloads. Tensor cores doing double duty on decompression is clever engineering.
Jonny_H@reddit
The issue is that the bandwidth use massively increases with techniques like this - the model required to actually draw the texture needs to be read every block. It effectively trades higher bandwidth and shader use for lower vram - it's not actually a bandwidth saving.
If the model is small enough they could have that model in on-chip sram on future hardware - but then that sram could have been used as general cache instead, so still not an "obvious" win.
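Some rough arithmetic behind the bandwidth point above. The MLP dimensions are invented for illustration (nothing here comes from the actual NTC network); the BC7 numbers are exact:

```python
# Per-sample traffic comparison: fixed-function BCn vs. a hypothetical
# NTC-style decoder whose weights must be read wherever it's evaluated.

# BC7: one 4x4 block is 16 bytes; a bilinear tap touches at most a few blocks.
bc7_block_bytes = 16

# Hypothetical small per-texture decoder: latent vector + two fp16 layers.
feat, hidden, out = 16, 64, 8
weight_bytes = (feat * hidden + hidden * hidden + hidden * out) * 2  # fp16
latent_bytes = feat * 2

print(f"BC7 fetch per tap : ~{bc7_block_bytes} B")
print(f"NTC weights       : {weight_bytes} B (re-read per shaded region unless cached)")
print(f"NTC latent fetch  : {latent_bytes} B")
```

Even at these toy sizes the weights are orders of magnitude more data than one BC7 block fetch, which is why they'd need to live in on-chip SRAM or cache, exactly the trade-off described above.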
KanedaSyndrome@reddit
Now this might be a worth while feature as long as detail and creative intent is not compromised
StanGoodspeed618@reddit
6.5GB down to 970MB is an 85 percent reduction. This is how you make 8GB cards viable again. Neural compression is the real unlock for next-gen VRAM constraints.
jenny_905@reddit
Shame PCMR leaked into here, it's an interesting development but of course finding intelligent discussion on Reddit is impossible these days
Hsensei@reddit
You should find a community that does that. Especially since you are so displeased with this one.
Jumpy-Dinner-5001@reddit
This keynote should have been held before the DLSS 5 launch.
Xelanders@reddit
This is the sort of thing they should be talking about to begin with, using AI as a compression technique to optimize games for lower end hardware, not using AI to change and “improve” the game’s art direction.
glenn1812@reddit
Almost every single person who has or had a bad opinion about AI, including myself, has always had most of their criticism for generative AI. Bullshit that actually wastes electricity, people generating garbage videos and images of politicians and regular people. AI was used in multiple applications before, and most people didn't even know about it till image generation started.
MrMPFR@reddit
They did at CES 2025 and GDC 2025. A shame DLSS5 has tainted neural rendering. Has nothing to do with this stuff, which is amazing, controllable and deterministic.
yamidevil@reddit
This and mega geometry. I am most amazed at mega geometry as a path tracing enjoyer
MrMPFR@reddit
Mega Geometry and all this neural rendering stuff getting standardized into SM 6.10. Shipping in late 2026.
ssongshu@reddit
“DLSS5” should just be a FreeStyle filter
JackSpyder@reddit
Yes its far more interesting. And has wider potential.
This, with asset duplication gone (pointless nowadays), could bring life to older GPUs and massively help with drive space.
ghulamalchik@reddit
I love the idea but as long as it's tied to a specific hardware then it's bad. Nvidia is making the gaming industry vendor exclusive and closed. This is not the future we want.
Dr_Icchan@reddit
I wonder what they think they'll gain from this? If they make a GPU with one fifth of normal VRAM, no one is going to buy it because it'll not work with any other workload.
Darrelc@reddit
I think you underestimate the fervor of nvidia AI bros on reddit
GARGEAN@reddit
Because instead of working with fifth of normal VRAM for textures, it can work with half of normal VRAM for textures while having greatly increased textures quality.
guyza123@reddit
They can keep the same amount of VRAM for old games, but allow new games to still look better at the same VRAM cost or less.
Darrelc@reddit
Deep learning VRAM Fuck yeahhhhhhhh
Mrgluer@reddit
Thinking the goal will be to distinguish gaming and workstation cards more by reserving the memory for workstation cards; gaming chips will probably stagnate on VRAM.
MiloIsTheBest@reddit
I'm hoping that's not the case but I think that the gaming GPUs will still have a smaller vram allocation than workstation ones, like always, but the main thing is we'll be on n -1 nodes now while Jensen's "good customers" are on the new nodes.
Mrgluer@reddit
i personally don’t mind. not everybody needs 16 gb of vram.
No-Improvement-8316@reddit (OP)
Apparently the automod doesn't like the summary... Let's try again:
Nvidia's video "Introduction to Neural Rendering":
https://www.youtube.com/watch?v=-H0TZUCX8JI
Alphasite@reddit
How much memory does the decompression model need?
StickiStickman@reddit
I can't find a specific number for the model, but it can't be very big if it's used for real-time inference. If DLSS 4.5 is anything to go on, maybe a dozen MB or up to 100MB.
Sopel97@reddit
the model is outlined in the original paper and is nothing like you conceptualize
StickiStickman@reddit
Do you mean
Which would be the texture itself.
Because from what I can see the model for encoding and decoding isn't described in detail?
Sopel97@reddit
Yes, that's the point. The compressed representation is a machine learning model and a set of input features.
There is no "model for encoding". The encoding, i.e. the compression, is the process of training the network and the feature pyramid for a given texture.
StickiStickman@reddit
I can't find anything specific, but I assumed they have a model for fast conversion for BCn. I guess they can also just "brute force" it without it being that much slower.
Alphasite@reddit
That's a lot smaller than I expected for an image model. But I guess it has to run in realtime, so it makes sense. Can't blow the frame time budget, otherwise what's the point.
StickiStickman@reddit
I'm not sure what you mean by patch. Do you mean batch?
For what it's worth, the images themselves will be tiny models that get run. And for model inference, since it's read only, you can batch as many as you like.
Sopel97@reddit
the decompression model is part of the compressed data
YourVelourFog@reddit
Sounds like my 4GB card will live again!
Kosba2@reddit
Every time we improve our technology to be able to accommodate more wonderful art, we polish worse piles of shit to the bare minimum.
Calm-Zombie2678@reddit
Rtx 7080 required but its only 4gb
Capillix@reddit
“Smaller game sizes, faster downloads…” - call of duty: “Hold my beer”
havasc@reddit
They're going to use this to justify releasing new cards priced at $1000 with 2gb of VRAM.
Zueuk@reddit
the things you do to justify selling GPUs with only 8GB of VRAM
Marble_Wraith@reddit
DLSS 5 (Neural rendering) can eat a dick, but this "Neural compression"... this is genuinely fucking cool!
Kinda reminds me of advanced video codecs, offering the same fidelity with smaller bandwidth / filesizes.
Essentially it's a tool for game production pipelines. Now devs only have to worry about authoring the high fidelity assets and art direction, and leave most of the optimization to this.
I hate AI, but i also gotta give the devil his due.
crisorlandobr@reddit
Question: is this gonna save VRAM for local AI models, or is it only for games for now?
jhenryscott@reddit
With Nvidia you gotta trust half of what you see and none of what you hear
Sj_________@reddit
I wonder if this can get any life back to my 4060, or it would be a 50 or 60 series exclusive feature
StickiStickman@reddit
GARGEAN@reddit
Available on all RTX GPUs, but inadvisable for inference on sample (basically the thing that conserves VRAM) on 20 and 30 series. 40 should have some support.
angry_RL_player@reddit
fake vram
enjoy paying 1k for 1 4gb card
crshbndct@reddit
So now they are hallucinating geometry with DLSS5, and hallucinating textures too?
They really want to push us towards one-sentence game prompts that generate the whole thing on the fly, don't they?
dparks1234@reddit
RTX 2060 to get a second wind in 2033
kaden-99@reddit
2060 was the real fine wine GPU
AsrielPlay52@reddit
According to the NTC SDK, the 20 series is at least able to decompress while loading, so smaller game sizes, but not real time
GalvenMin@reddit
AI needs all the RAM to make sure you won't need RAM. Just trust us.
FieldOfFox@reddit
Yeah so this is ACTUALLY a good advancement.
I mean the quiet-part is that they can continue to stagnate VRAM and claim "it's not needed" and keep high RAM cards for enthusiast tier and datacenter, but at least this levels the "what can I play on my old-ass GPU" field.
Seref15@reddit
I bet this became an internal priority to put less vram on gaming cards to save modules for ai cards
jaypizzl@reddit
Nvidia screwed up real bad by showing DLSS 5 too early. They forgot how much average people fear change, especially when they don’t understand it. They should have taken more care to make it seem less threatening. Better compression? Faster rendering? Those are less scary-sounding ways to explain the benefits.
mrfixitx@reddit
Just another way for NVIDIA to keep telling us 8GB of VRAM is all we need on their cards.
Seriously though, it's impressive, but if it only works as part of DLSS 5, I doubt it's going to change install sizes, since game devs aren't going to want to lock out AMD video cards and Steam Deck owners.
If this was an open source solution not tied to NVIDIA hardware it would be amazing especially for lower spec machines.
GARGEAN@reddit
>If this was an open source solution not tied to NVIDIA hardware
It is part of Cooperative Vectors. It's not tied to NVIDIA.
ResponsibleJudge3172@reddit
More like how RTX Mega Geometry runs on the DXR 1.1 layer.
GARGEAN@reddit
Yup. And as long as other vendors have the appropriate hardware, they can use any of those things through that common API.
Creepy_Accountant946@reddit
No one gives a shit about open source in real life except weirdo redditors
Devatator_@reddit
Half the fucking internet is built on open source. A lot of stuff you use daily too. You don't care but please don't think for a second that no one does
DerpSenpai@reddit
Not really. Currently 16GB of RAM by itself will cost you $200-250, and for GDDR6 even more. Neural rendering will let us continue to scale through this memory armageddon without going over to 24-32GB for mainstream cards.
Seanspeed@reddit
Nvidia were working on all this stuff well before the current memory crisis kicked in.
The crisis probably wont last forever, either.
Nvidia aren't paying anywhere near retail prices for memory, come on now, lol. Even if this tech does have some positive implications for the future, we all know Nvidia is going to exploit it in terms of pricing and product segmentation.
DerpSenpai@reddit
Yes they are. Smaller OEMs pay even more than that. This is what big OEMs are paying for these parts.
Seanspeed@reddit
You're nuts if you think a giant memory consumer like Nvidia are paying retail prices.
Bob4Not@reddit
This could be cool! I just hope it doesn't conform various art styles to a trained model's own look.
Youfallforpolitics@reddit
NTC requires sampler feedback, if I'm not mistaken...
MrMPFR@reddit
Only Inference on Feedback does.
Inference on Load can run on basically all cards; Inference on Sample does real-time inference and is very matmul-hungry.
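The on-load vs. on-sample distinction can be sketched in a few lines of NumPy. This is a toy illustration only: the latent grid size, layer widths, and activation are invented for the example and are not NVIDIA's actual NTC network. The point is that the "compressed" texture is just a small latent grid plus tiny MLP weights, and you can either decompress everything up front (on load) or run the MLP per texture fetch (on sample).

```python
# Toy sketch of neural texture decompression. All shapes and the
# network architecture are made up for illustration -- this is NOT
# NVIDIA's actual NTC format.
import numpy as np

rng = np.random.default_rng(0)

# "Compressed" texture: a low-res grid of latent feature vectors
# plus tiny MLP weights, instead of full-res RGBA texels.
LATENT_DIM, HIDDEN = 8, 16
latents = rng.standard_normal((64, 64, LATENT_DIM)).astype(np.float32)
w1 = rng.standard_normal((LATENT_DIM, HIDDEN)).astype(np.float32)
w2 = rng.standard_normal((HIDDEN, 3)).astype(np.float32)

def decode_texel(u, v):
    """'Inference on Sample': run the MLP for one texture fetch."""
    x = latents[int(v * 63), int(u * 63)]   # nearest-neighbour latent fetch
    h = np.maximum(x @ w1, 0.0)             # ReLU hidden layer
    return h @ w2                           # decoded RGB for this texel

def decode_on_load():
    """'Inference on Load': decompress the whole texture up front."""
    h = np.maximum(latents @ w1, 0.0)
    return h @ w2                           # (64, 64, 3) RGB image in VRAM

full = decode_on_load()
texel = decode_texel(0.5, 0.5)
assert np.allclose(texel, full[31, 31])     # both paths agree on the result
```

On-load trades the VRAM savings back for speed (the full image sits in memory again), while on-sample keeps only the latents resident but pays a small matmul on every fetch, which is why it wants fast tensor hardware.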
Autumn-Bloom@reddit
Amazing. Step by step they're adding more features to their subscription streaming service and destroying the home PC market with insane component prices. No thanks, Novidia.
GARGEAN@reddit
>Step by step they are adding more features to their subscription streaming service
What on earth does this have to do with this GTC presentation talking about new rendering techniques? This isn't GN comment section, mind you.
No-Improvement-8316@reddit (OP)
NVIDIA’s GTC 2026 talk showed that neural rendering goes beyond DLSS 5 by integrating small neural networks directly into the rendering pipeline. Instead of only enhancing the final image, these networks handle tasks like texture decoding and material evaluation, improving efficiency.
A key example is Neural Texture Compression (NTC), which reduced VRAM usage from 6.5 GB to 970 MB while maintaining similar image quality—and even preserving more detail at the same memory budget. This could lead to smaller game sizes, faster downloads, and better asset quality on existing hardware.
NVIDIA also introduced Neural Materials, which compress complex material data into a lighter format processed by neural networks. This reduced data complexity and improved rendering performance, achieving up to 7.7× faster rendering in tests.
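For scale, the quoted NTC figures work out to roughly a 6.9× reduction (assuming the 6.5 GB headline number is binary gigabytes; with decimal GB it's closer to 6.7×):

```python
# Back-of-the-envelope check of the quoted NTC figures.
baseline_mb = 6.5 * 1024   # 6.5 GB of conventional textures, in MB
ntc_mb = 970               # the same assets under NTC

ratio = baseline_mb / ntc_mb
print(f"compression ratio: {ratio:.1f}x")   # prints "compression ratio: 6.9x"

savings = 1 - ntc_mb / baseline_mb
print(f"VRAM saved: {savings:.0%}")         # prints "VRAM saved: 85%"
```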
No-Improvement-8316@reddit (OP)
And the video "Introduction to Neural Rendering":
https://www.youtube.com/watch?v=-H0TZUCX8JI