5060Ti vs 9060XT efficiency testing in Cyberpunk
Posted by bunihe@reddit | hardware | View on Reddit | 5 comments
Cards are Sapphire Pulse 9060 XT 16GB and Zotac Twin Edge 5060 Ti 16GB
Both cards are undervolted, so each is running at its best-case scenario. The 9060 XT has VRAM overclocked to 23Gbps for the 188W data point (with a gradual increase before that to get there, all using Fast timings to hopefully reduce any bandwidth bottleneck). The 5060 Ti steps through 28Gbps/29Gbps/30Gbps/31Gbps for its 4 points, running at 0.72V, 0.82V, 0.85V, and 0.9V.
Upscaling to 4K is set to Balanced on both cards. DLSS 4 will look better than FSR 4.1 when upscaling from the same render resolution; whether that's noticeable depends on you. Ray Reconstruction is turned off on both cards.
Platform is a 9800X3D paired with 64GB of Hynix DDR5 for the 5060 Ti but 96GB of Samsung DDR5 for the 9060 XT. Unfortunately, I no longer have the 9060 XT in hand to retest with Hynix, but at 4K the platform shouldn't be bottlenecking the GPU.
Using the Ultra preset (raster only, upscaling set to the cards' native ML upscaling):
| Power (W) | 9060xt (fps) | 5060ti (fps) |
|---|---|---|
| 118 | 44.23 | |
| 136 | 46.1 | |
| 152 | 47.84 | |
| 170 | 48.94 | |
| 187 | 50.31 | |
| 70 | | 39.85 |
| 101 | | 48.07 |
| 109 | | 49.88 |
| 125 | | 52.81 |
Using my manual Optimized settings with RT off:
| Power (W) | 9060xt (fps) | 5060ti (fps) |
|---|---|---|
| 118 | 68.14 | |
| 136 | 70.62 | |
| 152 | 72.79 | |
| 170 | 73.8 | |
| 187 | 75.41 | |
| 129 | | 84.7 |
| 114 | | 80.49 |
| 105 | | 77.29 |
| 75 | | 64.98 |
Using Optimized settings with RT enabled:
| Power (W) | 9060xt (fps) | 5060ti (fps) |
|---|---|---|
| 118 | 31.97 | |
| 136 | 34.17 | |
| 152 | 35.9 | |
| 170 | 36.62 | |
| 187 | 37.78 | |
| 76 | | 34.42 |
| 110 | | 40.82 |
| 120 | | 42.51 |
| 138 | | 44.69 |
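The fps and power numbers above can be reduced to a single fps-per-watt figure to make the efficiency gap easier to read. A minimal sketch, using the data points copied from the Optimized-settings-with-RT table above:

```python
# fps-per-watt from the "Optimized settings with RT enabled" table above.
# Each entry is (board power in W, average fps).
rx_9060xt = [(118, 31.97), (136, 34.17), (152, 35.9), (170, 36.62), (187, 37.78)]
rtx_5060ti = [(76, 34.42), (110, 40.82), (120, 42.51), (138, 44.69)]

def fps_per_watt(points):
    """Return (watts, fps/W) pairs, rounded to 3 decimals."""
    return [(w, round(fps / w, 3)) for w, fps in points]

print(fps_per_watt(rx_9060xt))   # efficiency falls as power rises
print(fps_per_watt(rtx_5060ti))  # 5060 Ti leads at every overlapping wattage
```

Both curves slope downward, which is the usual diminishing-returns behavior once a card is past its efficiency sweet spot.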
The Optimized settings I'm running with:
| Setting | Value |
|---|---|
| Ray Tracing | ON |
| Ray-Traced Reflections | ON |
| Ray-Traced Sun Shadows | OFF |
| Ray-Traced Local Shadows | OFF |
| Ray-Traced Lighting | Medium |
| Path Tracing | OFF |
| Crowd Density | High |
| FOV | 80 |
| Everything else under Basic | OFF |
| Contact Shadows | ON |
| Improved Facial Lighting Geometry | ON |
| Anisotropy | 4 |
| Local Shadow Mesh Quality | High |
| Local Shadow Quality | High |
| Cascaded Shadows Range | High |
| Cascaded Shadows Resolution | High |
| Distant Shadows Resolution | High |
| Volumetric Fog Resolution | Low |
| Volumetric Cloud Quality | Off |
| Max Dynamic Decals | High |
| Screen Space Reflections Quality | High |
| Subsurface Scattering Quality | High |
| Ambient Occlusion | High |
| Color Precision | High |
| Mirror Quality | High |
| LOD | High |
I posted my graphs in another reddit post, I can't attach images to this post unfortunately.
cp5184@reddit
Don't nvidia gpus have much higher cpu overhead? I assume this doesn't account for that?
bunihe@reddit (OP)
No, not with a 9800X3D; nowadays, with a decently fast CPU, that overhead is minimal anyway.
Crap-_@reddit
Efficiency of the Radeon card is still behind. For instance, a max-wattage 115W 5070 laptop will get near-identical fps to a 9060 XT.
Seanspeed@reddit
Good efficiency comparison, and very valid given these are both very similar chips, with 5060Ti being 188mm² N4, and 9060XT being 199mm² N4.
Even with Blackwell being a wet fart of an architectural improvement, Lovelace itself was already insanely efficient, so much so that even with major RDNA4 improvements, AMD is still playing catchup here pretty noticeably.
And I know it's not a major priority for most people, but I actually do quite value good efficiency. I hate the idea of my PC pulling 400-500+ watts to game. Efficiency also enforces a kind of soft performance cap for a series/architecture: the more efficiency you have, the more top-end performance headroom you'll have with high-end parts. While AMD could certainly still beat, say, a 5080 if they really wanted to with RDNA4, they would simply be incapable of matching a 5090 because they are less efficient.
Next-gen GPUs should bring more meaningful efficiency updates as they move to, at the very least, some N3 variant.
bunihe@reddit (OP)
On my 4080 laptop (AD104 paired with 192bit of GDDR6), I found myself running the core at 0.8V 2130MHz with VRAM slightly overclocked to 18.5Gbps, which got me about 90% of the stock performance (desktop 3080 level) at about 60% of the power, hovering around 108W in games. It is really insane how much Nvidia improved in one generation, or maybe it's down to how lackluster SF8 was, with Ada being mostly a die shrink of that.
Unfortunately, Ada seemed to run hard into the GDDR6X bandwidth cap: the 4070 Ti desktop cards clock pretty high but showed bad efficiency in comparison to the 4070 Super. In the 4060 Ti's case, 128bit of GDDR6 held it back so much that it shows basically 3060 Ti-level performance despite everything.
In an apples-to-oranges comparison, the 5060 Ti achieved similar performance to the 0.7V 1770MHz mobile 4080 when run at 0.82V 2610MHz while consuming 115W, 20W more than the mobile 4080. Considering that the 5060 Ti has a dual-rank GDDR7 layout, I think this is a respectable result, hinting that this 36% smaller die can get ~85% as efficient at iso-performance; but of course, as soon as I'm pushing 120W+, the mobile 4080's lead grows significantly.
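A back-of-envelope check on that ~85% figure, using only the wattages quoted above (the 4080 mobile power is implied by "20W more than the mobile 4080", not measured here):

```python
# Iso-performance efficiency ratio from the quoted numbers:
# the two cards deliver similar fps, so efficiency scales inversely with power.
p_5060ti = 115          # W, as quoted
p_4080m = p_5060ti - 20  # W, implied by the 20W gap at iso-performance

ratio = p_4080m / p_5060ti  # 5060 Ti efficiency relative to mobile 4080
print(f"{ratio:.0%}")       # ~83%, in line with the ~85% estimate
```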