Raw FPS averages are inherently flawed
Posted by Bluedot55@reddit | hardware | 10 comments
For a simple example, let's take two hypothetical GPUs in two games.
| | GPU 1 | GPU 2 |
|---|---|---|
| Game 1 | 100 fps | 50 fps |
| Game 2 | 250 fps | 500 fps |
| Total average fps | 175 | 275 |
In this example, each GPU had one game where it was 100% faster than the other GPU, but because one game is lighter to run and runs significantly faster on both GPUs, that game has an outsized effect on the average. Beyond that, I believe most people would agree that the difference between getting 50 and 100 fps in a game is far more noticeable than getting 250 vs 500.
Frame time averages
There are a few ways to give a more accurate number here. An argument could be made that rather than averaging FPS, averaging frame times would give a better representation of relative performance. This inverts the weighting, making each percentage difference matter more when the FPS is lower, so a difference between 45 and 60 fps is more impactful than a difference between 150 and 200.
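A minimal sketch of what that looks like, using the hypothetical numbers from the table above (the variable names and structure are mine; inverting the average frame time is the same as taking the harmonic mean of the frame rates):

```python
# Sketch: frame-time averaging on the hypothetical numbers above.
# Averaging frame times and inverting is the harmonic mean of the FPS values,
# so low-FPS games carry more weight than in a plain FPS average.
from statistics import harmonic_mean

gpu1 = [100, 250]  # fps in Game 1, Game 2
gpu2 = [50, 500]

def frametime_average(fps_values):
    """Average the frame times (1/fps) and convert back to an FPS figure."""
    mean_frametime = sum(1 / fps for fps in fps_values) / len(fps_values)
    return 1 / mean_frametime

for name, fps in [("GPU 1", gpu1), ("GPU 2", gpu2)]:
    print(name, round(frametime_average(fps), 1), round(harmonic_mean(fps), 1))
# GPU 1: ~142.9 fps, GPU 2: ~90.9 fps -- the ranking flips compared to the
# raw FPS averages (175 vs 275), because the 50 fps result now dominates.
```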
Relative averages
Alternatively, the overall average could be an average of the relative performance of the products: rather than a raw FPS number, each game is scored as a percentage of the highest-performing product. This guarantees that every game gets equal weight in the end result, so a difference between 45 and 60 fps in one game is balanced out by a difference of 200 vs 150 in another.
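A rough sketch of that relative-performance approach on the same two hypothetical GPUs (the data layout and function names here are my own, not taken from any review):

```python
# Sketch: score each game as a percentage of the fastest card in that game,
# then average the percentages so every game carries equal weight.
results = {
    "Game 1": {"GPU 1": 100, "GPU 2": 50},
    "Game 2": {"GPU 1": 250, "GPU 2": 500},
}

def relative_average(results, gpu):
    scores = []
    for game, fps in results.items():
        best = max(fps.values())
        scores.append(fps[gpu] / best * 100)  # 100% = fastest card in this game
    return sum(scores) / len(scores)

print(relative_average(results, "GPU 1"))  # (100% + 50%) / 2 = 75%
print(relative_average(results, "GPU 2"))  # (50% + 100%) / 2 = 75%
# Both cards land at 75% of "the fastest card per game", i.e. a tie, matching
# the intuition that each card wins one game by 2x.
```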
9070xt review example
For a real-world example of how this would affect comparisons, I ran the numbers with the different methods using Techspot/HWUnboxed's review of the 9070xt and how it compares to the 5070ti at 1440p. Numbers are measured as a percentage of the performance of the 5070ti.
| Method | Relative performance |
|---|---|
| HWUnboxed's average | 94.4% |
| Raw FPS average | 91.8% |
| Frame time average | 96% |
| Relative performance average | 95.4% |
| HWUnboxed's RT average | 79.1% |
| Raw FPS RT average | 80.4% |
| Frame time RT average | 57.2% |
| Relative RT performance average | 73% |
I'm not quite sure why my raw averages don't line up with HWUnboxed's own multi-game average numbers; maybe they apply some sort of weighting in a similar manner.
Regardless, looking at these, the frame time averages show a smaller gap between the cards in non-ray-traced titles, but once ray tracing is added, the gap more than doubles compared to what the regular average would suggest. With different GPUs and CPUs performing differently in different sorts of games, I think an approach like this may be valuable for getting a better feel for how products actually compare to one another.
TL;DR
FPS averages massively reward products that do very well in light games, even if they do worse in heavier games with lower average FPS.
DuranteA@reddit
I'm not sure why this is downvoted. It's true and a valid concern.
Personally, I think merely eliminating the over-representation of high-FPS games by e.g. using relative performance or a geomean, while obviously much better, isn't going far enough. Games running at low framerates should actually have a larger influence on the overall score, since the differences at those framerates have a larger impact on overall playability. I'm not sure just averaging frametimes is the best approach to achieve that, or if a more proactive method would be better (e.g. defining a minimum goal framerate and a drop-off function that determines the significance of performance beyond it).
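One possible illustration of that idea, purely as a sketch: the 60 fps target and the specific drop-off shape below are arbitrary choices of mine for demonstration, not something the comment specifies.

```python
# Hypothetical sketch of a "goal framerate with drop-off" score.
# The 60 fps target and the 0.25 slope above target are arbitrary assumptions.
TARGET_FPS = 60

def playability_score(fps, target=TARGET_FPS):
    if fps <= target:
        return fps / target            # below target: score scales linearly
    # above target: extra frames still count, but with diminishing value
    return 1.0 + 0.25 * (fps - target) / target

print(round(playability_score(50), 2))   # 0.83 -- below target, fully penalized
print(round(playability_score(100), 2))  # 1.17
print(round(playability_score(250), 2))  # 1.79
print(round(playability_score(500), 2))  # 2.83 -- 500 fps is worth much less than 2x 250 fps
```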
JuanElMinero@reddit
The concern is valid, but it's also a few years late.
This issue OP presents has already been solved and anyone worth their salt uses geomean these days.
RealThanny@reddit
Your example is beyond bizarre, because each GPU is twice as fast as the other in two different games.
If you use the geometric mean, which is the only proper way to average disparate values like that, you end up with precisely the same number for each GPU.
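For reference, a quick check of that claim on the numbers from the original example (a minimal sketch; the variable names are mine):

```python
# Geometric mean of the hypothetical per-game FPS numbers from the post.
from statistics import geometric_mean  # Python 3.8+

gpu1 = [100, 250]
gpu2 = [50, 500]

print(round(geometric_mean(gpu1), 2))  # 158.11
print(round(geometric_mean(gpu2), 2))  # 158.11 -- identical for both GPUs
```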
Furthermore, your idea that frame times are somehow different is just wrong. You get precisely the same performance difference whether you use frame rate or frame time. Unless you do something silly like present one number as a percentage faster and the other number as a percentage slower.
Pamani_@reddit
When they do the arithmetic average of frametimes, it's effectively the same as doing the harmonic mean of frame rates.
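A quick numeric check of that equivalence, again using the hypothetical FPS numbers from the post (sketch only):

```python
# Check: inverting the arithmetic mean of frame times gives the harmonic
# mean of the frame rates.
from statistics import harmonic_mean, mean

fps = [100, 250]
frametimes_ms = [1000 / f for f in fps]           # 10 ms, 4 ms
fps_from_mean_frametime = 1000 / mean(frametimes_ms)

print(round(fps_from_mean_frametime, 2))  # 142.86
print(round(harmonic_mean(fps), 2))       # 142.86 -- same number
```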
Voodoo2-SLi@reddit
Presenting both possible calculation methods (arithmetic mean & geometric mean) immediately clarifies the issue: using the arithmetic mean for this kind of averaging is fundamentally wrong, because it significantly overweights the values of high-fps games. In the example, “Game 2” enters the overall calculation with roughly three times the weighting. That is of course not the objective; each game should carry equal weight in the overall result. The only way to achieve that is the geometric mean.
| | GPU 1 | GPU 2 |
|:--|:--:|:--:|
| Game 1 | 100 fps | 50 fps |
| Game 2 | 250 fps | 500 fps |
| Total average fps (arithmetic mean) | 175 | 275 |
| Total average fps (geometric mean) | 158.11 | 158.11 |
The geometric mean is now widely used (exceptions unfortunately still prove the rule).
This is very easy to explain: HWUnboxed uses a geometric and not an arithmetic mean.
Pamani_@reddit
This problem was brought up to HUB a few years ago and since then they moved to a geomean, which solves the problem. I also suggest you take a look at how Daniel Owen presents his data.
Soulspawn@reddit
This is the answer.
gumol@reddit
Depends on what average you choose. This entire discussion is solved if you use geometric mean as your average.
(Arithmetic mean isn't the only average)
EndlessZone123@reddit
It's been pretty regularly mentioned in CPU benchmarks though? Every CPU review mentions frame times and 1% lows, because CPUs usually have a bigger effect on micro stutters (unless VRAM is the bottleneck).
CanIHaveYourStuffPlz@reddit
Do you also factor 1% lows into your calculation?