sastdast@reddit
It says almost no one should do this. Does that mean the improvement shouldn’t be used, or that no one should be doing it the old way?
daidoji70@reddit
I think they're saying that you shouldn't do what they're doing, because it's very dangerous.
It makes sense in their context, but for most user applications (or kernel code, for that matter) it's better not to open the door to the race conditions their approach could introduce if they get it wrong.
ng37779a@reddit
Great writeup. The vDSO bypass + cached mult/shift trick is clever, but the part that should be more widely appreciated is the methodology in "Measuring tails" — separating the three update states using perf counters is not really highlighted in most benchmarks.
One thing I'd add: on AMD Zen the `rdtscp` vs `lfence;rdtsc` tradeoff isn't quite the same as Skylake; from what I've seen `lfence` is much cheaper there, so the serialization overhead profile flips. Worth measuring if anyone tries to port this. Also curious whether you tested with `CLOCK_MONOTONIC_RAW` as a sanity check on the kernel's frequency tracking, since the whole point of `VdsoCacheTimer`'s accuracy edge is trusting NTP-corrected `mult`.
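
For anyone who wants to try that sanity check, here's a rough sketch of what I mean. It isn't from the writeup: `clock_drift_ppm` is a name I made up, and it assumes a platform where Python exposes `time.CLOCK_MONOTONIC_RAW` (Linux, recent macOS). It only compares how much time the two clocks think has elapsed over the same interval, so it shows the size of the kernel's frequency correction, not whether the correction is right.

```python
import time


def clock_drift_ppm(interval_s: float = 1.0) -> float:
    """Relative difference between CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW.

    Hypothetical helper for illustration. Returns (mono - raw) / raw in
    parts per million over roughly `interval_s` seconds.
    """
    # Read both clocks in the same order at each end so the small gap
    # between the two reads mostly cancels out of the difference.
    mono_start = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    raw_start = time.clock_gettime_ns(time.CLOCK_MONOTONIC_RAW)

    time.sleep(interval_s)

    mono_end = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    raw_end = time.clock_gettime_ns(time.CLOCK_MONOTONIC_RAW)

    mono_elapsed = mono_end - mono_start
    raw_elapsed = raw_end - raw_start
    return (mono_elapsed - raw_elapsed) / raw_elapsed * 1e6


if __name__ == "__main__":
    print(f"MONOTONIC vs MONOTONIC_RAW: {clock_drift_ppm(1.0):+.3f} ppm")
```

Short intervals are noisy because a preemption between the paired reads dominates the result, so I'd run it over several seconds and repeat it a few times before reading anything into the number.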