Are x86_64 CPU Core ID Numbers Reported by the Kernel the Physical Core IDs?

Posted by mkvalor@reddit | linux | View on Reddit | 3 comments

APOLOGIA: As measured below, the "low" reported core-to-core latency is 41 ns and the "high" latency is 62 ns. So feel free to decide I'm silly for caring about these deltas and stop reading here. 😂

tl;dr - for "reasons", I'm trying to be a control freak about which processes get pinned to which cores on a single-die, multicore CPU. But I suspect I'm being "fooled" by the Linux kernel's enumeration of the core IDs, such that it's giving me virtual IDs, or at least IDs that change across reboots (with or without a kernel update). Is this true? And if so, can I configure anything to force the kernel to always give me the true physical core IDs?
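For context on what the kernel actually hands out: the IDs that taskset accepts are the kernel's *logical* CPU numbers, and `lscpu` (from util-linux, which Fedora ships) can show how each logical CPU maps onto a physical core and socket:

```shell
# The CPU column is the logical ID taskset uses; CORE is the kernel's
# physical core ID for that logical CPU, SOCKET the package it sits in.
lscpu --extended=CPU,CORE,SOCKET
```

On a 32-core/64-thread part, each CORE value should appear twice, once per hyperthread sibling.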

Rationale: CPUs of the type I'm using (see below) have many cores connected by an internal set of ring buses. This puts some cores much 'nearer' on the "bus line" to the on-die PCIe blocks (which carry signals from, e.g., Ethernet adapters). So I'd like to pin the processes that consume external Ethernet traffic to cores closer to the PCIe blocks on the ring bus, then pin the other processes (which manipulate this ingested information) to the cores closest to those initial cores, and so on.
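As an aside, the kernel does expose some device-locality information in sysfs, though to my understanding it reflects NUMA locality rather than ring-bus position, so on a single-die, single-socket chip it may simply list every CPU. A quick sketch (eth0 is a placeholder; substitute your real interface name):

```shell
# CPUs the kernel considers local to the NIC's PCIe device
# (NUMA locality, not ring-bus distance):
cat /sys/class/net/eth0/device/local_cpulist

# NUMA node of the device; -1 means the platform reports no locality info:
cat /sys/class/net/eth0/device/numa_node
```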

BACKGROUND: I'm running current Fedora 40 on a system with a single Intel Xeon Gold 6338 (3rd-gen Xeon Scalable) CPU with 32 physical cores / 64 hardware threads (RAM & storage are normal & unremarkable). I tend to run the system package updater once per week for all software, and I reboot the system if the kernel gets updated. There are probably only about 20 packages installed beyond the base system, including things like vim, make, and GCC - so it's pretty minimal.

MY EFFORTS: I'm fiddling around with some soft 'realtime' programming, using shell scripts to pin processes (single-threaded programs I've written in C or Rust) to core IDs with 'taskset', and then setting the scheduler and priority for those running PIDs with 'chrt'. So far, all of this has worked as expected. I don't make any kernel API calls in my programs to change cores or scheduling or anything like that. When I look at top or htop (actually I prefer 'atop'), I see that my processes are indeed pinned to the cores I specified.
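Concretely, the workflow looks roughly like this (`./ingest` is a hypothetical stand-in for one of my binaries, and the CPU number and priority are arbitrary; `chrt` needs root or CAP_SYS_NICE to set a realtime class):

```shell
#!/bin/sh
# Sketch of the pin-then-prioritize workflow described above.
./ingest &                 # start the worker in the background
pid=$!
taskset -cp 17 "$pid"      # pin it to logical CPU 17
chrt -f -p 80 "$pid"       # SCHED_FIFO, priority 80 (privileged)
taskset -cp "$pid"         # print the affinity back to confirm it took
```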

HOWEVER: I recently became aware of independent projects which claim to measure core-to-core latencies on CPUs, so I decided to try one published on GitHub: andportnoy/core-to-core-latency. It spit out a useful CSV file with my core-to-core latencies. I was surprised to discover that the program showed higher latencies between core IDs which should be much closer to one another on the internal ring buses. But I'm no hardware guru, so I thought, "Well maybe that's just how it works, or maybe there's a bug in the program, or maybe something else is going on I simply don't know about." By the way, the system was reasonably idle whenever I ran this.

BUT THEN: I used the Fedora package manager to update my system, which included a kernel update, and rebooted to pick up the new kernel. For no particular reason, I re-ran the above program afterwards and found that it spit out a different set of core latency relationships in the CSV output. But the pattern of the latencies was surprisingly similar; only the core IDs had changed. In other words: instead of (earlier run) physical core 29 with its associated hyperthread 61 showing the lowest latencies to many (but not all) cores near it on the physical ring bus, this time it was core 17 with associated hyperthread 49 that showed the very same lower latencies to new core IDs somewhat near it on the bus. I got a bit wise and made a shell script to run the 'measure' program (built from the GitHub project) several times in a row, sending all but the final run's output to /dev/null. All the latencies in the final CSV did go down modestly, but the "new" basic relationship of lower latencies around core 17 remained.
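That warm-up wrapper was essentially the following (./measure is the binary built from the andportnoy/core-to-core-latency repo; the run count is arbitrary):

```shell
#!/bin/sh
# Run the benchmark several times so caches and frequency scaling
# settle, keeping only the final run's CSV.
runs=5
i=1
while [ "$i" -lt "$runs" ]; do
    ./measure > /dev/null 2>&1
    i=$((i + 1))
done
./measure > latency.csv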

THE QUESTION(S): When I use 'taskset' on a system with a single multicore CPU on a single die, may I assume that the core ID numbers I pass as parameters map to the actual physical core IDs on the die? If not (by default), is there some combination of configuration setting, kernel boot parameter, or kernel build option that could force this to be true? Finally (for extra credit 😁), would anyone with knowledge of Intel hardware who took a few moments to examine the 'measure.c' source file in the GitHub repo care to offer an opinion as to why, for example, cores 0-6, which are fairly close to one another on the CPU ring bus, might report average-to-high latencies among one another, while some other core closer to the center or the far right of the die seemingly exhibits magically lower latencies to many of its nearby neighbors?
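For anyone wanting to check this on their own box: to my understanding, taskset takes the kernel's logical CPU index, while the physical core ID reported by the firmware/CPUID is exposed separately in sysfs. Comparing this output across reboots would show whether the logical-to-physical mapping actually moved (a minimal sketch, assuming a standard sysfs layout):

```shell
# Print each logical CPU alongside its firmware-reported physical core ID.
for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
    printf '%s -> core_id %s\n' "${cpu##*/}" "$(cat "$cpu/topology/core_id")"
done
```

Saving this output before and after a kernel update and diffing the two would answer the question directly for a given machine.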

Extreme gratitude to all who made it this far in my post. I appreciate the privilege of having access to this community!