This seems like a scheduling nightmare for Windows.
- 2 separate CPU dies with a presumably high core to core latency
- 3 different types of CPU cores
- LP-E cores being severely cut down and having even worse memory latency making them not comparable to regular E cores despite having the same architecture
- The problematic tile layout from Arrow Lake could potentially carry over
The neat thing about these 4 LPe cores is that they would be on their own separate CPU tile and when you're doing things like web browsing, writing emails, word docs or other light tasks, the I/O tile can completely power down the 2 8P + 16E CPU tiles to save power, increase battery life and reduce heat.
This was first seen in Meteor Lake, although it only had 2 LPe cores with an insufficient 2 MB of L2 cache, which made it really struggle to run background tasks like web browsers.
It was seen again in Lunar Lake. Intel increased the core count to 4 LPe cores and the L2 to 4 MB, which improved performance by as much as 87% at times and gave it Skylake-like performance, which is more than enough to handle web browsing, word docs, etc.
Sure, I guess. I was thinking more for desktop use. The LP cores can be used for system processes like filesystem services and whatever: things that always need to be operating away in the background to maintain the system.
It's worth noting that E cores actually perform better than P cores for workloads that require low power. Skymont performs roughly 20% better than Raptor Cove. My guess is that the upcoming Arctic Wolf E cores are going to have a similar performance boost compared to Lion Cove.
It's been rumored that Intel is reworking their chiplets to fix the poor fabric latency, so hopefully that's true. LPe Skymont on Lunar Lake only has 4 MB of L2 but it performs as well as Skylake, along with having great efficiency due to the lack of L3 cache.
Interesting if there are going to be LP cores on desktop. It never seemed like either Intel or AMD particularly cared about desktop power consumption, for both normal background usage and fully loaded. Would this just be an effect of them using the same tile between laptop and desktop, and the cores not having a high failure rate? And is software support there?
The core counts sound very nice compared to AMD, who are still refusing to bring their compact cores to mainstream desktop, but we'll have to see what Intel actually puts in the CPU and whether AMD finally increases CCD core count (and what the increase will be) with their rumored CPU redesign.
Sigh. I am an overall satisfied AMD customer, but I wish I had more than the 8 core options available for optimal gaming performance on any of their CPUs but the 7950X3D.
For my next CPU I hope either AMD offers me more cores without a performance penalty (the 7900X3D is only a 7600X3D in gaming) or Intel offers me the core count I want with better gaming performance and improved efficiency.
Though it doesn't look like either option is on the horizon in 2025.
For gaming and typical desktop usage, *very* little will use more than 8 performance cores unless you're explicitly multi-tasking high performance applications. It's just hard to parallelize the logic, sequencing, timing and data handling routines that live in CPU threads.
I guess maybe there might be a handful of games that can saturate more than 8 performance cores on a sustained basis but I've never encountered one. If you're running into problems make sure Windows or other background apps aren't doing random shit in the background that's sucking p-cores. That background stuff should be limited to e-cores but Windows core scheduling can do dumb stuff sometimes. You can just force those background apps off your p-cores or, even better, just make sure they don't run when you're using the computer. Get AutoRuns from SysInternals.com, it's free. If you don't stay on it, every damn desktop app you install (eg Acrobat, browsers, cloud sync, etc) will add a bunch of background tasks that run automatically on their own at random times to check for updates, phone home usage analytics, etc.
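If you want to pin a background app to specific cores by hand, the thing you end up needing is an affinity bitmask with one bit per logical CPU. A minimal sketch of building one follows; the layout (logical CPUs 0-15 as P-core threads, 16-31 as E-cores) is a made-up assumption, so check how your own chip enumerates its cores first, and `affinity_mask` is just an illustrative helper name.

```python
def affinity_mask(logical_cpus):
    """Return a bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in logical_cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


# Hypothetical layout (an assumption, not a real SKU):
# logical CPUs 0-15 are P-core threads, 16-31 are E-cores.
E_CORES = range(16, 32)

print(hex(affinity_mask(E_CORES)))  # 0xffff0000
```

On Windows a hex mask like this is what `start /affinity` expects, and Task Manager's "Set affinity" dialog does the same thing with checkboxes.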
If you ask about games specifically: strategy games that utilize up to 32 threads. If you ask about productivity: I do math on the CPU for data analysis. I will use as many cores as there are and ask for seconds.
As a 9950X owner, I’ve been really struggling to get anything to use these extra 8 cores beyond Cinebench and Prime95. Windows will only use those extra 8 cores if the other 8 are fully saturated and can’t take any more work at all, which basically never occurs with modern programs.
It would be a bit better if they could be powered up without bringing up the whole CCD from a low power state, though I think for pure gaming a 10 core CCD would be the perfect spot. I have a 7600X myself and while I don't ever feel like I'm getting low performance, there are some general desktop things like unpacking that just take a bit longer than I'd want.
I realized that since the only game I'm playing right now is Helldivers, the 7600X is perfectly fine.
I have a home server where once in a while I need a bunch of cores but most of the time it's idle
They will suddenly update their benchmarks to discard clock speed and cache entirely, and increase their emphasis on multi-core score, and then call the Ryzen 7 11800X3D (or whatever it's gonna be called) a refresh of the 9800X3D with the extra sprinkle of 'Neanderthal AMD marketing'.
Multithreading isn’t a yes/no thing, in a lot of software you will see performance plateau after N cores. If your software doesn’t scale past 16 cores, which is a substantial amount of software, you’d get lower performance since those 16 cores would be slower.
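To put toy numbers on that plateau: the sketch below assumes work scales linearly up to some core count and then flatlines, which is a big simplification. The 20% per-core slowdown and the 16-core scaling limit are made-up values for illustration, not measurements.

```python
def throughput(cores, per_core_speed, scaling_limit):
    """Toy model: work scales linearly up to scaling_limit cores, then plateaus."""
    return min(cores, scaling_limit) * per_core_speed


# Hypothetical numbers: 16 fast cores vs. 32 cores that are each 20% slower,
# running software that stops scaling at 16 threads.
few_fast = throughput(cores=16, per_core_speed=1.0, scaling_limit=16)
many_slow = throughput(cores=32, per_core_speed=0.8, scaling_limit=16)

print(few_fast, many_slow)  # 16.0 12.8
```

Raise `scaling_limit` to 32 and the 32 slower cores come out ahead, which is the whole trade-off in one line.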
They could do the same thing AMD does, which is that the cores are not really connected at all and the OS is directed to avoid chip to chip communication. Or they can do what they do in Xeon, which is using a high bandwidth connection to directly connect the buses on the dies. I’d say it doesn’t really matter much, they need to improve the SoC die a lot anyway.
That is what they do. Chiplet to chiplet connection is through the IO die and they don’t really maintain coherency between them. The OS handles them a bit like NUMA nodes, trying to contain processes within a chiplet.
This is a good strategy because it seriously reduces traffic on the bus, especially long range traffic. But it means the L3 is really split. Each core can directly access cache only on its own chiplet.
But the penalty can also be big in some cases when the process is not contained. This is why the single chiplet 8 core chip beats the 2 chiplet 12 core chip in some workloads.
> and they don’t really maintain coherency between them.
"Coherency" has a very specific, precise meaning in this context, and AMD *is* maintaining coherency between CCDs. It's just that the interconnect is slow enough that it isn't practical for one CCD to use the other's L3 cache as an L4 cache.
Yeah, I misspoke. I should have said they don’t do snooping between the chiplets but rather have a slower cache directory in the IO die. This incurs a heavy penalty if the CCDs have to handle the same data.
There's no connection between CCDs; everything gets routed through the IO die and is as slow as going to DRAM, which is a serious enough problem that the OS scheduler and memory allocator need to take it into account.
Nova Lake will be exciting because it will be the first CPU to introduce APX instructions, which extend the x86-64 GPRs from 16 to 32, closing the gap with ARM but not matching it due to increased opcode length. It will reduce pressure on the load/store units, which Intel claims will result in 10% fewer loads and 20% fewer stores, and support for APX can be added with a simple recompilation. This will be the first time the x86-64 GPRs have been extended since AMD introduced the 64-bit extensions over 20 years ago.
The AVX10 standard will also add support for 256-bit vectors to the existing AVX-512 standard, allowing the P and E cores to share the same ISA compatibility (Arctic Wolf will almost certainly support 256-bit vector lengths).
The L3 latency issues that were seen with Arrow Lake are rumored to be fixed with Nova Lake.
Extra registers seem like something that will finally make JIT compilers worth their salt, but I suspect most native apps are still sadly going to be compiled and shipped with SSE2 baseline for the near future, except for more demanding apps that already target AVX2 / by runtime selection.
I am still bummed that they messed up on AVX512 yet again instead of double/quad pumping it like AMD or earlier AVX, purely out of skill issue, but at least we'll get the missing compare instructions and some more of the spicy ones.
It's really because double pumping AVX-512 would increase E core die area without much benefit in return, and from what I've heard, quad pumping AVX-512 from 128-bit vectors would be difficult.
If you have E cores you may as well put as much multithreaded work on them as humanly possible, while investing more resources in the P cores so that a single instruction stream can be executed on as many ALUs inside the core as possible.
If you assume the big cores are 15W each, the E-cores 2W, and the LPE cores 1W, you end up right around 300W. That's in line with Raptor Lake without power limits.
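Spelling out that arithmetic as a quick sketch: the core counts below (16 P, 32 E, 4 LPE) are taken from the rumored config discussed in this thread, i.e. two 8P + 16E tiles plus 4 LPE cores, and the per-core wattages are the guesses from the comment above, not measured figures.

```python
# Rumored configuration discussed in the thread (an assumption, not a spec).
CORE_COUNTS = {"P": 16, "E": 32, "LPE": 4}

# Guessed per-core power draw in watts, as in the comment above.
WATTS_PER_CORE = {"P": 15, "E": 2, "LPE": 1}

total_watts = sum(CORE_COUNTS[kind] * WATTS_PER_CORE[kind] for kind in CORE_COUNTS)

print(total_watts)  # 308
```

That is 240W for the P cores, 64W for the E cores and 4W for the LPE cores, before counting uncore, memory controller or iGPU power.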
Bandwidth will be a very interesting topic of discussion with all of these cores. To be honest, I feel like this may be a Sierra Forest-AP (2 x 144 core) situation on a different platform, perhaps regaining the HEDT market from Threadripper. Then again, still exciting that this config could come out to mainstream desktop.
So, if 2E cores are roughly as fast as 1P core, are 2LP cores as fast as 1E core?
If my estimate is roughly correct, this will be like CPU with 33 normal cores.
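For anyone checking the 33 figure, here is the same estimate as a sketch. It assumes the rumored 16 P + 32 E + 4 LPE layout from this thread and the guessed 2:1 ratios from the question; `p_core_equivalents` is just an illustrative helper name.

```python
def p_core_equivalents(p, e, lpe, e_per_p=2, lpe_per_e=2):
    """Convert a mixed core count into 'normal' (P-core) equivalents.

    e_per_p: how many E cores match one P core (guessed ratio).
    lpe_per_e: how many LPE cores match one E core (guessed ratio).
    """
    return p + e / e_per_p + lpe / (e_per_p * lpe_per_e)


# Rumored configuration discussed in the thread (an assumption, not a spec).
print(p_core_equivalents(p=16, e=32, lpe=4))  # 33.0
```

The result swings a lot with the ratios, so the real answer depends on how close the E and LPE cores actually get to a P core.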
SPECint, the industry-standard benchmark:
https://blog.hjc.im/spec-cpu-2017
265K P core 11.1
265K E core 8.94
Also, Chips and Cheese got the same score for Skymont:
https://chipsandcheese.com/p/skymont-in-desktop-form-atom-unleashed
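Taking the two scores quoted above at face value, a quick sketch of the ratio they imply. This only reflects single-thread SPECint on the 265K as quoted here and says nothing about clocks, power, or the LP-E cores.

```python
# Single-thread SPECint scores for the 265K as quoted above.
p_score = 11.1
e_score = 8.94

# Fraction of a P core's score that one E core achieves.
ratio = e_score / p_score

print(f"{ratio:.2f}")  # 0.81
```

So on this benchmark one E core is roughly 80% of a P core, well above the "2 E cores per P core" rule of thumb.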
The SoC tiles for Meteor/Arrow Lake have a clear differentiation of LP-E cores for mobile but not for desktop. Intel dropping this distinction and including LP-E cores on future desktops would need a pretty strong justification.
Their initial implementation of LP-E cores really didn't work well at all, what with the lack of L3 cache, the horrible latency to DRAM despite the LP-E cluster being right next to the DRAM controller, and poor handling on the software side where multithreaded programs end up spawning too many threads because Windows reports the LP-E cores as part of the total core count but won't actually schedule any work on them.
Going up to 4 LP-E cores should help the system spend more time with the main compute tile(s) powered off, as twice as many cores will be more capable of handling the never-quite-idle background activity of a typical PC. Especially if the LP-E cluster gets a better cache hierarchy. I can easily see it being the right move for the mobile parts. But I'm doubtful that it would be worthwhile for the desktop unless Intel is trying to backtrack on the chiplet-crazy strategy and intends to share the SoC tile between desktop and mainstream mobile.