[Tom's Hardware] Intel Xeon 6+ ‘Clearwater Forest’ puts 18A in the data center with up to 288 cores, 576 MB of L3 cache — new Xeon 6990E+ is 30% faster per thread than 192-core AMD Epyc 9965, says Intel
Posted by Noble00_@reddit | hardware | 32 comments
ElementII5@reddit
Wow, looking at the packaging that looks like a really expensive CPU to make.
Compute chiplets, active base chiplets, I/O chiplets and EMIB chiplets. All stacked on top of each other and then on substrate.
The silicon might not be that expensive for Intel, though it is still a lot of silicon area, but packaging is never 100% successful.
If you compare this to the chip Intel chose to compare it with, the EPYC 9965, that one just uses chips on substrate. Much less convoluted.
To top it off, it's not even faster.
Rough math:
Intel Xeon 6+ threads x 1.29: 288 x 1.29 ≈ 372
AMD EPYC 9965 threads: 384
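The rough math above can be written out as a small sketch. It assumes total throughput scales linearly with thread count, which is the same simplification the comment makes; the function and constant names are illustrative, and the figures are the ones quoted in this thread.

```python
# Rough total-throughput comparison derived from a per-thread claim.
# Assumption: throughput scales linearly with thread count.

def equivalent_threads(threads: int, per_thread_ratio: float) -> float:
    """Scale a thread count by a per-thread speedup to get rival-thread equivalents."""
    return threads * per_thread_ratio

XEON_THREADS = 288        # Xeon 6990E+: 288 cores, no SMT
EPYC_THREADS = 384        # EPYC 9965: 192 cores with SMT enabled
PER_THREAD_RATIO = 1.29   # per-thread advantage used in the comment above

xeon_equiv = equivalent_threads(XEON_THREADS, PER_THREAD_RATIO)
breakeven = EPYC_THREADS / XEON_THREADS  # per-thread ratio needed for parity

print(f"Xeon equivalent threads: {xeon_equiv:.0f} vs EPYC {EPYC_THREADS}")
print(f"Relative total throughput: {xeon_equiv / EPYC_THREADS:.3f}")
print(f"Per-thread ratio needed to break even: {breakeven:.3f}")
```

Under these assumptions the Xeon would need roughly a 1.33x per-thread lead just to match the EPYC's total, so a 1.29x claim lands slightly short.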
Geddagod@reddit
Very impressive piece of silicon.
Well Intel did throw the packaging team under the bus when they delayed this part lol.
Intel is claiming it is more efficient though.
Toojara@reddit
The unit is kind of misleading: perf/watt/thread. If the absolute performance were the same with the given thread counts and TDP, the 6990E+ figure would be much higher. With a reasonable SMT benefit (~25-30%), both performance and efficiency (performance/watt) should end up more or less even on average.
ElementII5@reddit
Well, sure, but it's nothing we haven't seen for some time already. The question is, why is this even necessary if there is no tangible benefit justifying the cost?
Oh, how?
Looking forward to the Phoronix bench!
Geddagod@reddit
I think, after the MI300X, the most complex advanced packaging chip yet from anyone. Plus, this is all internal, so kudos to Intel for that.
DMR does seem to be taking a step back though. Prob for cost reasons.
I mean, yea I agree with you on that lol. They might just need all that packaging to remain competitive due to their architecture not being as good.
I was talking about how expensive CWF is, with some very rough and conservative calculations, on Anandtech a few months ago. Even from purely a design perspective it's more expensive, not accounting for Intel's likely worse yields.
MJ Q4 2024 earnings call:
I'm assuming when she referred to GNR in the second part there, she meant CWF lol.
I think it's pretty clear that they didn't want any delays attributed to 18A issues, as 18A is much more important to Intel than advanced packaging is, realistically. So she made it clear CWF delay was not due to the node but rather the packaging.
Yea, but I'm guessing the lack of AVX-512 is going to kill this part so hard compared to this in specint2017.
Slasher1738@reddit
Intel never understood why AMD's use of chiplets has been so profitable. Same design scales up and down the product stack. Just wait till AMD completes the mobile chiplet stack.
Psyclist80@reddit
And then AMD moved on to its Zen6 @ 2nm and 256/512 cores/threads, leaving Intel in the dust again. Production underway!
Geddagod@reddit
Better than what I was personally expecting. 1.3x specint2017 uplift over Turin Dense sounds good.
Can't wait for Phoronix to get their hands on this part.
anders_hansson@reddit
So Clearwater isn't beating Zen 5, and Zen 6 EPYCs are likely going to appear within a year with an alleged performance uplift of 20-30% per core (and with more cores).
SirActionhaHAA@reddit
Errrr it's launching in July, which is next month.
anders_hansson@reddit
Looking forward to seeing Phoronix get their hands on it.
Geddagod@reddit
I think it's going to be pretty close overall with Turin Dense, in specint2017 at least. Prob pretty workload dependent too.
I think Venice is going to dog walk this part lol. We will see though. I'm expecting it to launch at AMD's AI event at the end of July.
heylistenman@reddit
Intel continuing their streak of releasing a product that just about meets AMD only to get clobbered again right after.
Geddagod@reddit
Honestly I think a big problem is just Intel not being able to use the latest node in time. Their stuff "iso node" seems somewhat competitive. Though to be fair, part of it could also just be a result of their stuff iso node also just launching much later than when AMD originally launched their stuff (ex: SPR vs Milan).
Geddagod@reddit
Ok wait I'm blind, the claim is 1.29x better perf per thread in specint2017. They tested Turin with SMT on, and Turin has way more threads than Clearwater Forest...
jaaval@reddit
It should result in roughly the same total throughput given the difference in thread count.
The power efficiency comparisons still look good.
SirActionhaHAA@reddit
That don't really matter, turin dense is meant for thread dense workloads aka meta hosting their services. Total throughput ain't the point, the # of threads is. Turin classic has the same perf at fewer cores but with higher power. Different parts with different uses.
Geddagod@reddit
Yes. Rough napkin math based on the figures presented:
Turin Dense scores ~1450, divide by 384 threads for AMD per thread perf, multiply that by 1.3 for CWF per thread perf, and multiply it out by 288 to get total CWF nT perf of ~1410.
It does. Though the benchmark Intel chose in their Turin comparison does seem to favor CWF a good bit more than specint2017 does, not a big deal.
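As a sketch of that napkin math, assuming the ~1450 score and the thread counts quoted in the comment, and assuming per-thread performance multiplies out linearly (the function name is made up for illustration):

```python
# Napkin estimate of a total multi-thread score from a per-thread ratio.
# All inputs are the approximate figures quoted in the comment.

def estimate_total_score(ref_score: float, ref_threads: int,
                         per_thread_ratio: float, target_threads: int) -> float:
    """Reference per-thread score, scaled by the claimed ratio, times target threads."""
    per_thread_ref = ref_score / ref_threads
    return per_thread_ref * per_thread_ratio * target_threads

cwf_estimate = estimate_total_score(
    ref_score=1450,        # approximate Turin Dense specint2017 score
    ref_threads=384,       # EPYC 9965 threads with SMT on
    per_thread_ratio=1.3,  # Intel's per-thread claim
    target_threads=288,    # Clearwater Forest threads (no SMT)
)
print(f"Estimated CWF nT score: {cwf_estimate:.0f}")
```

This works out to a little over 1410, in line with the figure in the comment.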
zzzoom@reddit
Emphasis on "per thread". Turin dense has SMT, Clearwater Forest doesn't.
Geddagod@reddit
Yes, I just caught that yikes.... I take back my initial optimism on the part T-T
Noble00_@reddit (OP)
Finally we are getting CWF after what felt like ages of moving dates around. Intel's benchmark claims: 2.26x avg performance and 1.55x perf/watt against previous gen SRF, and 1.3x avg performance and perf/watt against Turin-Dense.
Geddagod@reddit
I think they only delayed it once haha. But it was a delay of ~6 months.
Noble00_@reddit (OP)
Really? Feels like CWF has been in the news for years and I remember simply feeling 'whelmed' that it didn't answer Turin soon after
SirActionhaHAA@reddit
Per thread, not avg perf.
Noble00_@reddit (OP)
Yeah, other comments have pointed this out, I'll edit for correctness, thanks
zzzoom@reddit
So their E-cores without SMT are 30% faster than half a Turin Dense core?
ResponsibleJudge3172@reddit
Not half, when has multithreading ever doubled performance?
Geddagod@reddit
This would be a good point if Intel was comparing Clearwater Forest without SMT vs Turin Dense without SMT. This take would also have CWF as prob being around as fast as Turin Dense, maybe slightly faster, in specint2017 with Turin having SMT on. Which is, personally, what I expected, as I said a few months ago based on the numbers Intel presented when they first talked about CWF.
When you look at the footnotes though, Intel claims they tested Turin with SMT turned on, for specint2017. What this means, is much worse. As u/zzzoom claims, this would be then claiming that an E-core is 30% faster than "half" a Turin Dense core.
What this means for total nT perf is terrible. Turin Dense scores ~1450. Divide that by 512 threads to get "per thread" Turin perf, multiply that by 1.3 to get CWF per thread perf, and then multiply that by the number of threads CWF has, and you get ~1060. Also for reference, GNR scores ~1230.
Honestly this take is so bleak, and the perf is so unbelievably bad, that I'm convinced that I'm missing something or the footnotes are mislabeled. I would be glad if anyone can point out where I went wrong.
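The ~1060 result depends heavily on the thread count used as the divisor. Other comments in this thread count the EPYC 9965 at 384 threads (192 cores with SMT), so a quick sensitivity check is sketched below. It assumes the same ~1450 score and linear scaling; the variable names are illustrative.

```python
# Sensitivity of the napkin estimate to the assumed Turin Dense thread count.
# Inputs are the approximate figures quoted in the thread.

TURIN_SCORE = 1450       # approximate specint2017 score for Turin Dense
PER_THREAD_RATIO = 1.3   # Intel's per-thread claim
CWF_THREADS = 288        # Clearwater Forest threads (no SMT)

estimates = {}
for turin_threads in (512, 384):
    per_thread = TURIN_SCORE / turin_threads
    estimates[turin_threads] = per_thread * PER_THREAD_RATIO * CWF_THREADS
    print(f"Divisor {turin_threads}: CWF estimate ~{estimates[turin_threads]:.0f}")
```

With 512 as the divisor the estimate comes out near 1060. With 384 it comes out near 1414, which would put CWF roughly level with Turin Dense rather than below GNR.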
zzzoom@reddit
?
If you doubled the performance of a core with SMT (and you don't, you might get 110%) each thread would run at 100%, not 50%.
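To make the "half a core" argument concrete, here is a minimal sketch. It assumes a two-thread SMT core whose total throughput is (1 + uplift) times its single-thread throughput, split evenly between the threads. The uplift values are the ones floated in this thread (10%, 25-30%, and a hypothetical doubling), and reading "110%" as a 10% uplift is an interpretation, not something the comment states outright.

```python
# Per-thread share of a core under SMT, and what a 1.3x per-thread
# claim implies relative to a full (single-threaded) core.
# Assumption: SMT throughput is split evenly across the two threads.

def per_thread_share(smt_uplift: float, threads_per_core: int = 2) -> float:
    """Fraction of a single-threaded core's throughput that each SMT thread gets."""
    return (1.0 + smt_uplift) / threads_per_core

PER_THREAD_CLAIM = 1.3  # Intel's claimed per-thread advantage

for uplift in (0.10, 0.25, 0.30, 1.00):
    share = per_thread_share(uplift)
    ecore_vs_full_core = PER_THREAD_CLAIM * share
    print(f"SMT uplift {uplift:.0%}: each thread = {share:.1%} of a core, "
          f"E-core = {ecore_vs_full_core:.2f}x a full core")
```

Only a 0% uplift makes each thread exactly half a core, and only a 100% uplift makes each thread a full core. At the 25-30% range mentioned earlier, the 1.3x claim works out to an E-core at roughly 0.8x of a full Turin Dense core under these assumptions.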
heylistenman@reddit
Wait, Darkmont? Somehow I was convinced they were using Skymont. But it makes sense as they already have Darkmont working on 18A.
Pretty solid uplift versus previous gen, more than was predicted. But it’s also a much more complex chip.
r1y4h@reddit
Note that the article title "Xeon 6990E+ is 30% faster per thread than 192-core AMD Epyc 9965" is misleading. In the same article, Intel's slide says both per thread and per watt. So 30% is not the raw performance advantage over AMD's part.