Why 10 GHz CPUs are impossible (Probably)
Posted by Forsaken_Arm5698@reddit | hardware | View on Reddit | 215 comments
DaddaMongo@reddit
There was so much free performance available back in the late 90s and early 00s. I was running a 3.4 GHz Pentium 4 at 4 GHz with mad cooling. I don't know if software development has since mitigated the problems of parallel processing, but when multicore processors first started appearing it was a major concern.
Forsaken_Arm5698@reddit (OP)
Since then IPC has been the major driver of single-core performance gains, but even that seems to be hitting diminishing returns these days across all camps (ARM, x86, RISC-V).
Competitive_Towel811@reddit
I mean CPU performance is mostly a function of memory latency. 95% of a modern CPU is just trying to make up for the fact memory is so much slower than logic.
admalledd@reddit
Right, I don't have the numbers on hand, but the memory of 20-40 years ago was proportionally much closer in speed (in all terms) to its CPUs than in today's CPU/memory topologies. My memory (heh) is that SDRAM of the 90s topped out around 1GB/s on the higher end (per DIMM? or was it per bank?). Since then we've gotten to "about" 50GB/s per DDR5 DIMM (specifically common consumer desktop memory, ignoring LPDDR/CAMM2/etc. for simplicity). So that's 20+ years for "only" a ~50x increase, while CPU speeds are wildly more performant even in single core. Using SPECint2006, which only covers a portion of that timeline: scores went from the 10s circa 2006 to the 10s of thousands by ~2017. The gap would be even bigger if we went back to the 90s.
We (developers) are exceedingly hamstrung by the memory wall. Most of the performance gains at the hardware level are "make memory fake-faster" tricks: TLBs, prefetch caches, branch prediction that then prefetches the predicted memory references, SIMD from AVX to NEON to RVV, etc., all pushing toward "full pipe" memory throughput efficiency. That's not even getting into the absolute insanity occurring at a low level in software to make things like strings more compact/cheap, or JIT compilers recompiling your working code smaller or removing/inlining memory references so they aren't "so far apart"... wild wild times.
If memory was instead commonly 10x faster than it is now, we'd see some wild shit. Most AI compute things are memory throughput constrained as well, and they are just brute forcing it by designing the hardware to have hyper-wide memory busses instead of "tall".
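A rough back-of-the-envelope of that gap (a sketch in C with assumed round numbers, not measurements: ~5GHz, 8 cores, 16 double-precision flops per core per cycle, and ~80GB/s from dual-channel DDR5):

```c
#include <stdio.h>

int main(void) {
    /* Assumed round numbers, not measurements of any specific part. */
    double ghz = 5.0, cores = 8.0, flops_per_cycle = 16.0;  /* e.g. two 4-wide FMAs */
    double mem_gb_per_s = 80.0;                             /* dual-channel DDR5, roughly */

    double gflops   = ghz * cores * flops_per_cycle;  /* ~640 GFLOP/s of raw compute */
    double gdoubles = mem_gb_per_s / 8.0;             /* ~10 billion doubles/s from DRAM */

    /* Every value streamed from DRAM has to be reused roughly this many times
     * (via registers and caches) or the cores just stall waiting on memory. */
    printf("compute: %.0f GFLOP/s, memory: %.0f Gdoubles/s -> ~%.0f flops per load\n",
           gflops, gdoubles, gflops / gdoubles);
    return 0;
}
```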
No_Slip_3995@reddit
Tbf not all app developers are hamstrung by the memory wall. There are applications like Cinebench that fit comfortably in a CPU’s L2 cache, which is why performance scales so well even on CPUs with slow RAM and small L3 caches.
HeinigerNZ@reddit
Holy shit. I never knew this.
Competitive_Towel811@reddit
Yeah, that's why GPUs can have so much higher throughput by cutting out all that extra stuff and just focusing on doing the most math possible on the specific workloads where latency isn't a constraint and instead bandwidth is.
HeinigerNZ@reddit
And I guess that if they had a way to make memory a lot faster they would have done so already. Are there any ideas/technologies on the horizon to improve this, or are we stuck with this situation?
jmlinden7@reddit
The speed in question is latency not bandwidth/throughput.
admalledd@reddit
Realistically, "big" L2/L3, on-die unified memory, hyper-wide memory buses, etc. all do enough that cutting latency significantly matters less than the lack of width. Would I take a 10x improvement bringing memory down to 1-2ns latency? Shit yea I would, but if I had to choose between 10x bandwidth and 10x better latency? I would choose bandwidth and still ask for more. I semi-regularly write programs where I am memory bandwidth constrained; CPU designs and modern programming techniques make dealing with latency far more tolerable than in the past. Yea, it still sucks, but bringing far-memory latency from 10-15ns down to 1-2ns would change less than you'd think, besides greatly reducing the need for L3.
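A minimal C sketch of the bandwidth-vs-latency distinction being argued here (assumes a working set well past L3; exact numbers vary by machine): a sequential sweep is bandwidth-bound because the prefetchers hide DRAM latency, while a dependent pointer chase pays the full latency on every single access and extra bandwidth doesn't help it at all.

```c
#include <stdio.h>
#include <stdlib.h>

#define N (1u << 24)   /* 16M elements, comfortably bigger than most L3 caches */

int main(void) {
    /* Bandwidth-bound: independent, sequential accesses the prefetcher can stream. */
    double *a = malloc(N * sizeof *a);
    double sum = 0.0;
    for (size_t i = 0; i < N; i++) a[i] = 1.0;
    for (size_t i = 0; i < N; i++) sum += a[i];

    /* Latency-bound: each load depends on the previous one, so nothing can be
     * prefetched and every cache miss costs the full trip to DRAM. */
    size_t *next = malloc(N * sizeof *next);
    for (size_t i = 0; i < N; i++) next[i] = (i * 2654435761u) % N;  /* pseudo-random hops */
    size_t p = 0;
    for (size_t i = 0; i < N; i++) p = next[p];

    printf("%f %zu\n", sum, p);   /* keep the compiler from deleting the loops */
    free(a); free(next);
    return 0;
}
```

Time the two traversal loops and the streaming one will typically be many times faster per element, which is the whole "bandwidth first" argument above.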
jmlinden7@reddit
The vast majority of CPU workloads are latency constrained, not bandwidth constrained. You have to understand that people use their CPUs to scroll Instagram and swap between 200 tabs in Chrome.
admalledd@reddit
Most so-called latency constrained programs, with respect as someone whose job it is to care, generally fall into two camps: (1) programs whose compute performance is not a metric they even measure, or (2) programs written like shit.
Nearly any/all web-app based programs are exceedingly badly written, and the few that try to be well made have higher project priorities like collecting every byte of data they can on you to profile for ads/sell.
Tell the developers of these latency constrained programs to get with the picture of the past 20+ years and learn to use multiple cores/dispatch. Ah right, web/JS is still, and likely forever will be, single threaded. It's not like we have other paradigms we could use, nooo...
jmlinden7@reddit
The vast majority of users use badly written webapps.
admalledd@reddit
Then they should pressure the vendors of those to fix their shit.
jmlinden7@reddit
Why would they do that when pressuring CPU manufacturers to find workarounds for latency issues (branch prediction, etc) is more effective?
admalledd@reddit
Which is why I mention government regulation. We already have energy efficiency regulations for PSUs for example. Regulation could reasonably slap the stupid out of software vendors being inefficient.
jmlinden7@reddit
CPU manufacturers can pump out better branch prediction algorithms faster than the government can pump out regulation. Consumers are not going to go for the slower option.
admalledd@reddit
Are you aware of the concept that government is for and by the people? And that regulations can be proactive and far-reaching? We've known web apps were this inefficient since the mid 2000s; that's over twenty years of chances. Various flavors of regulatory capture, and people forgetting or being apathetic about what governments are for, don't mean you should just give up too. Stop making excuses without any fight at all.
jmlinden7@reddit
The US is generally anti-regulation and it seems unlikely that voters will want the government to micromanage what type of software you are allowed to release. The entire point of the IBM-model of personal computing is to allow anyone to release any software.
admalledd@reddit
There is a specific reason why I cited the energy efficiency regulations, you know? Because they show a clear path to answering your exact concern about micromanagement. Read up on that, then cite it and let me know how a similar law for software would still stifle software innovation.
Wait_for_BM@reddit
The basic 1-transistor DRAM cell hasn't changed, so memory latency hasn't improved by anywhere near an order of magnitude and won't. There isn't much you can do to improve speeds. SRAM can go faster, but at 6-8 transistors per cell it doesn't scale well in power or density.
What you are seeing in bandwidth improvement comes from subdividing the large memory array into smaller logical blocks, keeping multiple memory banks active, pipelining reads a line of memory at a time, and hiding part of the write cycle in the pipeline. All of this is done in synchronous logic around the old analog DRAM cell.
Don't expect any major improvement any time soon. Past improvement does not imply future performance.
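The arithmetic behind that split, as a hedged sketch (typical catalog-style numbers, not a specific DIMM): the bandwidth figure falls out of the interface clock and burst/bank parallelism, while the first-word latency is still set by the analog row/column access described above.

```c
#include <stdio.h>

int main(void) {
    /* Typical catalog-style numbers, not a specific part. */
    double transfers_per_s = 4800e6;  /* DDR5-4800: 4.8 GT/s on the interface */
    double bus_bytes       = 8.0;     /* one 64-bit channel */
    double first_word_ns   = 16.0;    /* CAS latency ~ mid-teens of ns, barely moved since SDRAM */

    printf("peak bandwidth: ~%.1f GB/s per channel\n", transfers_per_s * bus_bytes / 1e9);
    printf("first-word latency: still ~%.0f ns, set by the DRAM cell/row access\n", first_word_ns);
    return 0;
}
```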
admalledd@reddit
Oh I am well aware of DRAM's limitations and where it has gotten its improvements from and how unlikely we are to see any advances.
I just deeply wish there were a sudden 10x+ leap once more for memory, but it is highly unlikely.
goldcakes@reddit
Think of it the other way: memory (especially latency) reached maturity and came close to fundamental physical limits far earlier than logic did.
hackenclaw@reddit
Let's not forget, 512KB of L2 cache per core dates back as far as the Pentium II, and AMD Ryzen is still stuck at only 1MB.
Sure we have L3, but I don't think the amount of cache is enough to make up for how much more CPU performance we've gained since the Kaby Lake 7700K.
admalledd@reddit
On memory latency: that hasn't scaled at all. In the 1990s SDRAM was "about 10-15 nanoseconds, with some kits able to be clocked to reach 8ns". Today's DRAMs (be it HBM, DDR, whatever is off-die) are, due to physics, still within that 6-12ns range. It is exceedingly difficult to get much faster than about three nanoseconds each way because of the speed of light and electron saturation requirements.
On cache: increasing cache is exceedingly difficult because of how interconnected it must be for each memory line, i.e. the associativity of the cache.
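For the physics point, a rough sketch of just the wire part of that latency budget (assumed geometry, not a measured board; drivers, termination, and the array access itself eat the rest):

```c
#include <stdio.h>

int main(void) {
    /* Assumed geometry: ~15 cm of trace from socket to DIMM slot,
     * ~15 cm/ns signal propagation on FR-4 (roughly half of c). */
    double trace_cm  = 15.0;
    double cm_per_ns = 15.0;

    printf("copper alone: ~%.1f ns each way before a single DRAM cell is touched\n",
           trace_cm / cm_per_ns);
    return 0;
}
```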
airmantharp@reddit
DRAM has been at 50ns to 150ns for thirty years…
Strazdas1@reddit
IPC only became the primary driver once frequency scaling became impossible. If we could have continued scaling frequency, IPC would have mattered far less.
InflammableAccount@reddit
And the aftermarket cooling products sucked back then compared to now. That is to say, even a cheap $25 single tower cooler wipes the floor with anything made before 2004.
DaddaMongo@reddit
I was running phase change refrigerator compressors so you are wrong.
Strazdas1@reddit
salvaged from an actual fridge?
InflammableAccount@reddit
Fair, fair. But I was referring to aftermarket cooling products. Products made for PC cooling.
You used parts that weren't originally made to cool a CPU.
DaddaMongo@reddit
Actually, they were. Back then there were a couple of companies selling this gear for PCs, along with lots of water cooling companies. Here's some info about one such product:
https://www.asetek.com/company/about-asetek/asetek-heritage-technology/vapochill/
feanor512@reddit
Don't forget the Prometeia Mach 2.
InflammableAccount@reddit
Holy balls of fire, I completely forgot about the VapoChill.
I'm not surprised that I forgot about it. Never saw one in person, only ever read about it. The fact that it cost about $1000 in today's dollars might be why I didn't pay more attention.
But hell yeah dude, how was it? How long did it last and how many systems did you run in it?
DaddaMongo@reddit
I had the later standalone ones, the VapoChill LS: one on the CPU and one modified to fit my ATI Radeon. Ran it until quad core became the norm, but like all PC equipment there comes a point when you have to retire the tech. I also ran a water chiller for a while, but things move on.
theholylancer@reddit
Wasn't it because sub-ambient cooling takes something like exponentially more power as the heat load grows?
Like, sub-ambient for 50W vs 100W vs 250W is already nuts, and if you want to apply that to a 600W 5090 then you'd better have extra power circuits, because you need one for the computer and another for the cooling system... or I guess 240V.
Plank_With_A_Nail_In@reddit
You know what you were doing wasn't common right? 99.9999% of PC enthusiasts use off the shelf consumer cooling solutions.
Lol reddit is weird.
RandoCommentGuy@reddit
My Core i7 920 was 2.66GHz and I was able to push it to 4GHz. I was even playing VR on that chip with my HTC Vive, even though it was multiple generations older than the minimum requirements.
fordry@reddit
That original X58 platform was such a beast. The 6-core CPUs are certainly not top of the line, but absolutely still adequate for a lot of stuff, 15 years later.
Impeesa_@reddit
I just retired mine for the third time within the last month. After doing 7 years as my primary desktop, I dragged it back out to refurbish it with an X5680 (originally a 930 at 4.0 GHz) and doubled up the RAM to 12 gigs. Dirt cheap upgrade by then, and it served for a little while as a home server, and then again as my kid's first computer. It was getting a little cranky about some things, but I heard no complaints about the Minecraft performance. If I want to drag it back out again for something else, it's still good to go.
RandoCommentGuy@reddit
Yup, around 2016 I switched from the i7 920 to a Xeon X5650 for $50 and used it for another 3 years of VR gaming. It's still running; I have Ubuntu on it just to mess around with, and it still runs great.
derangedsweetheart@reddit
Had a 990X on an R3E clocked to 4.5GHz on air.
Had some awesome Micron 1600MHz sticks that easily clocked to 2133MHz on stock voltage, and a tiny overvolt made them run at 2400MHz.
RandoCommentGuy@reddit
Damn, nice. I think I just stuck with 1600MHz; it's OCZ Gold 1600MHz and I have 6 sticks in the build. Maybe I'll try upping their speed, they might use Micron chips.
Kougar@reddit
Even up till 2006. At 1.86GHz, the E6300 was running circles around the 3.4GHz Pentium D despite the latter being a full 1.5GHz faster, especially in games. Which made it all the more incredible that the E6300 could handle a mild 100% overclock to 3.8GHz and run 24/7 stable without exotic cooling, as long as the motherboard could run a high enough FSB. Then you'd have all the benefits of high clocks combined with high IPC. Those were the fun days!
hackenclaw@reddit
Remember, the original Pentium through the Pentium III had only a 25W TDP.
The original Athlon/Athlon XP, which got slammed for "high power consumption", had only a 40-75W TDP.
The Pentium D (a dual-die Pentium 4) was said to be a power hog (it was rated 95W-130W).
You can even see this with GPUs: the high-powered Radeon 9700 Pro was 40W, the Fermi GTX 480 was 250W, and now we've got the 5090 taking 600W.
So we weren't getting performance just from shrinking transistors; we're trading it for higher power.
RuinousRubric@reddit
Dennard scaling died. Back in ye olden times, cutting transistor area in half cut power in half too, so the extra transistors were free as far as power draw was concerned. That stopped being true around 20 years ago. Each new node started increasing transistor density more than they reduced transistor power draw, so the power density of chips started increasing. All else being equal, the only way to keep the same power draw after a modern node shrink is to make the chip smaller or clock it lower relative to its maximum. Don't do those, power inevitably goes up.
They still care just as much about efficiency, mind you, and the new nodes are still more efficient. Just not by as much as they used to be.
Also, I don't think it's actually bad for power draws to get as high as they have been. You could still scale up to really high power draws in the old days, you just did it by adding GPUs with SLI or crossfire. And the scaling with those tended to be atrocious. Doubling GPUs got you like 50% more performance in games if you were lucky, 0% if you weren't. Going from a single smaller GPU to one that's double the size (and thus draw) can be expected to get nearly double the performance in most everything as long as you don't run into a bottleneck elsewhere in the system.
Quealdlor@reddit
We can still have 10x the per-clock performance of the P4, at 2x the frequency, with 16C/32T and lots of new instructions, compared to 2005.
No_Slip_3995@reddit
The highest clocked Pentium 4 was 3.8Ghz, it’s 2026 and we are nowhere near doubling that clock speed without liquid nitrogen.
ZappySnap@reddit
That’s because the original Pentium had a rather small, chip sized passive heatsink. By the time we went to the Athlon XP series, now it was best to have a heavy fully copper heatsink with active cooling.
What we use today for cooling would have seemed absolutely insane then.
I had one of these bad boys on my AthlonXP: https://www.overclockers.com/wp-content/uploads/images/stories/articles/Thermalright_SK7/sk71.jpg
Strazdas1@reddit
The stock coolers that you get packaged with CPUs now look pretty similar to what I used for my Athlon XP. It was the GPU that was the worst offender when it came to heat on that setup. It's the only GPU I actually burned up (back then safety throttling wasn't a thing).
obiwansotti@reddit
I went pin fin:
https://www.overclockers.com/wp-content/uploads/images/stories/articles/Swiftech_MCX462_/swif1.jpg
R-ten-K@reddit
There is more to computing platforms than low-end PCs.
Even in the 80s, high end CPUs implemented using ECL were dissipating hundreds of watts, and required rather "exotic" cooling solutions.
A lot of the current water cooling and liquid immersion techniques, for example, are derived from 70s/80s systems.
Jerithil@reddit
Always interesting seeing the old 80s/early 90s supercomputers and mainframes and all the exotic stuff they had to do to cool them.
Plank_With_A_Nail_In@reddit
r/hardware only really understands gaming hardware.
toddestan@reddit
Originally, you didn't even have a CPU cooler. I don't remember seeing many coolers before the 486 came out. Yes, you just had a bare CPU in your PC, and it worked just fine.
The whole slot thing that happened with the Pentium II and contemporary Athlons was partly due to concerns that new CPUs would start getting so hot that we'd need to cool them from both sides. Fortunately it turned out that wouldn't be necessary, mostly because we started using heat pipes in CPU coolers.
One thing modern PCs do very well is idle power. My Athlon XP PC idled at like 150 W and was something like 180 W at full load at the wall. My i9 idles at something like 90 W, but on the other hand full load is over 600 W.
xole@reddit
That reminds me of a friend who put sockets (like for wrenches) on his AMD 40MHz 386 to cool it for overclocking. Was it safe? No. Did it work? Yes.
mediandude@reddit
Pentium-75 had a passive heatsink.
Ryzen APUs can be run with a passive heatsink.
Culbrelai@reddit
My 32 core threadripper idles at like 40w lmao
railven@reddit
The 486 DX4/100MHz definitely benefited from a passive heatsink, but the packaging didn't include one.
~10 year old me trying to figure out why my PC was crashing during the summer on a brand new upgrade I barely understood what was what. I learned how to flip jumpers that summer :D
Plank_With_A_Nail_In@reddit
It didn't work just fine; overheating on PCs was a real issue for real work. Most home users didn't see it because they just played Minesweeper and used Word, so their CPU was mostly idle.
theholylancer@reddit
You know we're talking about the 1980s, right? The era of DOS, when not even Windows 1 was fully out.
R-ten-K@reddit
FWIW, most of the push for the slot in the late-90s era was to have cache on the package, and to screw over x86 clone vendors by kicking them off the socket/chipset for that Intel generation.
ZappySnap@reddit
Yeah my 386/33 was a bare chip.
airmantharp@reddit
DX40 gang here!
ZappySnap@reddit
Lucky. Mine didn’t even have the coprocessor. Just an SX.
airmantharp@reddit
I had to make sure, I couldn’t play X-Wing without it!
rddman@reddit
Rather it had a small heatsink because its power draw was modest.
ZappySnap@reddit
Yes…obviously. Brain went too fast and inserted because instead of ‘why’.
tarmacjd@reddit
It’s not because of the heat sink, the heat sink is because of the power usage
DesperateAdvantage76@reddit
On the plus side, 1800W is the limit for typical American circuits (120V 15A), so they can't keep raising the power usage without killing their market.
reallynotnick@reddit
I feel like in the desktop space they are just clocking everything to the limits to score high in benchmarks when everything would be much more efficient at a lower frequency. No one seems to care about performance per watt unless it’s battery powered.
sir_sri@reddit
And in the data centre.
Data centres are mostly power limited, that's pretty much always been the case. You have a facility that can handle some number of kilowatts or megawatts, and you want the best performance you can get for whatever workload you have (database transactions, number crunching, etc.) per second.
Underclocking a CPU to have it hit peak perf/watt can be done at home, but it's not really worth the headache. Oh sure, you can gain 10% perf/watt and reduce peak power usage by 5%, that type of thing... but why? Most of the time your CPU isn't at 100% load anyway. I'm pretty sure you can do the same thing with a GPU, but both CPUs and GPUs can turn themselves down if they're not needed anyway, so again, it doesn't seem like a very worthwhile use case. Whether you're using 800 or 850 watts or whatever is not going to break the bank on the electricity bill, doubly so if you're somewhere you have to pay to heat the house, since after all a computer is just an electric resistive heater that's slightly inefficient because it does math during the conversion.
FinancialRip2008@reddit
Not to argue your point, but this doesn't match my experience. Both my Zen 2 and especially my 12th-gen desktops (not the newest, but it's what I own) will shovel power unless asked not to. Turning down the power has a relatively small impact on performance compared to power consumption, at least until you're down at laptop levels of power. That's for gaming, but also for all-core benchmark stuff that doesn't match what I do with a computer.
With GPUs they usually scale better with power, but I've also seen 10-year-old games that will scarf down >300W for basically no reason unless you cap power and/or framerate or whatever you want. I also had an RX 590 that performed better with a power and voltage limit.
OTOH, I was running a 3700X/X570/6900 XT with dual monitors for a while. The 3700X was horrible at idle, the X570 was horrible at idle, and the 6900 XT had the VRAM 100% spooled up constantly with 2 monitors and was horrible at idle. It idled at like 120W. It was just a bad build for a machine that ran 24/7, but I didn't know better when I built it.
VenditatioDelendaEst@reddit
That is because the power limit is a ceiling, and most workloads are lightly threaded enough, and spend enough time idle, that they won't hit it unless you crank it down quite a ways from stock.
But in that case it's not making an appreciable difference to actual power consumption either, so.
Plank_With_A_Nail_In@reddit
Did you fiddle with your power settings? Your experience might not be the same as everyone else's if you turned the slider to "Max performance".
thewafflecollective@reddit
In my experience it's closer to saving 30% power for 10% less performance. Performance scaling gets really bad for those last few % performance, but manufacturers do it anyway to make their product appear more competitive
sir_sri@reddit
Is that 30% overall power draw, or 30% peak power draw? Those are different things.
Even there, OK, you save maybe 100 watts of average power draw; that's costing you literally 1.8 cents per hour in electricity in the US, maybe 2.5 to 3 euro cents in Europe. So how many hours a year are you using your computer at a load where that matters? 2,000 hours a year?
That's where the data centre, at 8,760 hours a year of run time, of course matters: 30% power savings for 10% performance would add up a lot, and lets you squeeze in more hardware for the power budget.
soggybiscuit93@reddit
That's 100 extra watts of heat in your room.
sir_sri@reddit
Well, the house.
It is like two lightbulbs from 15 years ago. Not a big deal.
Again, obviously, in the data centre that's more heat you need a way to dissipate. But for home users, a lot of the day it's just heat which you need anyway, and if you need air conditioning, this isn't any different than a couple of 60-watt light bulbs from a few years ago.
It's not 0, but again we are talking single digit cents per hour that it's costing to run this, even with cooling it.
thewafflecollective@reddit
Overall power draw when gaming. Yes, the RTX 3000 series was notoriously power inefficient by default. I personally got a performance delta of about 5% with a custom volt/frequency curve and a -30% power limit, but it depends on your workload and how lucky you are with silicon quality.
And it's mainly for heat and noise management, not saving money. 3080 FE cards had pretty bad cooling and the heat gets pretty bad during summer, but you took what you got during the COVID era.
Jeep-Eep@reddit
I mean, I'd like some uplift in GPU perf per watt again just so we can have the cooling solutions merely 'loud chungus' instead of 'Oh lawd he coming, I can hear it!'
frostygrin@reddit
Heat and noise while you're next to the computer. 10% can be a lot when it's near the limit of the cooler's capabilities.
Hour_Firefighter_707@reddit
Exactly. No one except Apple cares, and that means the single-core performance of x86 laptops is a lot worse than that of both MacBooks and desktops.
As a current Windows laptop user, it really sucks
lord_lableigh@reddit
Intel 3 series says hi.
That is changing though. I hope Intel can claw back their profit and convince people that Panther Lake is the right direction going forward for mobile devices. Those are shockingly battery efficient, even besting the M5 at times. Also, I cannot wait for this new era of large integrated graphics dies to make 5060-level performance common.
Hour_Firefighter_707@reddit
Panther Lake isn't fast though. The fastest Panther Lake CPUs are just about at base-model M5 level in multi-core performance, but that's not the problem.
The problem is that they have about the same single core performance as Raptor Lake, just at lower power draw. PTL is slower per thread than M3. Zen 5 is slower than M3. Even on the desktop.
Don't get me wrong. Intel has made big improvements to efficiency. Their performance through it all has stayed the same though
Quealdlor@reddit
I too am surprised that both Intel's and AMD's performance per clock isn't higher already. At least 25% higher. This will change with Zen 6 and Nova Lake, I presume.
lord_lableigh@reddit
Agreed. But I'd rather take that efficiency any day over juiced-up P-cores. We've been seeing much better improvement in E-cores for the last few generations. I don't know where they're going with this, but I'm sure as hell curious to see what they can do in 2-3 gens, hopefully on the latest Intel node by then.
airmantharp@reddit
Intel is still treading water trying to get their fabs back in line - there really should be no doubt whether they can design faster cores.
The question for the last decade has been whether they can reliably manufacture them…
mediandude@reddit
But there should be much doubt whether Intel can design faster cores without cutting Spectre and Meltdown corners.
airmantharp@reddit
Lol, they cut those corners because they didn’t expect to be using those cores for more than one generation.
ComplexEntertainer13@reddit
They are designing a faster core, people are just looking in the wrong place.
The P cores are in maintenance mode frankly. The real action is in the E core department.
Educational-Web31@reddit
Yes, but at what cost? The P-core team has fallen out of their saddle in recent years.
Hour_Firefighter_707@reddit
Arrow Lake was on N3B. It isn't really any faster than Raptor or Meteor Lake
airmantharp@reddit
That’s kind of the point though - they’re not putting their transistor budget into making faster cores because they’re having to pay TSMC. And they don’t really need faster cores, they need more efficient cores right now.
If their fabs weren’t sucking wind, they’d be doing both - faster and more efficient.
Educational-Web31@reddit
And according to rumours, Nova Lake will have M4-level ST, while Apple will have the M6 out by then.
AMD might have M5-level ST.
RumbleversePlayer@reddit
Isn't PTL single core similar to ARL, not raptor lake?
Educational-Web31@reddit
Intel Core Series 3 you mean? coz Intel 3 is a node name.
lord_lableigh@reddit
Obviously, yes. That's why the word series exists right after that.
Forsaken_Arm5698@reddit (OP)
And Snapdragon X, with several asterisks.
Educational-Web31@reddit
what asterisks?
Forsaken_Arm5698@reddit (OP)
Only some SKUs have Boost, and hence get the full fledged ST performance.
onolide@reddit
Apple does, if you count Mac desktops (e.g. Mac mini, Mac Studio). While the Mac Pro idles at 40W+, the Mac Studio (which also carries the M Ultra SoCs) idles at ~10W, which is crazy for a chiplet-based SoC with that many cores (and no battery, so fewer power considerations).
virtualmnemonic@reddit
Modern CPUs already have good power efficiency under minimal load, and they spend the vast majority of their time under said load. Realistically, people aren't slamming their CPUs all day, so the larger power draw under infrequent heavy loads is worth the trade-off.
Jeep-Eep@reddit
And... well, even the Hyper 212 3DHP can keep a dual-die AM5 CPU under control, and if all else fails you can land an LFIII for a reasonable price most places to tame the blighter, not to mention there are no connector problems; high top power on CPUs isn't that much of a pain in client.
GPUs are a rather different kettle of fish.
shawnkfox@reddit
Google, Amazon, etc care a lot about efficiency for their data centers. For personal computers people don't care as much, but they probably should.
theholylancer@reddit
I mean... M chips say hi.
People do care, just not for desktops; really only for mobile, but that's a huge segment of the market.
vemundveien@reddit
I have no idea why anyone in the desktop space would care about performance per watt these days though. If they have a light workload, a mid-range CPU or a laptop will do the trick, and performance per watt is definitely a thing there. A high-end CPU for the desktop market should target the best performance possible, because the customer for that CPU is someone who cares about performance more than they care about power consumption.
R-ten-K@reddit
... other than most of industry and academia. You are right, nobody cares.
IguassuIronman@reddit
Why would they? Power isn't necessarily cheap, but if you can afford a high-end, high-power GPU/CPU you can also afford the electricity to power it, and most people will care more about maximizing what they're getting than making sure it's the most efficient.
VastTension6022@reddit
I mean, for a single core, 40-75W was pretty bad.
GrimGrump@reddit
Just to put that into perspective for people: at the same per-core wattage, a 10700K would be running at 300W with no hyperthreading. That's about the same TDP as a 13900K if you push it hard.
exomachina@reddit
Higher Power Consumption isn't a bad thing. We can generate more power. We should be focusing on removing the barriers for generating power. We have so much land available.
shanghailoz@reddit
Power=heat. We already have planetary warming issues.
exomachina@reddit
Oh so we can't have faster hardware because the sun has been blasting us with solar radiation 24/7 for the last 4 billion years?
xylopyrography@reddit
M5 is pretty much the counterpoint to this.
Sure, power scaling isn't super great, but a 26 W chip has ridiculously more performance than anything from 10 years ago, 900 W parts or not.
puffz0r@reddit
The M5 doesn't operate at high frequencies though? It's 1+ GHz behind AMD and Intel, which is why it can be so efficient.
HulksInvinciblePants@reddit
It's still top performance per watt. A lot can be attributed to the node and Arm64's focus on efficiency, but the outcome is still the outcome.
WingedGundark@reddit
It is also due to the simple fact that increasing power to improve clock speed yields diminishing returns.
This was very clear with NetBurst, for example, where Intel pushed the power to reach ever-higher frequencies, but every increase in clock speed meant an exponentially rising power curve. That architecture capped at 3.8GHz, and there probably wasn't much left without risking the silicon.
There were also architecture changes on the way from Willamette to Northwood and Prescott etc., but higher clocks were always at the core of the design and lots of things were sacrificed for them. As far as efficiency goes, it is smart not to push the highest possible clocks. Of course, M-chips have other means which more than make up for the clocks. Apple can design and manufacture these expensive chips because it is not in the business of selling CPUs, but of integrating them into its own products, where they are just one part of the overall cost.
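A sketch of why that power curve goes superlinear (using the usual C·V²·f dynamic-power relation; the linear voltage-with-frequency assumption is illustrative, and real V/f curves are worse near the top):

```c
#include <stdio.h>

int main(void) {
    /* Assumption: near the top of the V/f curve, the voltage needed rises
     * roughly with frequency, so dynamic power grows close to f^3. */
    double f0 = 4.0, v0 = 1.1;   /* assumed baseline: 4 GHz at 1.1 V */
    for (double f = 4.0; f <= 10.0; f += 2.0) {
        double v = v0 * (f / f0);
        double rel_power = (f / f0) * (v / v0) * (v / v0);
        printf("%4.1f GHz -> ~%4.1fx the power of %.1f GHz\n", f, rel_power, f0);
    }
    return 0;
}
```

By that crude model, 10 GHz lands around 15x the power of 4 GHz, which is roughly the wall NetBurst ran into.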
Educational-Web31@reddit
which is why I am not so enthusiastic about the upcoming 5ghz phones.
https://www.notebookcheck.net/Snapdragon-8-Elite-Gen-6-Pro-tipped-to-hit-at-least-5GHz-clock-speed.1212148.0.html
WingedGundark@reddit
Yeah. It is simply physics that higher clocks equal increased power consumption: all those transistors don't change state by magic, it takes electricity. And the more you change states, the more you consume 🙂
Educational-Web31@reddit
less to the node, and more to the microarchitecture.
puffz0r@reddit
I mean yeah, they have the best microarchitecture team on the planet but the fact remains that they use the most advanced node and run their chips low on the vf curve so it's very power efficient.
xylopyrography@reddit
Frequency is irrelevant. Performance is what matters, and for 26 W it's a monster, especially in single-thread.
LostInChrome@reddit
I mean, did you watch the video? We did very much get increased performance with negligible difference in power up until the mid 2000s. The whole reason we started getting dual-core CPUs in the first place was that we hit a physics breakpoint where previously-irrelevant issues became increasingly problematic and Dennard scaling broke down.
185EDRIVER@reddit
Yes, but compute per watt is going up; we're just densifying the living shit out of it.
ComplexEntertainer13@reddit
It was only a single core though. By today's standards that is actually a rather high power budget in an MT scenario. You only really see similar power consumption from a single core in low-thread scenarios in most devices, or if you run unlocked power/overclocking on desktop.
25W per big core on a 14900K is 200W already.
PoL0@reddit
Current CPUs are clocked higher too. And I also read somewhere that a current CPU built on Pentium-era tech would use (on the order of) thousands of watts. The improvements in performance per watt are huge.
NerdProcrastinating@reddit
* with silicon based transistors
III-V@reddit
Yes. So long as the industry stays in silicon land, we'll be stuck here.
dingo_xd@reddit
But is there any realistic alternative to silicon?🤔
Strazdas1@reddit
Sort of. Glass substrates have shown some really promising results. If only we could make them economical.
MrMPFR@reddit
I thought that was only for interposers and substrates.
Are there any R&D chip prototypes using this to push clocks?
Strazdas1@reddit
I think Intel had some lab-level prototypes that promised clock increases, before they decided to stop that research to cut costs :D
The theory is that better substrates and interposers, along with the better thermal management they bring, will lead to higher clocks. But so far we have no products with that, of course.
MrMPFR@reddit
That's a shame.
Then I hope that additional overhead boosts performance enough to offset the cuts to the HW needed to hit iso-package cost.
III-V@reddit
Technically? Yes. Economically? Not right now. The wind is slowly blowing that way, though.
JuanElMinero@reddit
This article from SemiEngineering gives a good overview of the current situation.
Quick summary:
A global replacement for all silicon based ICs? Not anytime soon, but some classic Si applications are gradually replaced.
GaN replaced a lot of Si in power ICs for consumer electronics.
SiC replaced a lot of Si in high-voltage applications e.g. EVs and public transport.
2D materials/TMDs show a lot of potential for optical and wireless applications.
In 2025, a bismuth-based 2D material showed superior switching speeds vs. silicon in a cutting edge experimental node. As always, manufacturing at scale and integration into current fabs are the real challenges.
Educational-Web31@reddit
This is why I laugh at the Zen6 7GHz rumours.
Believers cite Zen3 (4.9 GHz) -> Zen4 (5.7 GHz), which is a +16% clock boost from one node jump (7nm -> 5nm). Since Zen5 -> Zen6 is a double node jump (5nm -> 3nm -> 2nm), they believe a bigger clock bump is possible (20%+). Of course, they are forgetting the power wall!
Strazdas1@reddit
I remember when the "power wall" was 3.4 GHz.
Exist50@reddit
There's no hard wall, especially when comparing between nodes. And Zen 4 was a very efficient core despite the clock speed increase.
InflammableAccount@reddit
It also could be that AMD is about to start pulling some Intel-inspired Tau PL1/PL2 boost behavior. Just pulling that out of my ass as baseless speculation.
I direly hope they fix the heat-transference issue that Zen4's thicker IHS introduced. This would ALSO help with boosting higher in a given scenario.
kyralfie@reddit
AMD could make a vapor chamber IHS if push comes to shove. I think I saw a prototype of one in one of their fab engineering tours.
InflammableAccount@reddit
... Huh. That sounds absolutely fascinating. Was this a personal tour or a tour video you watched?
kyralfie@reddit
Yeah, it's that one.
InflammableAccount@reddit
It's so weird... rewatching the video the thermal engineering fellow says they saw a "6c thermal benefit" in an "all core workload" with a 7950X delidded. Direct die cooling.
That's so far off of what der8auer or random users got by going direct die that it confuses me. Most see a 15-20c decrease at least.
airmantharp@reddit
It’s not a problem in the real world.
InflammableAccount@reddit
I assume you're referring to the IHS.
When talking about getting the most out of a CPU, like with PBO enabled, it certainly can be the limiting factor.
airmantharp@reddit
If you’re pushing the limits, then you’re well beyond the efficiency curve. You can do it, and there’s probably a few percentage points of improvement left on the table with a good sample, but you’ll pay for it in heat and potential instability today or further down the line.
InflammableAccount@reddit
All true.
And to every person using an AM5 CPU that isn't an enthusiast, nearly irrelevant. However, enthusiast land is where we are currently talking. And that 3-9% performance boost is something a lot of us appreciate.
The smart move is to set the PPT to a nice middle ground.
Mina_Sora@reddit
The unreleased Pentium 7 did 7Ghz on 20nm or something.
Zen 6 7Ghz is 100% possible with the power savings and increased transistor budget for clocking higher from jumping 2nm.
Kougar@reddit
None of that is remotely accurate. Cedar Mill was overclocked to 7, even 8GHz using liquid nitrogen, and that was a 65nm P4. There were never any samples of Tejas in the wild, certainly not at 45nm or even 22nm. On air cooling, P4s didn't tend to go past 5GHz with stability, which is the metric you should be using.
At most Zen 6 might boost to 6Ghz, but even that isn't going to be a guarantee especially the X3D parts. AMD bins its more power efficient chiplets into EPYC parts first anyway.
puffz0r@reddit
4nm -> 2nm should be around a +25% performance-at-iso-power uplift, and Zen 5 is at 5.7GHz now. The idea that you can only get 300MHz out of that node jump is frankly just as unbelievable as 7GHz.
ResponsibleJudge3172@reddit
That's if you port Zen 5 as-is with only physical design improvements. Most IPC gains require more power because you are moving more data.
Qesa@reddit
The figures TSMC quotes for improved performance or power are at a particular part of the V/f curve chosen for having the largest improvement. Typically nowhere near the maximum frequency.
puffz0r@reddit
Nah, Zen 3 to Zen 4 was a ~15% speed increase, which is what TSMC advertised from 7nm to 5nm. Zen 4 used more power but it also had 60% more transistor count.
Qesa@reddit
If only that was generally true rather than a single cherry picked example that was paired with architecture improvements. What about the node shrink before it? Glofo 12nm was known to be slower than TSMC 16nm, and they claimed 7nm was 40% faster than 16nm. But Zen 2 only clocked about 10% faster than Zen+. Or on the GPU side, same 7 and 5nm, RDNA3 clocked no higher than RDNA2 (likewise the 7nm 7600XT clocked the same as its 5nm siblings)
Kougar@reddit
But that's not the only constant that was changed. Now factor in the 50% increase in core counts that have to sit within the same power budget.
AMD is already clocking outside the power efficiency curve with Zen 5, particularly with the 9850X3D. It's not a question of if they can hit 6GHz, it's a question of whether it still makes sense to do so at the given power budget, in conjunction with the increased core count and any increased cache sizes, factoring in whatever average yields AMD is getting with the new node.
puffz0r@reddit
Oh, right. I forgot they're supposed to go up to 12-core CCDs. I honestly think they'll push to 150W; tbh I find 6.2-6.4GHz pretty likely achievable with 2 full node jumps.
Kougar@reddit
I would disagree, at that point you need to factor in the thermal density problems node shrinks cause in conjunction with clocks. A substantial portion of modern silicon is 'dark' just to buffer out the hotspots, but as various hotspots get closer together it creates more problems and therefore more dark silicon. Anandtech had some really great explainers on this, RIP Anandtech. That being said I'm not sure how much dark silicon is in Zen 4, if it was mentioned in video interviews it's slipped my mind. But it quickly becomes more economical to keep the clocks below whatever the value is where clocks will create localized hotspots of instability that would require significant dark silicon to mitigate.
Noreng@reddit
45nm was a huge clock speed jump, with 32nm improving further still, so it's possible that a hypothetical 32nm Pentium 4 could have clocked to 6 GHz or higher on ambient cooling. A 6 GHz Pentium 4 would still not have been remotely competitive with Core 2, let alone Nehalem.
As for Zen 6 clock speeds. I am expecting an improvement. Part of the reason why we're seeing clock speed increases again is because the improvements to density are slowing down, which means an increased reliance on clock speed. See how Zen 5 clocks slightly higher than Zen 4, while Zen 3 is significantly slower.
Kougar@reddit
Aye, but that was the tradeoff they made. They chose a uArch design layout that prioritized clockspeed above IPC. I'd even agree with you, theoretically it should have no problem hitting 6Ghz at the smaller nodes. But ultimately clockspeeds are a poor substitute for better IPC. Intel has face planted itself against the clockspeed wall twice now, first with Netburst and then a second time with Raptor Lake. If Intel hadn't been juicing the voltages with a half-dozen types of overlapping boosting behaviors to try and eke out every last hertz the degradation issues wouldn't have happened. Alder Lake had absolutely none of these problems.
I do have to wonder what comes next though, ultimately Zen is still just a uArch family. Intel and AMD both will have to figure out new massively overhauled designs at some point and I'm wondering how many more times that can happen given the constraints of x86 itself. FRED seems like the sort of thing needed to throw off some of that deprecated legacy baggage but I'm not an engineer to really know how many other things similar to FRED are possible with x86.
As per the clockspeed improvement it seems a safe bet. Just remember there's now 50% more cores using the same power budget, plus any additional cache size increases. I'd gladly take the same clocks with a dozen cores to be honest. Whatever AMD does for the sorely overdue IOD overhaul will also affect the general power and thermal budgets though, too.
puffz0r@reddit
Zen 5 also didn't have a full node shrink from Zen 4; N4 is basically a better N5.
reallynotnick@reddit
I believe in 2004-2005 it would have been 90nm, possibly; 2012 was 22nm.
Jumpy_Cauliflower410@reddit
Clock speed isn't limited solely by power but also by design. The Zen 3 to Zen 4 clock speed bump came from design improvements. They could do the same for Zen 6. TSMC is suggesting their 2nm node can do +10% max frequency over previous nodes.
ResponsibleJudge3172@reddit
They're saying that AMD, using HD libraries, is increasing clocks, IPC, and core count at the same die size and TDP.
ResponsibleJudge3172@reddit
It's OK for clocks to go up, but people ALWAYS believe AMD are miracle workers and others are incompetent.
In the same TDP, AMD is going to clock near 7GHz and increase core count by 50%? And increase IPC, ALL AT THE SAME time? And the die size is still the same as Zen 5? That's literally what the leakers are saying. Damn me if they are right.
Intel can only double core count by using more than double the TDP. Also, they're using double the TDP to match the same 2-CCD design AMD uses.
Noreng@reddit
While 7 GHz sounds a bit optimistic, I wouldn't be surprised if Zen 6 can do 6.5 GHz. The reason AMD and Intel are now cranking up the clock speed is because the density improvements are drying up, so cranking clock speed is the way forward.
The video doesn't even explain its own title.
SirMaster@reddit
Isn’t cranking clock speed what Intel did in the Pentium 4 days?
Or didn’t seem to work out too well and better architecture won out over chasing higher clock speed.
airmantharp@reddit
Pentium IV did exactly what it was designed to do - but the target usecases changed.
Quealdlor@reddit
In 2038 people will look at this video and laugh while using their 10 GHz CPUs.
OddRule1754@reddit
Yeah, it's the same as how 5GHz was a magical barrier just a few years ago, and now mainstream chips are running at 5.7GHz.
floorshitter69@reddit
I didn't think 5GHz was feasible on an air cooler. Effort was focused on multithreading for a long time, too. Now that there is significant interest in per-core efficiency, we're probably closer to 10GHz than we think.
Glittering_Power6257@reddit
Back when I built my PC (Haswell), 4 GHz was largely relegated to overclockers. Now 4 GHz easily exists in smartphones with the prospect of breaching 5 GHz soon.
Sol33t303@reddit
But wasn't the world record overclock of a CPU 10GHz?
That proves it's not impossible.
Cubanitto@reddit
I hear Nvidia is thinking about making a 5090 Ti. It can hit a thousand watts.
x7_omega@reddit
2005: 90nm IBM Cell peaked at 5GHz, 1.3V
2007: 65nm IBM Cell peaked at 6GHz, 1.3V
"Implementation of the CELL Broadband Engine in a 65 nm SOI Technology Featuring Dual Power Supply SRAM Arrays Supporting 6 GHz at 1.3 V" (IBM at 2007 IEEE ISSCC).
2026: 1~2nm, ~0.65V
"10 GHz CPUs are impossible."
Probably. But only because of "lesser sons of greater sires", not because of physics or architecture.
Quealdlor@reddit
I remember reading in 2013 that 20 TB 3.5" HDDs are physically impossible. And here we are in 2026, when 44 TB HDDs are being sold and 100 TB HDDs (as well as 1000 TB SSDs) are on the horizon by around 2030. Although there certainly are some limits somewhere, for example, it may well be that 1000 TB 3.5" HDDs are indeed impossible and 360 TB is the maximum for 3.5".
x7_omega@reddit
In a CPU, the limits are well understood. For it to be fast, it must be physically small, not huge. A 1mm² die with a simple architecture (early 2000s) is estimated to be capable of a 15~25GHz clock. That is why Cell could be so fast: it was not that simple, but compared to modern 1000+ instruction monsters it was simple. There are algorithms and programs that are sequential and get no benefit from multicore monster CPUs - for that use case a 20GHz simpler CPU (or core within a monster CPU) would be far better. A smaller die is easier to cool, voltage can be higher without overheating, and so on. It would also be cheap, as such a small die means very high yield and tens of thousands of chips from a single wafer.
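The "fast must be small" part can be sanity-checked with simple arithmetic (the on-chip signal velocity below is an assumed order-of-magnitude figure; real RC-limited wires with repeaters are far slower than c):

```c
#include <stdio.h>

int main(void) {
    /* Assumed effective velocity for repeated global wires, ~c/10. */
    double mm_per_ns = 30.0;

    for (double ghz = 5.0; ghz <= 20.0; ghz += 5.0) {
        double period_ns = 1.0 / ghz;
        printf("%5.1f GHz: %5.0f ps per cycle, ~%.1f mm reachable per cycle\n",
               ghz, period_ns * 1000.0, mm_per_ns * period_ns);
    }
    return 0;
}
```

Under that assumption a signal covers only a millimeter or two per cycle at 20 GHz, which is why such a core has to be a tiny die (or a tiny region of a bigger one).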
blueredscreen@reddit
Not a good idea to claim that anything could be impossible in this industry, so long as it does not violate the laws of physics. And even those have changed.
NullStringTerminator@reddit
We'll probably start seeing photonic computing (computing using light) in dedicated devices such as graphics cards soon; the technology is already in development and is currently being used in some specialized applications. Photonic computing allows for much faster clock speeds and significantly lower power draw. After the technology has proven itself, it will likely be implemented in other components such as RAM and the CPU.
Starks@reddit
Nonetheless, Intel deluded themselves into thinking NetBurst would scale to 10 GHz. Tejas was a serious proposal.
Quealdlor@reddit
They still managed to 100x performance from the January 2000 Pentium III to the January 2011 Sandy Bridge 2600K. And then double it again with the i7-5820K, and then double it yet again with the i5-12600K.
Alarchy@reddit
Deluded, yes, but NetBurst did make it to 8GHz 20 years ago. Took LN2 to do it, but it's still pretty impressive.
Verite_Rendition@reddit
Do keep in mind that Netburst's ALUs were double-pumped, as well. So while the chip was running at 8GHz overclocked, the ALUs would have been running at 16GHz!
ifred@reddit
Tejas and Jayhawk never even had a functional tape out. They only made it to the TDP test before throwing it in the bin. The writing was on the wall when the Israeli team was updating the Pentium3 design for mobile and the Pentium 4 580 @ 4Ghz was canceled.
Sometimes I like to daydream about a world where BTX won out, we saw Cedar Mill Pentium Vs on Netburst2 running at 10ghz, and ATI was still making beautiful hardware.
hackenclaw@reddit
Nobody accepted a 250W CPU back then. If they had, the Pentium 4 would probably have run a few more years before hitting the 250W mark.
UGMadness@reddit
I don't think the PSU and VRM tech was there yet for such high power draws. A few percentage points in the efficiency of a PSU at 1000W can mean doubling or tripling the heat generation of that unit. 3.5" HDD stacks and 5.25" optical disk drives also choked airflow inside PC cases with no easy solution.
DemoEvolved@reddit
Quite a lovely video, good illustrations and narration. But it fails to specifically say why 10Ghz is actually impossible. It says 8Ghz puts out a lot of heat. Ok but impossible heat?
Pound_Potential@reddit
I expected a more thorough explanation. Boy, was I underwhelmed.
Prompt any LLM to give you the script for an explanation of the initial question. Prompt it to shrink it down to a 5-minute YouTube video understandable by non-physicists.
Voila, the initial question is not even answered properly. Instead, throw in the popular Moore's law and some quantum tunneling and expect the viewer to be satisfied.
Works for some people I guess…
Forsaken_Arm5698@reddit (OP)
Yeah, it wasn't a very good video, but the topic is interesting, which is why I posted it here, and it has sparked a great discussion.
3G6A5W338E@reddit
At some point, got to create a time compression zone encompassing the CPU.
IgnorantGenius@reddit
I feel like they will just put multiple 2.5ghz processors on one chip like they do now, but then offset the clock by a quarter clock for each chip and claim the end result is 10 ghz.
ConTron44@reddit
read this as 10Hz oops. give me slow puter
Impeesa_@reddit
"Why 10Hz CPUs are impossible, probably: People would not buy them today because that is very slow."
jun2san@reddit
This guy's videos are great. I recommend subscribing to his channel if you're interested in this stuff.
g33ksc13nt1st@reddit
Linus:
- Hold my beer...
(probably)
Loose_Skill6641@reddit
To scale frequency super high, they'll need a different type of transistor.
Wait_for_BM@reddit
Changing transistors won't help much when there is significant RC delay in the traces connecting them (interconnect delay).
As you shrink the dimensions on finer nodes, you are also reducing the cross-sectional area of the wires, increasing their resistance. Without going to superconductor/carbon-nanotube-type exotic materials, there isn't much you can do to reduce that.
https://www.appliedmaterials.com/us/en/blog/blog-posts/challenges-to-interconnect-scaling-at-3nm-and-beyond.html
Basic geometry & physics: https://www.vlsisystemdesign.com/interconnect-scaling-trends/
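A first-order sketch of that geometry argument (idealized square cross-section, capacitance per length treated as roughly constant, which is the usual first-order simplification; barriers and liners make real copper worse):

```c
#include <stdio.h>

int main(void) {
    /* One node step shrinks linear wire dimensions by ~0.7x.
     * R per length ~ 1/(width * thickness), C per length ~ constant. */
    double shrink  = 0.7;
    double r_scale = 1.0 / (shrink * shrink);  /* ~2x resistance per unit length */
    double c_scale = 1.0;

    printf("RC delay per unit length grows by ~%.1fx per shrink step\n", r_scale * c_scale);
    return 0;
}
```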
Pyglot@reddit
Localized 10 GHz isn't a huge issue per se. But at scale it isn't worth it. Both PDP and EDP are likely more optimal at lower frequency, except for really busy circuits like SerDes.
PastaPandaSimon@reddit
I remember as a kid it was a pretty commonly repeated idea that 2GHz and beyond would be physically impossible. In the days of 100-200MHz CPUs, the only things with gigahertz frequencies we could comprehend were waves, like in a literal microwave.
The idea that we won't reach 10GHz in CPUs sounds even less legit than that did.
doscomputer@reddit
If you're thinking there will be something actually novel or interesting in this video, it's not here, but you do get an offer to buy their book.
Nothing about the physics or the true limitations of silicon integrated semiconductors.
trejj@reddit
This video is a great example of how YouTubers chasing video ideas and interesting narrative storytelling end up with factual inaccuracies that, over time, lead to historical revisionism. So sad. :(
s4nk1@reddit
This is a simplified view; the underlying reason for the GHz limit has to do with the material science of silicon and how it interacts with EM waves (electron mobility). Other materials can bring down the V and C in that equation and one day give us CPUs with frequencies over 10 GHz.
Mac_NCheez_TW@reddit
Guess no one told them about AM3 4 cores hitting 10ghz.
Aggravating_Cod_5624@reddit
10 GHz CPUs are impossible because of silicon's low band-gap energy.
RBeck@reddit
Then maybe they're possible with Graphene or GaN processors.
Aggravating_Cod_5624@reddit
https://www.linkedin.com/pulse/bismuth-revolution-how-new-metal-chip-outperforms-could-hattangadi-8gaec/
Aggravating_Cod_5624@reddit
Yes, they are possible that way, but the issue here is that current production technology is not compatible with such materials.
The nearest possible material we could replace silicon with is bismuth selenide, and that's about it.
Tower21@reddit
Using current technology.
In my less-than-half-century of existence, the number of times a novel technique has changed the existing landscape is almost unfathomable.
So, I guess we will see.
Artistic_Unit_5570@reddit
For tech, the world is ready to take on the physical challenges of more powerful CPUs, but in the end a 10GHz CPU just wasn't worth it: it's not really that much faster, and multi-core is better.
shableep@reddit
Unless they figure out photon based computers. Recently, last year, they figured out the photon based transistor. So it’s actually on its way, though still a long ways away.
Farbenzentrum@reddit
I have heard people at uni are working on sub-THz computing using polaritons/spintronics.
Appropriate_Name4520@reddit
So the original Crysis will never run perfectly? :/
EmergencyCucumber905@reddit
Intel has one in the works. It's called NetBurst.
RandomGuy622170@reddit
Impossible is just a word to the mad scientist! I chose the impossible. I chose Rapture!
100GHz@reddit
10x that is actually possible.
Flies away in the sunset