Gorgon Halo is 6.7% faster than predecessor Strix Halo

Posted by Terminator857@reddit | LocalLLaMA | 14 comments

Gorgon Halo: 8533 MHz memory, Strix Halo: 8000 MHz. AI workloads are typically memory bottlenecked. 8533 MHz / 8000 MHz ≈ 1.0666, i.e. about 6.7% faster. Conclusion: not a worthy upgrade; best to wait for Medusa Halo, expected summer of next year, for a 50% increase in AI performance.
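
The headline number can be checked with a few lines of Python. This is only a sketch of the arithmetic: it takes the two memory speeds quoted above at face value and assumes, as the post does, that AI performance scales directly with memory speed. The function name `uplift_percent` is made up for illustration.

```python
# Memory speeds quoted in the post (LPDDR5X data rates, written there as "MHz").
STRIX_HALO_RATE = 8000
GORGON_HALO_RATE = 8533


def uplift_percent(old_rate: float, new_rate: float) -> float:
    """Percentage increase going from old_rate to new_rate."""
    return (new_rate / old_rate - 1.0) * 100.0


if __name__ == "__main__":
    gain = uplift_percent(STRIX_HALO_RATE, GORGON_HALO_RATE)
    print(f"Gorgon Halo vs Strix Halo: {gain:.1f}% higher memory speed")
```

If the memory bus width also changes, this ratio alone no longer tells the whole story.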

Previous discussion: https://www.reddit.com/r/LocalLLaMA/comments/1swiylm/comparison_of_upcoming_x86_unified_memory_systems/

AMD has not released details yet on memory bandwidth for Gorgon Halo. https://www.tomshardware.com/pc-components/cpus/amd-ryzen-ai-max-400-gorgon-halo-packs-up-to-192gb-of-unified-memory-refreshed-apu-uses-zen-5-and-rdna-3-5-and-can-clock-up-to-5-2-ghz

Bulky-Priority6824@reddit

Cool now I just need 6000% more $

Equal_Passenger9791@reddit

I will still likely buy one, depending a bit on price of course. Having up to 160gb vram opens some doors, even if it is a bit slow

Borkato@reddit

How slow is slow? How does it compare to a 3090?

Terminator857@reddit (OP)

Faster than a 3090 if the model doesn't fit in the 3090's VRAM. ~5x slower otherwise.

Borkato@reddit

Oh yeesh. So whatever I’m currently running on a 3090 with great speed would be 1/5 the speed, but I’d be able to run bigger models much faster than a 3090 can currently run with cpu offloading for those bigger models? Interesting

Civil_Response3127@reddit

It's written in the post.

Borkato@reddit

No it’s not, it’s whatever MHz is. I don’t know if that translates directly to speed of actual generation and if so how. I know 3090s have a rate of 936GB/s but I don’t think that’s the same thing. I asked because I figured someone would be able to say “about 2x” rather than me having to google it, but you can be unhelpful if you’d like
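
For anyone else wondering how the "MHz" figure relates to a GB/s number like the 3090's, here is a rough sketch. Theoretical peak bandwidth is the data rate times the bus width in bytes. Strix Halo has a 256-bit memory bus; the same width for Gorgon Halo is an assumption, since AMD has not published its bandwidth. The tokens-per-second bound further assumes a dense model whose weights are all read once per generated token, and `MODEL_SIZE_GB` is a hypothetical value chosen to fit in a 3090's 24 GB.

```python
def peak_bandwidth_gbs(data_rate_mts: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s: transfers per second * bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return data_rate_mts * 1e6 * bytes_per_transfer / 1e9


def max_tokens_per_second(bandwidth_gbs: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    return bandwidth_gbs / model_size_gb


BUS_WIDTH_BITS = 256   # Strix Halo's bus width; ASSUMED unchanged for Gorgon Halo
MODEL_SIZE_GB = 20.0   # hypothetical model size, small enough for a 3090's 24 GB

systems = {
    "Strix Halo": peak_bandwidth_gbs(8000, BUS_WIDTH_BITS),
    "Gorgon Halo (assumed bus)": peak_bandwidth_gbs(8533, BUS_WIDTH_BITS),
    "RTX 3090": 936.0,  # figure quoted in the comment above
}

for name, bandwidth in systems.items():
    bound = max_tokens_per_second(bandwidth, MODEL_SIZE_GB)
    print(f"{name}: {bandwidth:.0f} GB/s, at most {bound:.1f} tok/s")
```

On paper that puts Strix Halo at 256 GB/s, so the 3090 has roughly 3.7x the bandwidth. These are ceilings, not benchmarks, and the real-world gap can be larger or smaller.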

letsgoiowa@reddit

Honestly, the only really good way up is HBM, and that is going to be hella expensive. Maybe an HBM caching solution, like their famous "Infinity Cache"? Chip design takes years, so it's unlikely, but they've probably thought about it.

Terminator857@reddit (OP)

HBM is expensive because it is new. Once it is old tech, it can be cheaper than dram, because pins needed for dram on a chip are very expensive. In other words, there are only so many pins that can fit on a cpu and each dram channel requires 288 pins.

fallingdowndizzyvr@reddit

Or you can just double the number of memory channels, as Medusa is reported to do. Remember, Apple uses LPDDR too and gets 2-4 times more bandwidth.

geldonyetich@reddit

On the one hand 6.7% doesn't seem like much of a boost.

But the Strix Halo line it's refreshing is rather cool in that the 8060S GPU built into the APU lands somewhere between the mobile and desktop versions of an RTX 4060.

Getting the performance of a discrete graphics card out of an APU is a pretty neat trick, if you ask me. People shouldn't sleep on this; it's going to have all sorts of applications for mini-PCs, laptops, and things like Steam Decks.

Of course, from an AI standpoint the benefit of an APU is its unified memory, which makes it a great little budget AI workstation that, in some applications, gets a lot closer to a DGX Spark's performance than it has any right to.

Anyway, I already have a Ryzen AI Max+ 395, so I kinda don't need Gorgon Halo. Medusa Halo, on the other hand, sounds like it might be ~50-80% faster.
