5090 OpenCL & Vulkan leaks
Posted by segmond@reddit | LocalLLaMA | 16 comments
Ack, not crushing 4090.
https://videocardz.com/newz/nvidia-geforce-rtx-5090-appears-in-first-geekbench-opencl-vulkan-leaks
a_beautiful_rhind@reddit
Nothing's gonna happen until FP4 takes off like FP8 did, and neither of those is really an "LLM" thing.
No_Afternoon_4260@reddit
Maybe not with llama.cpp, but something like ONNX might use it? Idk
a_beautiful_rhind@reddit
New kernels will use it, probably in image models first. Could be a requirement for SuperHappyFunTime Attention that gives you 4x context.
I see people didn't like this comment even though it's true. I couldn't give a fuck about FP8 until it was needed on image models, and could happily keep using 3090s for LLM tasks with a rather negligible speed penalty.
No_Afternoon_4260@reddit
I think this kind of acceleration is used in the LLM space when running big batch sizes; otherwise you really aren't compute-constrained anyway.
a_beautiful_rhind@reddit
Unless you use FP8 context or weights you don't really use them. It also helps prompt processing in that case. Sure the extra compute helps batches but without the precision optimizations it's not so massive. Even nvidia was not showing that in marketing materials.
Hence little reason to get a 4090 over a 3090 here, etc. Let alone shell out for a 5090 and its 8 extra gigs of RAM.
On video/image models it hurts not being able to compile FP8 (and future FP4) weights in torch and getting 1/3 to 2x the speedup from that.
fallingdowndizzyvr@reddit
26-37% more performance with 33% more VRAM for a 25% price increase. IMO, that's pretty crushing it.
LunarianCultist@reddit
For a 33% power increase... Depending on how prices go, buying two 4090s or hell... three to four 3090s might be better!
fallingdowndizzyvr@reddit
As prices go right now, you aren't buying 2x4090s for the price of a 5090. You are buying 1.25.
The force that pushed 4090s well above MSRP is no longer a factor. That being China. Which was hoovering up as many 4090s as it could before the ban. Since the ban went into effect, 4090s at MSRP have been the norm. It should be no different for the 5090.
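The percentages traded in this exchange can be turned into relative value ratios. A minimal sketch, assuming only the figures quoted in the comments above (26-37% more performance, 33% more VRAM, 25% higher price, 33% more power) and treating the 4090 as the 1.0 baseline; the variable and function names are made up for illustration, and the inputs come from leaked benchmarks rather than measurements:

```python
# Relative figures quoted in the thread, with the 4090 as the 1.0 baseline.
# These are the commenters' numbers from leaked benchmarks, not measurements.
PERF_GAIN_RANGE = (0.26, 0.37)  # low and high end of the quoted performance uplift
VRAM_GAIN = 0.33
PRICE_GAIN = 0.25
POWER_GAIN = 0.33


def relative_ratio(gain: float, cost_gain: float) -> float:
    """Return (1 + gain) / (1 + cost_gain): value per unit of cost vs. the baseline."""
    return (1 + gain) / (1 + cost_gain)


# Values above 1.0 mean the 5090 comes out ahead of the 4090 on that metric.
perf_per_dollar = tuple(relative_ratio(g, PRICE_GAIN) for g in PERF_GAIN_RANGE)
perf_per_watt = tuple(relative_ratio(g, POWER_GAIN) for g in PERF_GAIN_RANGE)
vram_per_dollar = relative_ratio(VRAM_GAIN, PRICE_GAIN)

print(f"perf per dollar: {perf_per_dollar[0]:.3f} to {perf_per_dollar[1]:.3f}")
print(f"perf per watt:   {perf_per_watt[0]:.3f} to {perf_per_watt[1]:.3f}")
print(f"VRAM per dollar: {vram_per_dollar:.3f}")
```

Under these assumptions, performance per dollar lands between roughly 1% and 10% better than the 4090, while performance per watt ranges from about 5% worse to 3% better, which is why both the price argument and the power argument in this exchange can be made from the same leak.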
kmouratidis@reddit
In the 3 European countries where I've lived and looked at prices, they were typically ~2500€ or more. Only used ones go close to MSRP, which is insane.
Then again there are people selling 3090s used for mining at ~~5090 prices~~ scam prices.
TheRealGentlefox@reddit
Isn't OpenCL the more relevant benchmark here? That would put it at closer to 10%
fallingdowndizzyvr@reddit
No, since OpenCL is pretty much obsolete. The standards group that governs OpenCL is pushing SYCL as its replacement.
OpenCL in llama.cpp has been deprecated and they say to use the Vulkan backend instead.
TheRealGentlefox@reddit
Interesting, thanks.
joninco@reddit
People gonna be real disappointed when it's unobtainable for less than 5k, though still competitive pricing with the 48GB models.
ThenExtension9196@reddit
Yup. As sad as it is, I’m pretty sure I’m going to be paying some scalper.
ThenExtension9196@reddit
Absolutely crushing the 4090. This kind of time savings is literally cash money.
Pro-editor-1105@reddit
"4090 performance for a third of the price"