First Intel B580 inference speed test
Posted by ComprehensiveQuail77@reddit | LocalLLaMA | View on Reddit | 20 comments
At my request, someone agreed to test his B580, and this is the result:
MrTubby1@reddit
This is for stable diffusion. It won't accurately reflect LLM performance.
segmond@reddit
yup, looks like the RTX 3060 will still be a better buy.
fallingdowndizzyvr@reddit
As discussed in this thread, the 3060 is about the same speed. But considering that the 3060 is, well, Nvidia, and can also run video gen that the B580 can't, it's the better buy.
https://www.reddit.com/r/LocalLLaMA/comments/1hf98oy/someone_posted_some_numbers_for_llm_on_the_intel/
Monkeylashes@reddit
Nvidia is the winner on the gaming front as well, due to a ton of features like ray tracing, DLSS...
fallingdowndizzyvr@reddit
No, it's not. Look at the benchmarks. The B580 demolishes the 3060 and even soundly beats the 4060, the Nvidia cards that are its competitors. Even when you factor in ray tracing and XeSS.
Monkeylashes@reddit
You're right on raw performance, but my point stands. Nvidia has a lot of tricks in their software to push the cards over the edge; they don't just rely on raw performance. And if you're including the 40-series cards in your comparison, then there is frame generation too, which almost doubles the perceived performance. It isn't as simple a comparison as you're making it out to be.
Cyber-exe@reddit
The 4060 is suffocating on low VRAM, and the 3060, even at 12 GB, has earlier-gen tensor cores that barely do half of what the 4060 does. The 40 series lacks the AI TOPS for the DLSS 4 that the 50 series is getting, so there are no new tricks in the pipeline for the 3060. Nvidia could maybe put DLSS 4 on the 3090 and the 4070 Ti Super and above, but the lower GPUs within those generations will be behind the entire RTX 50 lineup in AI TOPS.
fallingdowndizzyvr@reddit
And Intel has those tricks too. XeSS is no slouch.
fallingdowndizzyvr@reddit
As discussed in the thread from a month ago, the 3060 is about the same speed. But considering that the 3060 is, well, Nvidia, and can also run video gen that the B580 can't, it's the better buy.
Here are the numbers for the 3060. Compare them to the B580 numbers I posted in my other response.
twnznz@reddit
It would be good to know which optimisations are in use. E.g., Flash-Attention.
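For reference, a minimal sketch of how one might A/B that particular optimisation with the llama-cpp-python bindings. The model path is a placeholder, and the `flash_attn` flag (llama.cpp's `-fa`) only makes a difference on backends that actually implement it:

```python
# Sketch: comparing throughput with Flash-Attention off vs. on,
# via the llama-cpp-python bindings. The model path is hypothetical.
import time
from llama_cpp import Llama

for fa in (False, True):
    llm = Llama(
        model_path="./model.gguf",  # placeholder: any local GGUF file
        n_gpu_layers=-1,            # offload all layers to the GPU
        flash_attn=fa,              # maps to llama.cpp's -fa flag
        verbose=False,
    )
    start = time.perf_counter()
    out = llm("Explain KV caching in one paragraph.", max_tokens=256)
    elapsed = time.perf_counter() - start
    n_tokens = out["usage"]["completion_tokens"]
    print(f"flash_attn={fa}: {n_tokens / elapsed:.1f} tok/s")
```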
fallingdowndizzyvr@reddit
Here are the numbers from someone who ran llama.cpp on their B580 a month ago.
pyr0kid@reddit
You realize Stable Diffusion literally isn't the same software?
getmevodka@reddit
so my two 3090s are still good 😬👍
fallingdowndizzyvr@reddit
Hardly the first. I posted a thread a month ago with numbers that someone got with llama.cpp on their B580.
"
"
cchung261@reddit
That seems a little slow.
CystralSkye@reddit
How were the AMD cards tested here? They look disproportionately slow; is it ROCm on Linux?
Finguili@reddit
Most likely Windows. I'm getting 8.66/min on a 6700 XT on Linux, which is still rather slow compared to Nvidia, but over 2x faster than what is listed here.
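For anyone who wants to verify which backend their install is actually using, a quick check of the PyTorch build works (a sketch; run it inside the same Python environment as the UI):

```python
# Sketch: identifying the PyTorch backend behind an A1111/ComfyUI install.
import torch

print(torch.__version__)          # ROCm wheels carry a "+rocmX.Y" suffix
print(torch.version.hip)          # HIP version string on ROCm builds, None otherwise
print(torch.version.cuda)         # CUDA version string on CUDA builds, None otherwise
print(torch.cuda.is_available())  # True on both CUDA and ROCm if a GPU is usable
# DirectML is a separate package (torch-directml) and won't show up in any of the above.
```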
CystralSkye@reddit
It would also very much depend on the model used and the parameters, wouldn't it?
I have a 6700 XT and a 4070, and the difference in ComfyUI is more like 2x, not the 8x this chart suggests.
Finguili@reddit
Yes, I used the SD 1.5 model at 512x512 resolution with 50 steps per image, but had to assume Euler as the sampler. The article does say they used the DirectML fork of A1111, though, so it's definitely not ROCm on Linux.
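For comparison on your own hardware, here is a rough sketch of that benchmark with diffusers; the checkpoint ID is an assumption, so substitute whatever SD 1.5 copy you have locally:

```python
# Sketch of the benchmark described above: SD 1.5, 512x512, 50 Euler steps.
import time
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint ID; substitute your own
    torch_dtype=torch.float16,
)
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")  # "cuda" also addresses ROCm builds of PyTorch

steps = 50
start = time.perf_counter()
pipe("a photo of an astronaut", width=512, height=512, num_inference_steps=steps)
elapsed = time.perf_counter() - start
print(f"{steps / elapsed:.2f} it/s, {60.0 / elapsed:.2f} images/min")
```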
CystralSkye@reddit
Yeah, I've seen this quite often.
Ollama with ROCm on Windows, for example, does quite well on the 6700 XT: 80 tok/s on Phi-3.
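If anyone wants to reproduce that kind of number, Ollama's HTTP API reports token counts and timings directly. A small sketch, assuming a local server on the default port with phi3 already pulled:

```python
# Sketch: measuring decode speed from Ollama's HTTP API.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "phi3", "prompt": "Why is the sky blue?", "stream": False},
).json()

# eval_count is the number of generated tokens; eval_duration is in nanoseconds.
tok_per_s = resp["eval_count"] / (resp["eval_duration"] / 1e9)
print(f"{tok_per_s:.1f} tok/s")
```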