Intel Arc Pro B70 Benchmarks With LLM / AI, OpenCL, OpenGL & Vulkan Review
Posted by Balance-@reddit | LocalLLaMA | View on Reddit | 11 comments
Review from Phoronix.
Introduction: Last month Intel announced the Arc Pro B70, the long-awaited Battlemage G31 graphics card with 32 Xe cores and 32GB of GDDR6 video memory. This new top-end Battlemage card offers a lot of potential for LLM/AI and other use cases, especially when running multiple Arc Pro B70s. Last week Intel sent over four Arc Pro B70 graphics cards for Linux testing at Phoronix. Given the current re-testing for the imminent Ubuntu 26.04 release, I am still working through all of the benchmarks, especially the multi-GPU scenarios. This article presents some initial single-card Arc Pro B70 benchmarks on Linux compared to other Intel Arc Graphics hardware across AI/LLM with OpenVINO and Llama.cpp, OpenCL compute benchmarks, and also some OpenGL and Vulkan benchmarks. More benchmarks and competitive comparisons will come as that fresh testing wraps up, but so far the Arc Pro B70 is working out rather well atop the fully open-source Linux graphics driver stack.
Results:
- Across all of the AI/LLM, SYCL, OpenCL, and other GPU compute benchmarks the Arc Pro B70 was around 1.32x the performance of the Arc B580 graphics card.
- With the various OpenGL and Vulkan graphics benchmarks carried out the Arc Pro B70 was around 1.38x the performance of the Arc B580.
- As noted, there are no GPU power consumption numbers since the Intel Xe kernel driver on Linux 7.0 does not yet expose the real-time power sensor data.
The whole article with all of the benchmarks is worth a look.
ea_man@reddit
How long until someone who has some clue gets this card and runs a couple of LLMs to see how it does?
* QWEN3.5 27B
* gemma-4-31B
sniperwhg@reddit
https://www.reddit.com/r/LocalLLaMA/comments/1sbt1em/b70_quick_and_early_benchmarks_backend_comparison/
Both of those models were tested in this thread, in case you were curious.
ea_man@reddit
Yeah, nice to see that SYCL does better than Vulkan; at least Intel did some work on that.
Thanks!
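For anyone wanting to reproduce the SYCL-vs-Vulkan comparison on their own Arc card, a rough sketch of building llama.cpp with each backend follows. This assumes the Intel oneAPI toolkit is installed at its default path for the SYCL build; flags reflect recent llama.cpp releases and may change, so check the project's backend docs.

```shell
# SYCL build (needs the Intel oneAPI Base Toolkit for icx/icpx):
source /opt/intel/oneapi/setvars.sh
cmake -B build-sycl -DGGML_SYCL=ON \
      -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx \
      -DCMAKE_BUILD_TYPE=Release
cmake --build build-sycl --config Release -j

# Vulkan build (needs the Vulkan SDK / headers):
cmake -B build-vulkan -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build-vulkan --config Release -j
```

Running `llama-bench` from each build directory against the same GGUF file then gives a direct backend comparison on identical hardware.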
mr_zerolith@reddit
Too bad they didn't test them with anything but tiny or super-fast models.
This seems like less than an Nvidia 4070 worth of power, which is weaker than I expected.
It looks like if you actually use the VRAM, you'll be punished severely.
Woof9000@reddit
Why does this new breed of llama.cpp testers benchmark hardware on all kinds of models except the "standard" we all used for years as an actual reference?
FYI, it's: Llama 7B Q4_0
def_not_jose@reddit
...not buying an MI50 for $200 half a year ago was really dumb of me, huh
sleepingsysadmin@reddit
90% of those benchmarks don't say anything useful.
44 TPS out of gpt20b seems impossibly low.
An AMD 9060 XT has around 300 GB/s of memory bandwidth and it'll do 60-70 TPS for gpt20b.
This B70 should be over 100 TPS.
That suggests something is wrong with their testing.
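The sanity check being applied here can be made explicit: single-batch LLM decode is roughly memory-bandwidth bound, so tokens per second is capped near bandwidth divided by the bytes read per generated token. A minimal sketch, backing out the effective bytes-per-token from the commenter's 9060 XT figures and then scaling; the B70 bandwidth value below is a placeholder assumption, not a confirmed spec.

```python
# Back-of-envelope: for bandwidth-bound decode,
#   tokens/s ceiling ~= memory bandwidth / bytes touched per token.

def tps_upper_bound(mem_bw_gbs: float, bytes_per_token_gb: float) -> float:
    """Rough ceiling on decode tokens/s for a bandwidth-bound workload."""
    return mem_bw_gbs / bytes_per_token_gb

# Back out effective bytes/token from the commenter's 9060 XT numbers:
# ~300 GB/s yielding roughly 65 TPS on the same model.
bytes_per_token = 300 / 65  # ~4.6 GB touched per generated token

# Hypothetical B70 bandwidth (assumed value for illustration only):
b70_bw = 456  # GB/s
print(round(tps_upper_bound(b70_bw, bytes_per_token)))  # -> 99
```

With any bandwidth figure meaningfully above 300 GB/s, the same arithmetic lands well above the reported 44 TPS, which is the basis for suspecting the test setup.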
spaceman_@reddit
Yes, their GLM 4.7 Flash prompt processing (325 t/s) is also leagues behind the R9700, which should have similar performance (3139.52 t/s), assuming they tested at 0 context.
Dry_Yam_4597@reddit
Nothing useful - just Xx figures compared to other Intel models. But people own completely different brands. How is this article relevant to me if I only own AMD or NVIDIA GPUs?
spaceman_@reddit
I am currently running the only relevant model in the list (GLM 4.7 Flash) on my hardware to compare. Sadly, Phoronix only ran the benchmarks at 0 context, and GLM 4.7 Flash performance TANKS as context grows.
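Benchmarking at several context depths rather than only at 0 context can be done from the llama.cpp CLI. A sketch, assuming a local GGUF file (the filename here is hypothetical); newer llama-bench builds accept a depth option that prefills that much context before measuring, but verify the flag with `./llama-bench --help` on your build.

```shell
# Measure prompt processing (-p) and generation (-n) at 0, 4k, and 16k
# tokens of pre-existing context (-d) to see how performance degrades:
./llama-bench -m glm-4.7-flash.gguf -p 512 -n 128 -d 0,4096,16384
```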
WizardlyBump17@reddit
Could you run llama.cpp again, but with SYCL this time?
You said the 7.0 kernel doesn't expose power stuff. Did they change it from 6.19? On 6.19 I can get the power usage via hwmon.
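For anyone wanting to check what their own kernel exposes, a minimal sketch that scans the hwmon sysfs tree for instantaneous power sensors follows. Per the hwmon sysfs ABI, `power*_input` files report microwatts; which sensors exist (if any) varies by driver and kernel version, and some drivers expose only energy counters instead, so an empty result is a valid outcome.

```python
import glob
import os

def read_hwmon_power():
    """Scan /sys/class/hwmon for power sensors.

    Returns {"<driver name>:<file>": watts}. power*_input is in
    microwatts per the hwmon sysfs ABI; drivers that expose only
    energy*_input (microjoules) won't appear here.
    """
    readings = {}
    for hwmon in glob.glob("/sys/class/hwmon/hwmon*"):
        try:
            with open(os.path.join(hwmon, "name")) as f:
                name = f.read().strip()
        except OSError:
            continue
        for pf in glob.glob(os.path.join(hwmon, "power*_input")):
            try:
                with open(pf) as f:
                    microwatts = int(f.read())
            except (OSError, ValueError):
                continue
            readings[f"{name}:{os.path.basename(pf)}"] = microwatts / 1e6
    return readings

if __name__ == "__main__":
    print(read_hwmon_power())
```

On a machine where the GPU driver registers a power sensor, the dict will contain an entry like `"xe:power1_input"` mapped to watts; on kernels that don't expose it, the dict is simply missing that key.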