Is 2× Intel Arc Pro B70 worth it for local agentic LLMs, or should I stay with NVIDIA?

Posted by Zuck7980@reddit | buildapc

I’m planning a home workstation that I can access remotely from an iPad through Jupyter/SSH/Tailscale. My goal is to run local agentic workflows using Hermes Agent/OpenWebUI/Ollama/vLLM-style tooling, mostly to avoid relying on cloud models.

The idea I’m considering:

- 2× Intel Arc Pro B70 (32GB each) installed internally
- RTX 5070 externally through a Sonnet eGPU box for gaming
- Windows as my main OS, possibly Linux dual boot for AI workloads
- 128GB+ system RAM

My concern is software ecosystem maturity. I know the B70’s hardware and VRAM are attractive, but most local LLM serving and agent frameworks seem more mature on NVIDIA/CUDA. I’m not sure whether multi-GPU serving on Intel dGPUs with vLLM/OpenVINO/OVMS is practical enough for daily use.

Questions:

  1. Would you buy 2× B70 for local LLM/agent work today?
  2. Is Intel Arc multi-GPU serving mature enough, or is this still experimental?
  3. Would I be better off with a single NVIDIA GPU with less VRAM but better software support?
  4. Does anyone here actually run local agents on multiple Intel Arc dGPUs?
  5. Should I wait for RTX 5080 Super / higher-VRAM NVIDIA options instead?
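
For anyone weighing question 3, here is a rough sketch of the VRAM side of the trade-off. It only estimates the memory for model weights at a given quantization and checks that against a VRAM budget. The 20% headroom for KV cache and runtime overhead, the 70B example model, and the 16 GiB comparison card are all assumptions for illustration, not measurements, and the helper names are made up. It also treats 2× 32GB as one 64 GiB pool, which only holds if the serving stack can actually split a model across both cards.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_fraction: float = 0.2) -> bool:
    """True if the weights fit after reserving headroom.

    overhead_fraction is an assumed reserve for KV cache, activations,
    and runtime buffers; real usage depends on context length and backend.
    """
    usable = vram_gib * (1 - overhead_fraction)
    return weight_gib(params_billion, bits_per_weight) <= usable


if __name__ == "__main__":
    # Hypothetical comparison: pooled 2x 32 GiB vs. a single 16 GiB card.
    for label, vram in [("2x 32 GiB (pooled)", 64), ("1x 16 GiB", 16)]:
        for params, bits in [(8, 4), (70, 4)]:
            need = weight_gib(params, bits)
            verdict = "fits" if fits(params, bits, vram) else "does not fit"
            print(f"{label}: {params}B @ {bits}-bit needs ~{need:.1f} GiB, {verdict}")
```

By this estimate a 4-bit 70B model needs about 32.6 GiB for weights alone, so it would fit in the pooled 64 GiB but not on a 16 GiB card. That is exactly why the multi-GPU software question matters more to me than the raw capacity.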