Anyone got Gemma 4 26B-A4B running on vLLM?

Posted by toughcentaur9018@reddit | LocalLLaMA | View on Reddit | 8 comments

If yes, which quantized model are you using, and what's your `vllm serve` command?

I’ve been struggling to get that model up and running on my DGX Spark (GB10). I tried the Intel int4 quant of the 31B and it seems to work well, but it's way too slow.
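For reference, here's the general shape of a `vllm serve` command for a quantized model. This is only a sketch, not a working config for this specific model: the model path is a placeholder, and the flag values (`--max-model-len`, `--gpu-memory-utilization`) are generic starting points you'd tune for the GB10's unified memory.

```shell
# Hypothetical example; substitute the actual quantized checkpoint you downloaded.
vllm serve <org>/<quantized-model-name> \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.90 \
  --port 8000
```

vLLM usually auto-detects the quantization method from the checkpoint's config, so an explicit `--quantization` flag is often unnecessary, but whether a given quant format is supported on this hardware is exactly the open question here.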

Anyone have any luck with the 26B?