HP ZGX Nano G1n (DGX Spark)
Posted by contactkv@reddit | LocalLLaMA | 27 comments
If anyone is interested, HP's version of the DGX Spark can be bought at a 5% discount using coupon code HPSMB524.
KooperGuy@reddit
Dogshit purchase
valentt@reddit
what is a better purchase and why is this shit?
KooperGuy@reddit
No thanks
Kubas_inko@reddit
You can get an AMD Strix Halo for less than half the price, or a Mac Studio with 3x faster memory for 300 USD less.
aceofspades173@reddit
The Strix doesn't come with a built-in $2000 network switch. As a single unit, sure, the Strix or the Mac might make more sense for inference, but these things really shine when you have 2, 4, 8, etc. in parallel, and they scale incredibly well.
colin_colout@reddit
Ohhh, and enjoy using transformers, vLLM, or anything that requires CUDA. I love my Strix Halo, but llama.cpp is the only software I can use for inference.
The world still runs on CUDA, unfortunately. The HP Spark is a great deal if you're not just counting tokens and you value compatibility with Nvidia libraries.
If you just want to run llama.cpp or ollama inference, look elsewhere though.
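For what it's worth, a minimal sketch of that llama.cpp route on a Strix Halo box via the llama-cpp-python bindings; the build flag and GGUF path are assumptions for illustration, not something from this thread:

```python
# Minimal sketch, assuming llama-cpp-python was built with the Vulkan backend
# (e.g. compiled with CMAKE_ARGS="-DGGML_VULKAN=on"); the model path is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/qwen2.5-7b-instruct-q4_k_m.gguf",  # illustrative local file
    n_gpu_layers=-1,  # offload all layers to the iGPU
    n_ctx=8192,
)
out = llm("Q: What does unified memory buy you on Strix Halo? A:", max_tokens=64)
print(out["choices"][0]["text"])
```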
Kubas_inko@reddit
You can run vLLM with Vulkan on Strix.
colin_colout@reddit
Ok... can you help me understand how? vLLM mainline has no Vulkan support.
I'm pulling my hair out here... I've heard others on Reddit say vLLM supports Vulkan, but I can't find that anywhere.
Maybe you're confusing it with the ROCm or HIP implementation, or maybe with llama.cpp, which has a Vulkan backend?
...but the good news is vLLM on ROCm supports sooo many models now (gpt-oss and Qwen3-Next!).
Months ago it was nearly useless unless you like Llama 2, so I'll walk back _some_ of my compatibility concerns (it's still a huge issue, but at least support is trending in the right direction).
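For context, this is roughly what that looks like through vLLM's offline Python API, assuming a ROCm build of vLLM is installed on the Strix box; the model name is just an illustrative choice:

```python
# Minimal sketch, assuming a ROCm build of vLLM on a Strix Halo machine.
# Model name is illustrative; any model supported by that build works.
from vllm import LLM, SamplingParams

llm = LLM(model="openai/gpt-oss-20b")
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain why prompt processing speed matters."], params)
print(outputs[0].outputs[0].text)
```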
colin_colout@reddit
Thanks! Just learned this (gonna try it out).
Last I tried, I think I was using ROCm directly and no modern models were supported.
bobaburger@reddit
Depends on what OP is gonna use the box for; if it's anything that needs CUDA, that's what the price is for.
anyway, OP, merry xmas!
The pricing is not much different from the Spark; is a $200 discount worth it though? :D
Kubas_inko@reddit
They are posting this on LocalLLaMA, so I don't expect that.
stoppableDissolution@reddit
People on LocalLLaMA also train their own models, which is slow but doable on the Spark and virtually impossible on Strix, for example.
Kubas_inko@reddit
Why is it impossible on Strix? Are all training frameworks CUDA-only?
stoppableDissolution@reddit
Pretty much, yes. You can train on CPU, but it's going to take a few eternities.
bobaburger@reddit
Aside from local LLMs, r/localllama is actually a place where ML/DL enthusiasts without a PhD gather to talk about ML/DL stuff as well.
MontageKapalua6302@reddit
Can the AMD stans ever stop themselves from chiming in stupidly?
waiting_for_zban@reddit
I think the DGX Sparks are rusting on the shelves. I know of very few professional companies using them (I live near an EU startup zone); many bought one to try following the launch hype and ended up shelving it somewhere. It's nowhere near as practical as Nvidia claims it to be. Devs who need to work on CUDA already have access to cloud CUDA machines, and locally, for inference or training, it doesn't make sense for the kinds of tasks many require. Like for edge computing, there is zero reason to get this over the Thor.
So I am not surprised to see prices fall, and they will keep falling.
Aggravating_Disk_280@reddit
It's a pain in the ass with an ARM CPU and a CUDA GPU, because some packages don't have the right builds for the platform, and all the drivers work inside a container.
aceofspades173@reddit
Have you actually worked with these before? Nvidia packages and maintains repositories to get vLLM inference up and running with just a few commands.
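Once a vLLM server is running (however it was started), the usual way to talk to it is its OpenAI-compatible endpoint. A minimal sketch, assuming the server is listening on the default port 8000; the model name is illustrative:

```python
# Minimal sketch: querying a running vLLM server through its OpenAI-compatible API.
# Assumes the server is already up on localhost:8000; model name is illustrative.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": "Hello from the Spark"}],
        "max_tokens": 64,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```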
Aggravating_Disk_280@reddit
Yes, I got one from my employer. It's okay if you just want to spin up some (v)LLMs, but if you want to do some training and need some older packages, it's a nightmare. Often they only have the Mac ARM build.
Miserable-Dare5090@reddit
Dude, the workbooks suck and are outdated. The containers referenced are 3 versions behind their OWN vLLM container. It's ngreedia at its best. Again, check the forums.
It has better PP (prompt processing) than the Strix or Mac. I can confirm; I have all 3. GLM-4.5 Air slows to a crawl on the Mac after 45,000 tokens (PP at 8 tk/s!!) but stays around 200 tk/s on the Spark.
KvAk_AKPlaysYT@reddit
Why not halo? Just curious.
aceofspades173@reddit
Made a similar comment above, but these have a ~$2000 ConnectX-7 card built in, which makes them scale really well as you add more. Comparing one of these vs. one Strix Halo doesn't make a whole lot of sense for inference. There aren't a ton of software and hardware options to scale Strix Halo machines together, whereas the Spark can network at almost 375 GB/s semi-easily between each of them, which is just mind-boggling if you compare speeds between PCIe links for GPUs in a consumer setup.
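To make the scaling claim concrete, a rough sketch of the software side of running one model across two networked Sparks with vLLM, assuming a Ray cluster already spans both boxes; the model name and parallelism settings are assumptions, not something confirmed in this thread:

```python
# Rough sketch, assuming two Sparks already joined into one Ray cluster over the
# ConnectX-7 link. tensor_parallel_size=2 splits the model across both GPUs.
from vllm import LLM, SamplingParams

llm = LLM(
    model="zai-org/GLM-4.5-Air",          # illustrative model choice
    tensor_parallel_size=2,               # one GPU per Spark in this example
    distributed_executor_backend="ray",   # multi-node execution backend
)
print(llm.generate(["ping"], SamplingParams(max_tokens=16))[0].outputs[0].text)
```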
Sufficient_Prune3897@reddit
Lol. If you have the money for multiple, why not just RTX 6000s?
KooperGuy@reddit
$2000 LOL
Miserable-Dare5090@reddit
I have one. Check the Nvidia forums... the connection between them sucks, not currently going above 100G, and it's a pain to set up. They promised "pooled memory" but that's BS. It won't do RDMA.
fallingdowndizzyvr@reddit
The Asus one is $3K for the 1TB SSD model.