NVIDIA RTX Spark — Slim Laptops & Small Desktops
Posted by zxyzyxz@reddit | LocalLLaMA | View on Reddit | 54 comments
SuddenRadio6221@reddit
If they can deliver this with native linux (no x86 emulation) for under $4K, I'd be pretty interested.
Ornery_Hall@reddit
DDR6 will come out in 1-2 years, and it is designed to handle higher PCIe bandwidth between RAM, CPU and GPU. I'd rather wait for it than get this crippled ARM Spark processor that can't do anything besides run basic models. Plus, judging by their track record, ARM and Microsoft partnerships never work out.
hurdurdur7@reddit
So their massive announcement was just a laptop design for a worse version of Windows, with the same crappy memory bandwidth, in a form factor that gets too hot if you use it for the slow AI experience.
I don't mind ARM CPUs. But this thing will flop.
Weird-Field6128@reddit
It is indeed a flop, you can't even run Linux on that. Idk what the hell they are thinking
z_latent@reddit
I highly doubt they'd lock you to Windows, be this a Microsoft partnership or not. Feel free to provide confirmation otherwise if you have it.
Also, the fact it's ARM won't stop you either, in the worst case you can use FEX like you would on a Steam Deck and run x86 programs.
Weird-Field6128@reddit
Maybe you are right. I am not that tech savvy; people will find a way to do it eventually.
z_latent@reddit
It's OK, if anything this would be good news. Again, I also am not sure of the details of their partnership, but I doubt they would/could stop it.
You can have Linux on a MacBook, good luck to Nvidia if they actually wanted to lock people into Windows!
sittingmongoose@reddit
Well apparently it was supposed to launch 2 years ago but software issues and small hardware bugs delayed it. So I imagine the price they wanted to launch at 2 years ago is dramatically different than the price they can launch at now.
It will probably start at $3500 for the highest SoC, and they were probably expecting it to be $2000 back when it was supposed to launch.
Mountain_Patience231@reddit
I suspect those so-called 100 fps gaming numbers also included some AI slop frames.
Ok_Warning2146@reddit
From the news report, it seems like it is just a DGX Spark with windows.
RestInProcess@reddit
I wonder how this would work because the DGX Spark is a very hot running system.
g_rich@reddit
Seems more like a DGX Spark with no ConnectX-7 NIC and a lower TDP that can run Windows and likely at the very least DGX OS.
Guinness@reddit
The 128GB ram limit kills it for me. I want 768GB of unified RAM at 1024GB/sec capable of at least running GLM/Kimi.
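For a rough sense of why 128GB is limiting for the biggest open models, here is a back-of-the-envelope sketch. It only counts weights (no KV cache, no runtime overhead), and the 1000B parameter count is a hypothetical round number for illustration, not the spec of any particular GLM or Kimi release.

```python
def weights_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in decimal GB (weights only)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_memory(params_billion: float, bits_per_weight: float, memory_gb: float) -> bool:
    """True if the weights alone fit; real usage needs extra headroom."""
    return weights_size_gb(params_billion, bits_per_weight) <= memory_gb


# Hypothetical 1000B-parameter model quantized to 4 bits per weight.
size_gb = weights_size_gb(1000, 4)
print(f"weights: {size_gb:.0f} GB")
print("fits in 128 GB:", fits_in_memory(1000, 4, 128))
print("fits in 768 GB:", fits_in_memory(1000, 4, 768))
```

Real quantization formats usually spend a bit more than their nominal bits per weight, so treat the result as a lower bound on memory needed.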
Fairchild110@reddit
GB300 is the answer to your wants.
Limp_Classroom_2645@reddit
ugh 😒
Django_McFly@reddit
It's not like the OS that you want to run costs money to install. $0 and like 30 minutes and the solution is perma fixed. Hell, install pi and ask pi to do it. $0 and no time lost from your life.
Limp_Classroom_2645@reddit
nostriluu@reddit
These would actually be interesting if they had proper eGPU support.
Fluxx1001@reddit
So is this basically the mobile version of the DGX Spark?
Serprotease@reddit
With a 90W power limit and mention of 5070 mobile level of performance, that’s probably something that slots in between an AMD AI Max and an M5 MacBook Pro.
The interesting features of the DGX are the networking capabilities, relatively high prompt processing speeds and CUDA. This only keeps CUDA, and I’ll bet that the prompt processing speed will not compare favorably to the M5 Max.
The SFF PC version could be interesting if the price is around that of the AMD boxes.
HokageSupreme1@reddit
By the time laptops with the N1x launch, it’ll be close to when the M6 Max MacBooks launch. M6 Max will have improved neural accelerators, which will improve prefill speeds. A benchmark from Geekerwan’s channel, as seen in this image, shows that the M5 Max matches the RTX 5090 laptop in token generation, despite the M5 Max having worse memory bandwidth.
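As context for why memory bandwidth keeps coming up for token generation, here is a minimal sketch of the usual rule of thumb. It assumes decode is purely memory-bound and that every generated token streams all active weights once. The 500 GB/s figure is a placeholder, not a spec for any chip in this thread, and the function name is made up for illustration.

```python
def decode_tokens_per_sec_ceiling(
    bandwidth_gb_s: float,
    active_params_billion: float,
    bits_per_weight: float,
) -> float:
    """Upper bound on decode speed if each token reads all active weights once."""
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical inputs: a dense 27B model at 4 bits per weight, 500 GB/s memory.
ceiling = decode_tokens_per_sec_ceiling(
    bandwidth_gb_s=500, active_params_billion=27, bits_per_weight=4
)
print(f"{ceiling:.1f} tok/s ceiling")
```

This is only a ceiling. Measured numbers depend on the backend, kernels, context length and cache behavior, which is why two machines with different bandwidth can end up close in practice.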
jimogios@reddit
RTX spark is N1X: https://www.notebookcheck.net/Nvidia-N1X-officially-confirmed-to-arrive-as-the-RTX-Spark.1312010.0.html
HokageSupreme1@reddit
I know, but I believe there are higher- and lower-end versions (N1 and N1x). RTX Spark is the family of products. “N1x” is just what I call the specific chip that I am referring to.
Serprotease@reddit
I’ll be cautious when reading these “benchmarks”.
No information on the backend, context, number of samples, etc… this doesn’t really tell you a lot.
And you may want to double check the numbers here.
The M5 Pro with better numbers than the M3 Ultra is… strange. And a simple check brought up numbers around 400-440 tk/s pp for the M5 Max with the same model.
HokageSupreme1@reddit
For the M5 Pro with “better number” than M3 Ultra. That “number” is prefill, and it makes sense due to M5 Pro having neural accelerators, which also gives the M5 Max the large boost in performance it got over M4 Max in prefill. On the Mac, the model used is Qwen 3.5 27B Q4 via MLX, while the RTX 5090 laptop ran the same model and quantization in llama.cpp (non-MLX). The source is Geekerwan’s video named “Apple M5 Pro & M5 Max Review: The Most Powerful MacBooks Ever”, and in that video, they reviewed many aspects of the latest MacBooks, and how they compare to other laptops. More details are in the video. What is the source of your “simple check”? What quantization did you use, and was it an MLX model?
jcdoe@reddit
I don’t trust results without prompts and responses. It could be spitting out pure slopium and we’d be impressed at how fast it does it!
DapperCucumber@reddit
They mentioned it in the video this was from: 4-bit for both the 3.6 27b and 3.5 122b. MLX on Macs, llama.cpp for Nvidia. No mention of context, but it's likely fresh prompts going by the numbers, which roughly line up with the perf metrics for 4-bit quants of these models on MLX and CUDA.
Django_McFly@reddit
DGX Spark in razor-thin laptops will probably be mind-blowingly expensive. Most computers from major consumer players that have "AI" in the title are, generally, terrible at all the popular AI use cases. I'm not particularly happy about an even more expensive Spark, but at a minimum it's part of a trend of "AI computers" that are actually useful for AI.
false79@reddit
This is pretty much exciting news if you are the uninitiated.
Other than that, wow that was disappointing
kiwibonga@reddit
We're supposed to believe in FP4 support on consumer hardware? How many years after release will the engineers start working on drivers?
Mountain_Patience231@reddit
Isn't NVFP4 supported on all Blackwell GPUs already?
kiwibonga@reddit
Well, that's the joke. The driver that finally supports FP4 came out last week, in May 2026, for consumer GPUs that came out January 2025.
FullOf_Bad_Ideas@reddit
ok so there's support for FP4 on consumer hardware now, right?
yes apparently it landed
kiwibonga@reddit
Yes, on paper. I haven't been able to use it yet though, the bleeding edge releases of popular inference engines crash or fall back to a slower path still (for me), even though compilation now happens fine. It's likely going to get better in the coming weeks.
FullOf_Bad_Ideas@reddit
Hasn't vLLM supported NVFP4 on consumer hardware (5070, 5080, 5090) for almost half a year now?
kiwibonga@reddit
No, the advertised "CUTLASS" kernels did not compile until last week. Anyone who claimed to be using NVFP4 on those chips before wasn't correctly reading the console output.
FullOf_Bad_Ideas@reddit
ah makes sense
LORD_CMDR_INTERNET@reddit
Yes
Mountain_Patience231@reddit
I still have no idea what you are complaining about, when clearly you should be able to use FP4 on RTX Spark from day one.
Formal-Exam-8767@reddit
I thought being ARM-based limits its usability for anything outside AI. Do current Windows games support ARM?
Cane_P@reddit
It's basically no different from the Steam Deck being able to run Windows games on Linux... Emulation, but some native ports will be released too:
https://www.windowscentral.com/hardware/surface/microsoft-surface-laptop-ultra-announced-computex-2026
cobbleplox@reddit
I don't think running Windows games on Linux through Proton is comparable. That is basically just implementing the Windows API on Linux, and then the Windows exe runs basically natively on Linux.
asfsdgwe35r3asfdas23@reddit
It is almost the same concept. Proton (Wine + DXVK) translates Windows calls and DirectX calls to Linux/Vulkan. Windows on ARM does something similar: it translates x86 to ARM on the fly. You can run any x86 software on ARM; the issue is that since the two architectures are different, this translation is not very efficient, unlike Proton, which can sometimes even run games faster on Linux than on Windows. x86-to-ARM translation has, in the best-case scenario, a 20% performance degradation, and it can be much more for some software. So it is a good solution for legacy software and for allowing a transition, but you really want native ARM applications.
SkyFeistyLlama8@reddit
There are actually 2 different subsystems that translate x86-to-ARM64 and x64-to-ARM64 code in Windows. x64-to-ARM64 directly maps x64 Windows system calls to their ARM64 equivalents so performance loss is minimal. Both use code caches (XTA) that save translated code so the next running of that translated app should be a lot faster.
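To illustrate the code cache idea in the abstract, here is a toy sketch: translate a block of guest code once, then reuse the result on later runs. This is not how XTA or Prism is actually implemented; `TranslationCache` and `fake_translate` are made-up names, and the "translation" is just a stand-in.

```python
from typing import Callable, Dict


class TranslationCache:
    """Toy translated-code cache: translate each block once, reuse afterwards."""

    def __init__(self, translate: Callable[[bytes], bytes]) -> None:
        self._translate = translate
        self._cache: Dict[bytes, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, guest_block: bytes) -> bytes:
        translated = self._cache.get(guest_block)
        if translated is None:
            # First time we see this block: pay the translation cost.
            self.misses += 1
            translated = self._translate(guest_block)
            self._cache[guest_block] = translated
        else:
            # Already translated: reuse the cached result.
            self.hits += 1
        return translated


def fake_translate(block: bytes) -> bytes:
    # Stand-in for real binary translation: simply reverses the bytes.
    return block[::-1]


cache = TranslationCache(fake_translate)
for _ in range(3):
    cache.get(b"\x90\x90\xc3")
print("misses:", cache.misses, "hits:", cache.hits)
```

The point is only that the expensive step happens on the first run of a block, so repeated runs of the same app mostly hit the cache.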
I've used Windows on ARM for years now and it's good enough to run almost all enterprise software. Typical stuff like Office 365 either has native ARM binaries or runs under translation (Visio and Access). More niche apps like Azure tools and Power BI Desktop all run fine under emulation.
Funnily enough, ARM Linux runs fine under Windows' WSL using a customized kernel under Hyper-V. The only issue I've had is getting machine learning frameworks to run under Python in ARM Linux because the build toolchains are a mess. Qualcomm doesn't have the expertise to handle that; Nvidia certainly does, so I'm hoping PyTorch and other ML libraries can finally run under ARM WSL.
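If you want a quick check of what your Python environment in WSL actually looks like, something like this minimal sketch works. It only uses the standard library; `ml_env_report` is a made-up helper name, and it only tells you whether a `torch` package is installed, not whether it has working GPU support.

```python
import importlib.util
import platform


def ml_env_report() -> dict:
    """Report the CPU architecture and whether a torch package is importable."""
    machine = platform.machine().lower()
    return {
        "machine": machine,
        # Linux reports "aarch64"; macOS and Windows typically report "arm64".
        "is_arm64": machine in ("aarch64", "arm64"),
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }


print(ml_env_report())
```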
Formal-Exam-8767@reddit
So they plan to sell a device with 20% performance degradation out of the box?
asfsdgwe35r3asfdas23@reddit
Yes, same as Qualcomm did, with the hope that in the future developers will support ARM chips. Although some software, such as AI inference libraries, supports ARM already because Nvidia and AWS have been releasing ARM chips for a while.
cobbleplox@reddit
Yeah, I read your stuff, just in time compiling. I just think having that job to do on the fly and running different machine code for the actual program is a fair distinction that makes it quite a different thing, even "normie facing", so I wasn't a huge fan of the comparison.
Cane_P@reddit
That's an explanation for the normies. That's why I said "basically".
Cane_P@reddit
They use Windows' own Prism emulator. If you had read the sources that I provided, you would have known.
FullOf_Bad_Ideas@reddit
Now that's what I think could be called a real AI-enabled laptop, not like those Copilot+ laptops that can't actually do anything more than run benchmaxxed Phi Silica or tweak your camera background.
bakawolf123@reddit
So they presented a laptop chip, and the towers were actual DGX Stations. Nice "personal" computing for $100k.
brrrrreaker@reddit
These "AI PCs" are the hardware equivalent of a Trial version, except in this case you pay an insane price as well. Barely enough for the applications to start and play around, but as soon as you try to use it for real, it's too slow, too dumb, and you run to the hosted service. As long as nvidia sells datacenter chips, you'll never see an actual AI PC.
rajwanur@reddit
I think “Up to 128 GB Unified Memory” makes it a direct response to AMD Strix Halo/Medusa Point.
crossoverXYZ@reddit
specs look decent but real-world inference speed with quantized models is what matters. curious to see actual benchmarks from users