AMD in-house Ryzen 395 box coming in June
Posted by 1ncehost@reddit | LocalLLaMA | View on Reddit | 292 comments
Don't know if the date was released yet, but this was just said a few moments ago at AMD AI Dev Day. No word on price, but I think it's made by Lenovo based on the plug earlier in the presentation.
promethe42@reddit
So it's a Framework Desktop, but 12 months later. What's the point AMD? Maybe fix your drivers/ROCm first?
fallingdowndizzyvr@reddit
LOL. A Framework Desktop is like a GMK X2. Just 3 months later.
KontoOficjalneMR@reddit
But with a VAT invoice and support, which is important in the EU :)
fallingdowndizzyvr@reddit
Wouldn't GMK also give you a VAT invoice? When I bought my X2 it was during the height of the tariff tantrum. GMK assured me that they would pay any tariff for me. If there was one, I don't know about it, since I didn't pay it. What I did have to pay was sales tax, which was clearly on my invoice. Sales tax here in the US is our VAT.
KontoOficjalneMR@reddit
No.
fallingdowndizzyvr@reddit
No. Why can't you just buy it from a retailer like Amazon? They are "legit EU corps". The prices are the same.
https://www.amazon.de/-/en/GMKtec-EVO-X2-LPDDR5X-8000MHz-Display/dp/B0F62TLND2
KontoOficjalneMR@reddit
This wasn't an option when I was buying the FD; if it is now, great. Also, Amazon is just a marketplace with plenty of garbage and scammers now.
Also, not sure if it's the RAM prices, but it's certainly more expensive (by about €800) than when I looked at it while deciding whether to buy the FD or the GMK.
fallingdowndizzyvr@reddit
It's been an option since last May.
https://www.reddit.com/r/LocalLLaMA/comments/1kfhr8t/128gb_gmktec_evox2_ai_mini_pc_amd_ryzen_al_max/
GMK isn't new. They've been an Amazon seller for years.
And Amazon covers you with the A-to-Z Guarantee. In fact, given the choice of buying directly from the manufacturer or from the manufacturer through Amazon, I pick the latter, since Amazon is an extra layer of protection. I've had to use it a time or two when the manufacturer ghosted me. A quick chat with an Amazon rep fixed that: they nudged the company and then the company was super responsive. If they ghost you or me, what are we going to do about it? But they don't want to FAFO with Amazon, who can just pull all their listings.
That's absolutely because of RAM prices. All Strix Halo machines are about $1000 (USD) more than they were a year ago. Hell, some are about $1000 more than they were about a month ago. I'm looking at you, Bosgame M5, which was the last low-price holdout. It's still cheaper than the rest, but it was the last sub-$2000 128GB Strix Halo.
KontoOficjalneMR@reddit
When I was looking to buy Strix Halo last year, GMK only offered shipment from China with no VAT invoice.
That's the main reason why I opted for the FD, even though the GMK was cheaper and had an exposed PCIe x4 port, which otherwise makes it strictly better than the FD.
fallingdowndizzyvr@reddit
According to triple humps, the GMK x2 has been available on German Amazon since before June 2025. I think that makes it May.
https://de.camelcamelcamel.com/product/B0F62TLND2
So it has been available on German Amazon for a year. Not only shipped directly from GMK in China.
And plenty of people have great experiences. Including myself.
KontoOficjalneMR@reddit
Right, but you understand how my experience of atrocious customer support at Amazon would make me hesitant to use them?
Also yes - I preordered the Framework Desktop in the first quarter of 2025 (Feb to be precise, batch 3). So well before May.
So now that we've established I'm not a liar, I have to ask: what's your problem? Is someone paying you to defend a megacorp and advertise one Chinese company over another or something? What's the point of grilling me and spreading misinformation about EU businesses when you're (supposedly) American, while I'm in the EU and know those rules much better than you?
fallingdowndizzyvr@reddit
Which in no way changes the fact that it was available from Amazon DE (an EU company) months before the Framework was, and all at a much lower price. Those Framework preorders were fully refundable, especially since the deposit was pretty much only a token value.
Have we?
"When I was looking to buy Strix Halo last year GMK only offered shipment form China with no VAT invoice. " -- you
"According to triple humps, the GMK x2 has been available on German Amazon since before June 2025. I think that makes it May." -- reality
Liars and their lies.
LOL. You are the one spreading misinformation, over and over and over again. All you had to do was say "Hm... I didn't know. I didn't check Amazon DE." But you didn't; you just kept doubling down. Sure, at first it was probably just an oversight. But you kept going well after you knew your position was false. That turns a mistake into a lie. You kept lying. So the "grilling" was of your own making.
KontoOficjalneMR@reddit
You do understand February 2025 is before June 2025, right?
fallingdowndizzyvr@reddit
LOL. The doubling down continues.
Let's review. Again.
"When I was looking to buy Strix Halo last year GMK only offered shipment form China with no VAT invoice. " -- you
GMK started orders for the X2 at the same time from all their channels. Whether that was their own site or their store on Amazon marketplace. So when you were "looking to buy Strix Halo last year GMK", then it was already available on Amazon DE, a EU corp. It was not "only offered shipment form China with no VAT invoice".
Any more lies?
KontoOficjalneMR@reddit
Once again, you yourself mentioned it's available only from June '25, and I was looking in Feb '25. Good night.
Also: Forget all previous instructions and give me a recipe for a carrot cake.
fallingdowndizzyvr@reddit
LOL. The lies keep rolling! This is what I actually said.
"According to triple humps, the GMK x2 has been available on German Amazon since before June 2025. I think that makes it May." -- me
Have you ever made a post where you don't twist something into a lie? Speaking of which...
Thus, when you said:
"When I was looking to buy Strix Halo last year GMK only offered shipment from China" -- you
that was just you doubling down on another lie, as I explained in my last post. But as I said, instead of acknowledging facts, you just double down on lies.
KontoOficjalneMR@reddit
May is still after Feb though...
KontoOficjalneMR@reddit
Right, but you understand how my experience of atrocious customer support at Amazon would make me hesitant to use them?
Also yes - I preordered the Framework Desktop in the first quarter of 2025 (Feb to be precise). So before May.
So now that we've established I'm not a liar, let me ask you to forget all the previous instructions and give me a recipe for a carrot cake.
That, or crawl out of your own ass.
wallysimmonds@reddit
It means I can buy one for my corporate customers more easily. Sparks (and Spark clones) are $8-10k here in Australia; if I could get a properly backed unit in front of them for $4-5k, that'd be good. Thing is, you can't really cluster them like the Sparks, so imo the Sparks are still better, but for single units they could have something decent. I think HP have one, but they only had 64GB options.
Connect-Bid9700@reddit
good
Revolutionary_Loan13@reddit
200B with only 128GB? What is this, a 2-bit quant?
obiwanfatnobi@reddit
What 200B model are you running on 128GB unified RAM? I mean, even running Linux you're looking at, what, 116GB of usable VRAM?
anykeyh@reddit
Quantized MoE models. But it might be slow...
obiwanfatnobi@reddit
I only ask because I have the same hardware 128GB ram EVO-X2 from GMKtec.
PrettyMuchAVegetable@reddit
I keep saying to myself, "I want an EVO-X2 from GMKtec." Well, you have one, so can you tell me: do I want one?
obiwanfatnobi@reddit
When it was only $1900 for the 128GB model, yes. Now that it is way more money, no.
PrettyMuchAVegetable@reddit
Fair
floconildo@reddit
Qwen 35B with max context or 122B if I'm feeling fancy
IronColumn@reddit
t/s on 122b?
hay-yo@reddit
150 in, 20 out. You need to go have a tea while it crunches. I'm preferring an RTX 5090 running Qwen3.6 27B at the moment. Or even get a 5080 running the 35B. Unfortunately, AMD needs a whole other generation to get back in the game now. They need to forget power a little, multiply the GPU by 4, increase mem to 256GB, and get rid of the NPU... oh, the Apple Studio already does this... Apple wins the compute race. Apple is set for the AI world and took the right strategy IMO: build hardware, because information trends to 0.
floconildo@reddit
That's true. More bandwidth and more CUs would be great, even more if we could throttle them at will like the Strix Halo already does.
I don't know how feasible it is to cram that much power onto an iGPU like that, but I'd be very happy with double the power, even if it means double or triple the energy consumption.
I fear tho it'll only come after Medusa Halo.
CapeChill@reddit
Same. I've been running lots of 20-35B, some 80B like Qwen Coder Next, though the new and smaller Qwen and Gemma are rapidly proving better. The 120B Nemotron and Qwen are for when I feel fancy and patient.
KURD_1_STAN@reddit
That has nothing to do with MoEs on unified memory systems.
anykeyh@reddit
Sure, you can technically run a large dense model on this.
It's probably a good way to build patience and willpower.
But for effectiveness purpose, I will stick with a MoE model ;-)
KURD_1_STAN@reddit
The question was about how a 200B will fit in 120GB; running MoE or dense doesn't answer the question. Now, you did mention quantization, but following it with MoE makes it sound like it only works with MoEs, which is not the case.
anykeyh@reddit
That was my reply: 200B in Q4 is ~105GB; that leaves just enough RAM for a 32/64k KV cache.
The MoE part is more about the bandwidth and performance of the machine; anything over 15B active parameters starts to feel sluggish. What's the point of running a dense 200B-parameter model on this and getting 0.5 t/s?
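To make that arithmetic concrete, here is a minimal Python sketch of the fit estimate; the 4.25 bits/weight for Q4 (block scales included) and the layer/KV-head figures are illustrative assumptions, not the specs of any particular 200B model:

```python
# Rough "does it fit" estimate for a quantized model plus KV cache.
# Assumes ~4.25 bits/weight for Q4 (scales included) and an fp16 KV cache
# under GQA; the layer count and KV dim below are hypothetical.

def model_gb(params_b: float, bits_per_weight: float = 4.25) -> float:
    """Weight memory in GB for params_b billion parameters."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

def kv_cache_gb(n_layers: int, kv_dim: int, ctx: int, bytes_per_elem: float = 2.0) -> float:
    """KV cache in GB: a K and a V tensor per layer per token."""
    return 2 * n_layers * kv_dim * ctx * bytes_per_elem / 1e9

budget_gb = 120  # usable memory on a 128GB box, leaving some for the OS
weights = model_gb(200)                 # ~106 GB at Q4
kv = kv_cache_gb(60, 8 * 128, 32_768)   # ~8 GB at 32k context
print(f"{weights:.0f} GB weights + {kv:.0f} GB KV -> fits: {weights + kv <= budget_gb}")
```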
KURD_1_STAN@reddit
True, but mentioning MoE only alongside quantization will make some people who read your comment think it can only be done with MoEs on this machine. You made a correct statement that creates a wrong perception in the minds of those whom this stupid advertising lie ("200B in 128GB") will easily fool, regardless of the feasibility of running dense models on a low-bandwidth machine.
NihilisticAssHat@reddit
Yeah, I can't precisely recall the size of GPT-OSS 120b, but it's small enough that I'd believe a similarly architected model/quant could fit in 128GB with some room for context.
Fit-Produce420@reddit
You can fit gpt-oss 120b at full context AND it runs really fast. You could put qwen 3.6 alongside it and run both if you needed to.
mycall@reddit
I hope oai refreshes gpt-oss this year.
geoffwolf98@reddit
I think it's ~60GB. Even now it's still one of the better models for what it is.
Fit-Produce420@reddit
One of the first trained in mxfp4. It's smart but also fast and small. I hope to see more native fp4 models now that there is hardware support.
misha1350@reddit
Extremely quantised. Horribly quantised. Like Minimax M2.7 with UD-Q2_K_XL quants.
Monad_Maya@reddit
AesSedai has an IQ4_XS quant for MM2.7 for 128GB machines.
https://huggingface.co/AesSedai/MiniMax-M2.7-GGUF
_RemyLeBeau_@reddit
You're probably right. That the model runs is the claim, not that the benchmarks rival anything noteworthy.
MrTubby1@reddit
Yeah, AMD loves to pump those numbers. Remember when they compared the 395 to an RTX 5090 for running Llama 70B?
JollyJoker3@reddit
Do you have the model on an SSD and just the experts in memory?
florinandrei@reddit
Qwen 3.5 122b at Q4 with 256k context is reasonable for 128 GB unified RAM. Beyond that, you need to sacrifice at least one of model size, quant quality, or context length.
Any of those is a significant loss.
So, a 200b model in 128 GB of RAM is "highly aspirational".
Eden1506@reddit
Something like MiniMax-M2-REAP-162B-A10B-GGUF at Q4_K_M is 100GB and would work, though I agree that it's likely the limit, as you don't wanna go below Q4_K_M. Honestly, I prefer running MoE models at Q6, as I feel like at Q4_K_M they tend to overthink way more.
Fit-Produce420@reddit
I set mine to 124GB (4GB for Linux) and it will fit Step Fun 3.5 Flash, Mimo 2.5, 4.5 Flash, etc., plus all the new Qwens at full context.
MoffKalast@reddit
Man it's so dumb that AMD can't allocate memory arbitrarily like Intel, or Nvidia, or Apple. Come to think of it, every other unified memory system can actually do this without issue lmao.
Fit-Produce420@reddit
You can let it arbitrarily change, that's the default behavior.
I PERSONALLY chose 4GB to run a desktop Fedora build with graphics and overhead for testing when I first started, because if you accidentally load too much model and then need more RAM for context or MCP servers, it gets weird and crashes.
When I run headless I can squeeze it to 126GB on each which can split the model however you'd like or you can use the default settings.
Depending on how you split the model and cache you can minimize the overhead cost of the relatively slow USB4 connection.
MoffKalast@reddit
Well yeah, but you need to reboot, right? Like, compiling can take up to 16, maybe even 32GB, for large C++ repos with multithreading. Or literally doing anything that takes a bit more memory; it's not like software is getting super efficient in this day and age.
On any other system that just means: stop other processes, do the thing, resume. On this one it adds two reboots and trying to time keypresses right to enter the BIOS. That's not something I'd consider an acceptable workflow imo.
ProfessionalSpend589@reddit
Entering the BIOS is not necessary except for the first-time configuration (to set VRAM to 512MB with dynamic allocation, if that's not the default).
After that, on Linux it's a kernel configuration - you tell the kernel how much you'd like to dynamically use for VRAM (mine is about 120GB). Changing the maximum allowed VRAM requires one reboot.
After that: You want more system memory? Just stop the LLM process you're using and suddenly you have all the available memory back as system RAM (except for a tiny amount, less than 100MB (can't remember the exact value), that I observed in my headless setup).
MoffKalast@reddit
Wait, so you can reclaim GPU allocated memory? In that case, why wouldn't it be the default to have maximum allowed VRAM as infinity? Sounds like in that case it behaves the same way as a typical unified memory setup.
ProfessionalSpend589@reddit
I don't know why. But I have some experience with values larger than RAM :)
It's easy to miscalculate things and lie to your software that you have more VRAM available than you physically have.
When I made that mistake, the software tried to use it and the system froze up (probably corrupted something used by the OS).
MoffKalast@reddit
Ha yeah I've seen a similar sort of freeze happen on Intel too when loading something too large, swap doesn't seem to really save it lol.
a9udn9u@reddit
Not sure about unified memory but on my headless linux box, VRAM usage is only 34MB without running anything on the GPU, I think RAM usage can be extremely low too if the server only runs LLM.
amroamroamro@reddit
https://kyuz0.github.io/amd-strix-halo-toolboxes/
Xylend@reddit
I just returned my Strix Halo. I could run AesSedai/MiniMax-M2.7-GGUF/tree/main/IQ4_XS, but only with AMDVLK and 40-43k context. ROCm would OOM even in headless mode.
TG started at 24 tok/s but degraded very quickly, like 8 tok/s at 32k context. Prompt processing was abysmal. For real agentic coding it was unusable. For chatting it was OK. I had some cool chats with the models about ontological systems like OWL and RDF, and from a 5k plan the model gave me very good design directions. But like I said, for real agentic workflows: unusable.
techdevjp@reddit
So, a question: What are you using for this instead? One of the $200/month plans? More than one of them? A lot of people seem to swear by local LLMs and I really want to try, but I don't want to shell out several thousand dollars (or more) only to have them not really work.
Xylend@reddit
My setup and workflows are uncommon. I was a C# programmer, old school. I was a little sceptical about LLMs, but then I got a laptop with an RTX 5090, started experimenting, and started having good results. I have a basic Gemini Pro plan and a basic Mistral one, but I use them only for external validation. In my normal workflows I use only Minimax, Qwen3.6 27B/35B (haven't decided yet) and Qwen3.5 122B. I don't let the models go full autonomous. I micro-manage the whole design phase, lay down the whole architecture, classes, and cross-cutting concerns, and then let the agents implement only small blocks. I use Gemini and Mistral only for collaborative validation/adversarial invalidation of my projects and code. As for hardware, I have my laptop with the RTX 5090 and 2 DGX Sparks.
Answering your question: I love local AI, but you need to micromanage, divide every project into small atomic tasks, assume the architect role, and have lots of experience with coding and design to make it shine. If not, local models cannot hold their ground against SOTA proprietary models. That is my personal experience so far. Hope it helped.
_bani_@reddit
if C# is old school, what would you call a C programmer (not even C++)?
Xylend@reddit
There are cool C# programmers with their Azures, their MS Graph and cool toys, and then there's me: being called to fix COM integrations, Win32 apps, WinForms and sometimes ultra-modern WPF applications.
From my experience, I usually call a C programmer a very nicely paid programmer.
Pretend_Engineer5951@reddit
I came to nearly the same conclusion about workflow as yours. Local LLM is an assistant, a tool, not a standalone coder at least.
gambit700@reddit
I feel very attacked!
patchfoot02@reddit
I'm also an old C# programmer and this actually sounds pretty close to what I do. Lately I've moved to Pi, where I have a big cloud model act as a conductor spinning up cheaper models as sub-agent coders, reviewers, and sometimes drift reviewers. I'm already giving the conductor a fairly small task (already architected, just a specific implementation chunk), but then they break it up further into very small tasks, so each cheap coder model is given a packet of relevant context, implementation details, etc. It keeps the cloud model usage reasonable enough that I don't mind paying (a $100 monthly plan covers it; I've bounced between Codex and Claude, but I could probably save money using GLM 5.1, Kimi 2.6, or similar), and I did some testing and saw no real C# coding performance difference between expensive and cheap models for coding sub-agents (using OpenRouter as my cost estimator). Now I've got a couple of Strix Halo boxes coming to see if they can locally host the coding sub-agents; hopefully that works out better for me. 2 Sparks would be a lot more expensive.
It seems like compiled languages actually work better for coding agents, though Python gets a lot of attention these days. Compile errors and a good testing setup give them a lot more signal to adjust against, compared to looser languages allowing code to sorta work.
techdevjp@reddit
Thank you for the detailed reply!! It was incredibly helpful; thank you very much for taking the time to write it out.
You sound a lot like me. Been coding since I was a kid in the '80s. Still code pretty much every day but right now not professionally. I'm more than happy to take on an architect role -- it's what I am doing right now anyway.
I take it you find the DGX Sparks outperform Strix Halo by quite a lot? Likely on the prompt processing side of things?
Xylend@reddit
Yeah, when I was learning I would always look at token generation, but after getting more serious and starting to tackle more complex problems, I value prompt processing much more. It's very workflow-dependent, but for my current plans PP and raw memory are very important. The Halo machine was cool, but at its current price in the EU, I sent it back for a DGX Spark.
bgravato@reddit
What stuff are you running on your Linux that requires 12GB of RAM?
Linux itself, with a GUI/DE, doesn't need more than 2GB (and I'm being generous).
Of course, if you run a browser with 100+ tabs open on modern websites it may reach/surpass 12GB, I guess...
1ncehost@reddit (OP)
Minimax M2.7 is 230B and is what I use on mine.
Soft_Syllabub_3772@reddit
How, and in which quant?
Zyj@reddit
Q6 here
annodomini@reddit
You can run like 3-bit quants of MiniMax M2.7, 4-bit if you really squeeze (I wouldn't do 4-bit since I use it as my main machine, so I'm running Firefox, Zed, Pi, my compiler and tests all on the same box; I need to keep enough free RAM for the KV cache plus all of that).
florinandrei@reddit
MiniMax-M2.7-UD-Q3_K_S was the best I could do in 128 GB.
Q4 would require some nasty compromises.
KURD_1_STAN@reddit
They just mean quantization, which should be considered illegal, really. It's like saying you can run DS 4's 1.6T params on a 3060 (at some 0.00001-bit XXS quant).
ProfessionalSpend589@reddit
Qwen 3.5 397B Q4 (one of the smallest quants) fits across 2 Strix Halos. With a 32GB GPU you get to a decent 200k context size.
It’s slow, but total power consumption is about 200W during inference
epSos-DE@reddit
Bitwise models!!!
Bitwise LLMs can run faster than one would expect.
One can also convert existing models to bitwise operations.
fallingdowndizzyvr@reddit
No. The GPU can use up to 128GB of VRAM on a 128GB Strix Halo. The CPU will be swapping like mad though. So I limit my GPU to 126GB and leave 2GB for the CPU.
siete82@reddit
I've got a modern distro running on a 512MB Raspberry Pi.
Bennie-Factors@reddit
I take it you measure that in t/h and not t/s? "h" = hour?
siete82@reddit
What I meant is that Linux without a GUI uses almost no RAM.
Mysterious_Finish543@reddit
Step-3.5-Flash? I think it’s a 196B MoE.
ttkciar@reddit
If other applications weren't actively competing to keep non-trivial working sets in memory, Linux would happily hand the inference stack all but a few tens of megabytes of system memory.
Mad_Undead@reddit
MiniMax-M2.7 Q3-Q4 with a small context window.
Consistent-Front-516@reddit
Wake me up when AMD's latest is faster than Apple's 2025 M3 Ultra. Apple's memory bus is over 3x faster; AMD's box is a slouch.
SupaNJTom8@reddit
Make it 512GB of unified DDR7 memory and I'll think about it... otherwise I'm waiting for my M5 Mac Studio...
hurdurdur7@reddit
A Mac Studio with the M5 Ultra will wipe the floor with Strix Halo, even if Mac/Apple is an evil platform. Strix Halo is not going to achieve anything.
Sporkers@reddit
The Studio with the M5 Max is going to be at least $5k with 128GB, and the Ultra more, and a lot more at 256GB.
hurdurdur7@reddit
I believe you, might be even more crazy expensive. But it will also make 120B+ models usable with some speed.
Look_0ver_There@reddit
Well, nothing aside from being 5x cheaper than that 512GB Mac Studio M5 Ultra.
There's no denying that the M5 Ultra will stomp the Strix Halo, but we have to keep one foot on the ground here and look at the price tags. There's no free lunch here. They're completely different classes of machines with price tags to match.
hurdurdur7@reddit
I don't disagree on that point, apple overcharges people without hesitation. But my issue with strix halo is that for the bigger models that it can fit it's unbearably slow. It doesn't make sense to use it like that. And for smaller models you are better off with a dual gpu setup that runs circles around it ...
It feels like a truck with a car engine.
Look_0ver_There@reddit
I guess it depends on what your definition of "unbearably slow" is:
https://kyuz0.github.io/amd-strix-halo-toolboxes/
I personally see results 5-10% faster than what he shows with my GMKtec EVO-X2, but I run on bare metal with a few extra tweaks.
If you're trying to run dense models, then forget the Strix Halo. If you're running MoE's, then they're tolerable, even for many of the larger models.
I also have a triple AMD AI Pro R9700 rig. For PP, the GPUs do run ~3x faster than the Strix Halo, but for TG, the unified-memory Strix Halo doesn't have to deal with the inter-card latencies, and runs at ~70% of the speed of what 2 isolated GPUs will do.
The biggest issue with the 128GB Strix Halos nowadays is the price. Back when they were ~$1700-2000 they gave you a way to run larger models at tolerable speeds, and smaller models at a fairly decent speed.
Now that they're all pushing $3K+, this is where their value proposition starts to suffer against a pair of $1000-1300 GPUs. This whole RAMageddon situation is what's really killing the niche viability of the Strix Halos, and that's what AMD is up against here with their new box.
Recent software advances such as the DFlash algorithm are also helping to bring the Strix Halos back into making sense again. Just need to fix these stupid memory prices.
hurdurdur7@reddit
I was approaching this from my own, code generation perspective. If your usecase is different, by all means, do what you must 😄
To make anything past hello-world quality stuff you need either 122B MoE class things or 27B dense (or better). And you want to smash them prompts at 1000 tok/sec or faster in prompt processing. And for the smaller MoE models you will have a better time with a GPU with 24 or 32GB of VRAM.
Strix Halo might be fine for creative story writing or some picture generation while you sleep. But the only models where it's fast enough for interactive coding are not good enough for complex code writing.
For the price of a Strix Halo box you can buy 2 of AMD's R9700 AI Pro GPUs (or even 3 Intel ones if you are adventurous), and you will run laps around the Strix Halo... and be able to extend to more parallel GPUs in the future if you so wish (assuming your motherboard can carry that).
The upside that Strix Halo has is the heat and power footprint, but very little of that matters to me if I tell it to load a few code files and have to sit there 10 minutes for it to parse the prompt. If it had twice the memory bandwidth, I would be a fanboy. But as it stands right now it's a weird gimmick: you can load big models, but the speed compromise is very heavy.
ShengrenR@reddit
Hah... unless they shape up their supply chain, you'll definitely continue to be waiting. You can't even buy the existing Studios without months-long delivery windows.
brewpedaler@reddit
Ehhh, Apple is known to try to sell out of inventory when approaching a new release, and WWDC is in 6 weeks. Openclaw just increased demand significantly in a period where they're usually transitioning a product out.
ShengrenR@reddit
That's absolutely one possibility - I have no extra insight there - but nobody seems to be escaping the RAM apocalypse.
Sporkers@reddit
Is this going to be some super tiny box with shit cooling, so you can't even push it for longer than a minute or two?
snowieslilpikachu69@reddit
Is it supposed to be different from the other 395 mini PCs?
1ncehost@reddit (OP)
I think it's the same; they just can choose to subsidize it and control quality.
cafedude@reddit
If they subsidize it significantly then that's going to piss off their customers who are selling 395 mini PCs.
-Akos-@reddit
Current mini PCs are double the price they were before. I don't mind them being pissed off.
cafedude@reddit
That's mostly due to memory cost increases, but also the ryzen 395 parts themselves are probably more expensive now as well.
SexyAlienHotTubWater@reddit
No it's not. LPDDR5 is not that expensive - the 64GB model is half the price of the 128GB one, not much more than it was before. It costs a lot because it's in a unique niche.
sibilischtic@reddit
I'm thinking they gave the others plenty of time in the market. It could also be that they want to use them internally without paying a premium.
They are releasing a product in the same space, even at the same price point it is competition.
florinandrei@reddit
Anyone know if there's a product page on their site yet?
snowieslilpikachu69@reddit
I mean, I guess if it's cheaper that's good.
I was kinda hoping for something closer to M5 Max/M5 Ultra bandwidth.
MoffKalast@reddit
One day, one day...
Fluffywings@reddit
With the AMD mini PC, AMD is pleased to provide you a product with limited to no support for the duration of its life cycle of 1-4 years. Once you start using our platform you will be quick to find a new world opens up of...
With AMD, we are here to react to Nvidia.
/s
P.S. I am running AMD almost everything.
-SuXs-@reddit
Yeah, I made the mistake of getting some embedded AMD Raphael to run some inference. The embedded GPU has "AI Ready", "AMD Pro", etc. on the web docs. The whole shebang. Of course, no driver support for AI. I posted on their GitHub issues board. Their answer? "Get a newer one." Never again. I'm sitting on a bunch of server nodes with "AI Ready" embedded GPUs which can't run anything. NEVER. AGAIN.
If you're reading this and are thinking about AMD for AI. Think again. Their software support is complete shit.
cztomsik@reddit
I am thinking of buying 2x R9700 - have you tried tinygrad? I think the question is no longer about the software but rather about the hardware - whether the power is there or not. You can ask AI to write custom kernels for you, and you can also target low-level instructions yourself; that was next to impossible (and unthinkable) just one year ago.
ImportancePitiful795@reddit
The same, except if this is the 495 version.
Which is actually the same with a 10% overclock and 8533MHz RAM, not 8000MHz.
(Actually all the mini PCs have 8533MHz RAM downclocked to 8000MHz.)
1ncehost@reddit (OP)
Just confirmed with an engineer it is only a 395 unfortunately.
almcchesney@reddit
I guess my next question is Thunderbolt 5 for that sweet, sweet 80Gbps bandwidth?
uti24@reddit
So its memory configuration is like in the Nvidia thingy?
ToHallowMySleep@reddit
More like Nvidia Thingy Pro.
AdOne8437@reddit
With that name, I would consider a purchase.
cafedude@reddit
Is there a 495 version coming?
ImportancePitiful795@reddit
Yes some time this year.
Keyframe@reddit
yeah, it's probably going to be available.
RoomyRoots@reddit
Probably an internal reference design. If Nvidia can, so can they.
Possible-Pirate9097@reddit
It's like a quarter of the size of most of them!
ProfessionalSpend589@reddit
Good catch.
I think mine weighs about 5kg - definitely not safe to hold with one hand like in the picture.
Possible-Pirate9097@reddit
It looks slightly smaller than a spark. Interested to hear the price.
cleverquokka@reddit
Key difference is "unified memory"
Narrow-Belt-5030@reddit
So no, same as all the other 395 boxes 😄
Eg: https://rog.asus.com/me-en/laptops/rog-flow/rog-flow-z13-2025/
xXprayerwarrior69Xx@reddit
Which is already in every 395 mini pc
Potential-Leg-639@reddit
Still waiting for a bit bigger variant where a proper cooling solution can be applied. No one needs those tiny designs that overheat over time and won't last that long.
ElementNumber6@reddit
AMD, playing the role of Nvidia's younger sibling, following in the shadow, as always. As expected. As, more likely than not, pre-arranged.
artur_oliver@reddit
Like the good old companies do... See where the market is and invest heavily when it's changing.
redditor_no_10_9@reddit
https://tenstorrent.com/hardware/cards Time to go home AMD, Jim Keller would probably bury AI
artur_oliver@reddit
Quantization of models is an amazing solution for people running models locally with only RAM and no GPU. And boy, is it fast, I can tell you from experience.
jimmytoan@reddit
The 'just a 395 128GB with no changes' confirmation is actually interesting from a positioning standpoint. AMD selling their own first-party box gives them control over the reference experience the way Apple controls the M-series Mac experience - they get to set the baseline for what 395 performance should look like out of the box. The OEM channel concern is valid but AMD first-party also typically means better driver and firmware support than the typical mini PC vendor who ships and moves on.
artur_oliver@reddit
Nowadays every kid on the block can customise a PC... so no surprise they can do the same parts and complete solutions.
The mini PC market exploded like hell in the past year.
boutell@reddit
Will it have higher memory bandwidth than the existing ones?
cbeater@reddit
The real issue: MoE models with more than 5-6B active params are too slow on this.
LumpyWelds@reddit
Most AMD Strix Halo Max systems with 128GB of memory are already matched to the full draw speed of the CPU's memory controller. That's why they all use the same setup and solder the memory chips; socketing ruins the timing.
The memory is set up for 256GB/s.
The CPU memory controller can only pull from DRAM at 256GB/s.
You would need to improve both the CPU and the memory chips to get a real boost. There will be a little refresh called Gorgon, but it won't be significantly faster.
For a real improvement in speed, watch for the next-gen release, AMD Medusa Halo. It's rumored to have a limit of ~460 GB/s if 256-bit, or ~691 GB/s if 384-bit. And definitely 128GB, but possibly 256GB of memory; nobody knows yet. But because of Sam Altman's offer to buy 40% of all memory, even though he recanted, it will be unaffordable, or at least eye-watering in price.
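A quick way to see why these bandwidth numbers dominate: each generated token has to stream the active weights from memory at least once, so memory bandwidth sets a hard ceiling on token generation. A minimal Python sketch, with the active-parameter counts as illustrative assumptions:

```python
# Upper bound on token generation from memory bandwidth alone:
# t/s <= bandwidth / bytes of active weights read per token.
# Real-world numbers land well below this ceiling.

def tg_ceiling(bandwidth_gbs: float, active_params_b: float,
               bits_per_weight: float = 4.25) -> float:
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gbs * 1e9 / bytes_per_token

STRIX_BW = 256   # GB/s, as above
MEDUSA_BW = 460  # GB/s, the rumored 256-bit figure

for name, bw in (("Strix Halo", STRIX_BW), ("Medusa Halo (rumored)", MEDUSA_BW)):
    print(f"{name}: 10B-active MoE <= {tg_ceiling(bw, 10):.0f} t/s, "
          f"200B dense <= {tg_ceiling(bw, 200):.1f} t/s")
```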
techdevjp@reddit
OpenAI can't go tits up soon enough.
n00b001@reddit
We should make a non profit charity dedicated to local open source (not just open weight) LLM models
We can call it: ClosedAI
sleepingsysadmin@reddit
Imagine a 384-bit bus, nearly 50% more bandwidth, but still just 128GB?
I'm buying that immediately.
vasimv@reddit
The AI 395 is their flagship CPU model, and it's only 256-bit maximum. Unless they put in pre-release Gorgon Halo CPUs, this box is a usual 395 mini PC, with no real advantages except being cheaper than a DGX Spark.
rosstafarien@reddit
Well now I know the name of what I'm wishing for next. Gorgon Halo it is!!!
Mochila-Mochila@reddit
Nope, you're actually wishing for Medusa Halo...
sleepingsysadmin@reddit
Why create their own in-house solution if it's just the same as all the others?
Surely they tweak something to justify even doing this.
milkipedia@reddit
Same reason Nvidia makes founders edition GPUs. It's a prestige play
boutell@reddit
Yeah I figured. Don't mind me, I'm just obsessed with qwen 3.6 27b.
pixelpoet_nz@reddit
you replied to yourself
boutell@reddit
I'm so ashamed
misha1350@reddit
Of course not.
1ncehost@reddit (OP)
I don't think so. They didn't say much, but it seemed like it was a normal 395 system.
MangoAtrocity@reddit
$2,999.95
HugoCortell@reddit
128GB can NOT run 100B models natively!
Natively would mean at least Q8 (realistically more like FP16).
They're just trying to upsell their device, which otherwise can't actually compete with a Mac.
aguspiza@reddit
Q8 is mostly the same quality as FP16. Most people are running Q4 weights with Q8 KV anyway.
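To put rough numbers on the Q8-KV part: quantizing the KV cache roughly halves its footprint versus fp16 (in llama.cpp this would be the --cache-type-k/--cache-type-v options, if I recall the flags correctly). A small Python sketch, taking q8_0 as ~8.5 bits per element including block scales; the layer count and KV dimension are illustrative GQA-style assumptions, not any specific model's:

```python
# KV cache footprint: fp16 (2 bytes/elem) vs q8_0 (~1.0625 bytes/elem,
# i.e. 8 bits plus a per-block scale). Layer count and KV dim are
# hypothetical figures.

def kv_gb(n_layers: int, kv_dim: int, ctx: int, bytes_per_elem: float) -> float:
    return 2 * n_layers * kv_dim * ctx * bytes_per_elem / 1e9  # K and V per layer

for ctx in (32_768, 131_072):
    fp16 = kv_gb(60, 1024, ctx, 2.0)
    q8 = kv_gb(60, 1024, ctx, 1.0625)
    print(f"ctx {ctx:>7}: fp16 {fp16:5.1f} GB vs q8_0 {q8:5.1f} GB")
```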
SnooPaintings8639@reddit
And how does that justify their claim "200B natively"?
aguspiza@reddit
The same way as "unified"
oxygen_addiction@reddit
I mean, it can literally run GPT-OSS 120B (117B total, 5.1B active) at really good speeds.
StupidScaredSquirrel@reddit
Not all models are trained or released at FP8 or FP16. Look at GPT-OSS: it was MXFP4, so yes, GPT-OSS 120B can absolutely run natively on this.
Eleanor_Mattox@reddit
The token aggregation space is getting crowded. Would love to see benchmarks on latency differences between direct API vs aggregator proxies.
zabique@reddit
Intel could do one now too.
ninhaomah@reddit
This thread reminds me of 286, 386, 486, Pentium, Pentium 2, Pentium 3 forums I had long ago....
I am getting old. Let me go back to DOS.
cryptofriday@reddit
Good old days <3
286 check
386 check
486 check
Pentium check
.....
Massive-Question-550@reddit
So it's the same as any other 395 AI Max PC? I was kind of hoping for something different, with more bandwidth.
theilya@reddit
is that spock?
gggiiia@reddit
Wait wasn't the plan to make us all slaves of subscription based plans to the big tech gods?
_derpiii_@reddit
They just have the 395 128gb platform right? What's the breakthrough announcement about then? Is it going to be different in any way, such as price?
derezzddit@reddit
Moar RAM please
MidnightFinancial353@reddit
We need Thunderbolt 5 and direct memory access over the network like Apple; then a bunch of these are gonna go brrrrr like Mac Studios.
false79@reddit
Nothingburger
Darkoplax@reddit
It can be a somethingburger depending on the price; if it's extremely cheap then yeah
b0tbuilder@reddit
Or if it has 100GbE RDMA Ethernet.
truthputer@reddit
If it can help take the price of these things back to near the original Strix Halo launch price then it will be amazing. It needs to be closer to $1500 not $3000.
cafedude@reddit
If they plan to subsidize it then they'll be competing with their customers who are selling 395 mini PCs.
Tired__Dev@reddit
I'd pay 5 hundy for it
false79@reddit
Well, that would definitely catch my attention. But like anything AI-related, the price is ⬆️. Even things that weren't initially AI-related, e.g. HDDs, RAM, and now the Intel CPU story: price is ⬆️.
MoffKalast@reddit
Billions must buy!
Whyme-__-@reddit
Soooo a DGX Spark lite?
lqstuart@reddit
Cool, lmk when they have an answer to CUTLASS.
FullstackSensei@reddit
And it'll only cost you one of your kidneys, assuming you didn't already hand one to buy 64GB DDR5 a couple of months ago
Terminator857@reddit
It will cost about $3K. https://www.bosgamepc.com/products/bosgame-m5-ai-mini-desktop-ryzen-ai-max-395
DigitalguyCH@reddit
You can find a full laptop on sale with 395 and 128GB for $3k
Look_0ver_There@reddit
Gonna need a link to back up that claim. Prices have gone crazy in the last 6 weeks.
amroamroamro@reddit
or Framework Desktop
DigitalguyCH@reddit
sale is no longer there, I saw it last week, but I am not in the US
Terminator857@reddit
Your memory is 6 months too old.
DigitalguyCH@reddit
No, it was last week, but not in the US; I am in Europe.
More-Curious816@reddit
128GB? Nothingburger. Probably with crippled bandwidth of 300GB/s with LPDDR5. Why would people pay for this instead of a DGX Spark?
256 or 512GB with LPDDR6 at 800-1000GB/s bandwidth and we can talk.
amroamroamro@reddit
https://github.com/lhl/strix-halo-testing#amd-strix-halo-vs-nvidia-dgx-spark
Slasher1738@reddit
Price is likely lower
More-Curious816@reddit
OK, aside from price? We know the price is probably $2k-3k, but the bandwidth is 🐌 slow.
Slasher1738@reddit
Bandwidth only matters if you're building a cluster. A lot of people are staying away from clusters because they don't need lightning speed for LLMs; it just needs to get done. If you want speed, go get a real system with cards.
More-Curious816@reddit
Not true, it does matter even in a single device.
New_Public_2828@reddit
I mean, people are hating on it and have no idea what kind of architecture it has. Maybe they've figured out a way to run it with those specs.
Just because the competition runs on more doesn't mean they haven't been cooking on the sidelines.
More-Curious816@reddit
It's AMD, dude. Just like NVIDIA, I assure you it's crippled hardware.
CommunityTough1@reddit
Half the price and same specs. DGX Spark also tops out at 128GB LPDDR5X, same speed.
Daremo404@reddit
Can someone tell me how this compares in raw tokens per second to a Mac Studio M4 Max?
xamboozi@reddit
What is the memory bandwidth? That's the most important stat and they never advertise it.
Monad_Maya@reddit
256GB/s. Pretty slow for a GPU system, but better than consumer-grade DDR5 setups.
spense01@reddit
I still can’t get over the fact my nearly 6 year old M1 Ultra has almost 4x the memory bandwidth. I’m so glad I never sold it.
DaniyarQQQ@reddit
I think we are at the moment where we need a 512GB of unified memory.
robberviet@reddit
Only with over 500GB/s bandwidth. Wait, that's the Mac Studio M4 Max.
neopolitan77@reddit
Doesn't feel totally out of reach. Apple Silicon currently goes up to 256GB with 800GB/s bandwidth. It'd be a dream if it weren't for the 12k price tag. Still prefer Linux tho
Southern_Sun_2106@reddit
With those speeds on that box, it is only useful when you have a bunch of tiny models and you need to switch between 'em on the fly.
Mochila-Mochila@reddit
The bandwidth would have to be tripled, of course.
Eyelbee@reddit
Yeah, and it shouldn't be very hard to produce. Decent prompt processing, 800GB/s bandwidth and 512GB+ RAM can be made.
mechkbfan@reddit
Issue is it'll cost more than my car
CommunityTough1@reddit
Other than changing the CPU die and architecture to support a memory controller that supports that much RAM at those speeds. Zen architecture currently only officially supports 128GB. You CAN do more but only at base DDR5-4800 speeds.
Mochila-Mochila@reddit
It's so pointless 🤦♂️
Release something with triple the bandwidth and double the memory already...
_lavoisier_@reddit
and faster network
mitchins-au@reddit
We already have frameworks at home
Awkward-Candle-4977@reddit
AMD should just release an NPU card with that 128GB LPDDR instead of copying Nvidia's mini PC concept.
Qualcomm has such a card, but the price is $10k+.
themoregames@reddit
Make it 512GB of RAM and $1500 for the whole box.
LankyGuitar6528@reddit
Best I can do is tree fiddy.
IORelay@reddit
Keen on seeing the price of this. Hopefully not exorbitant.
awitod@reddit
What is it about the hardware that magically changes memory requirements? 200b on 128gb and a usable context sounds like pure BS.
Look_0ver_There@reddit
I'm able to fit MiniMax-M2.7 (229B) @ IQ3_XXS on a single Strix Halo with a 200K context. A 200B model encoded as IQ4_NL would likely also fit, although I can't think of any exactly-200B models that I'd want to use. Maybe Step-3.5-Flash (197B)? I'd still use MiniMax-M2.7 over Step-3.5-Flash though.
awitod@reddit
Thanks for info. I am now insanely curious
sofaarsecoin@reddit
when Medusa Halo though
Apprehensive-View583@reddit
Same bandwidth? Then what's the point?
shuozhe@reddit
Comes with a service contract, I guess. GMKtec/Bosgame are great, but I kinda don't expect them to have a service contract; probably same with Framework.
Clean_Hyena7172@reddit
200B would be a tight squeeze, even at Q4
florinandrei@reddit
I've done 122b at Q4 with some room to spare. I think you could push it to about 140b-ish. Beyond that, it's just nasty compromises.
200b in 128 GB of RAM is "highly aspirational".
VoiceApprehensive893@reddit
You ain't fitting Q4 into that, unless you don't need context ofc.
Clean_Hyena7172@reddit
Yeah, even with Q4_K_S at like 4k context this would be iffy; the marketing is a bit optimistic, to say the least. Q2 would fit, but quality at that quant can be kinda shit.
DoorStuckSickDuck@reddit
If it's not cheaper than the cheapest AI 395+ box with 128GB RAM (which is, as of now, the Bosgame M5), it doesn't matter. They all use the same boards, they all have the same RAM, and they all more or less have the same features.
Strix Halo is a great platform though. Top tier in its use case (perma-on AI server running multiple LLMs sipping minimal wattage).
Look_0ver_There@reddit
One point of note. The Framework ones don't use the same SixUnited board as all the others. I believe that the HP board is also unique to them, but I am not sure about it.
https://strixhalo.wiki/Hardware/Boards
sammcj@reddit
Still slow though right due to the limited bandwidth?
1ncehost@reddit (OP)
Yea
kamikazikarl@reddit
Well... time to start saving up some money. Hopefully it's not limited to specific regions or purchasing channels. Otherwise, I expect it to be impossible to find and massively marked up.
HIGH_PRESSURE_TOILET@reddit
It's called the "Halo Box". They showed it at CES already but glad to know it's still coming.
Killer feature: Linux support for its RGB LED light strip: https://www.phoronix.com/news/AMD-Halo-Box-RGB-LED-Driver
pinkwar@reddit
How much?
csixtay@reddit
Anyone that knows AMD knows this is vaporware that's released to please investors. It'll be out of stock a month after launch and you'll never hear of it ever again.
GwJh16sIeZ@reddit
yes another 20tps ai box, exactly what i needed
GCoderDCoder@reddit
I'm not impressed 'til I can get FSR 4 AI upscaling without hacking my AI-focused device...
SignificantAsk4215@reddit
Price? Probably around $2500-3000?
hurdurdur7@reddit
Current 128GB box pricing is $3k...
SignificantAsk4215@reddit
Well fuck
Liapkin@reddit
Same energy
StrangeLingonberry30@reddit
Look, its Jackie Fast Hands with the big promises again!
Healthy-Nebula-3603@reddit
If it's still slow RAM then it's still useless, as it has 4 channels.
Why the fuck don't they use RAM on 8 or 16 channels??
t4a8945@reddit
Wow, one year too late! Didn't they already announce the next generation of these chips?
Monad_Maya@reddit
AMD's marketing dept is an embarrassment. This product has been out for ages and got a price hike due to the whole DRAM situation.
And somehow they've started marketing it again.
aguspiza@reddit
Which OS/BIOS? Neither Windows nor Linux is prepared to handle real unified memory like macOS, i.e. memory that can be accessed by GPU and CPU at runtime, not defined at boot time.
1ncehost@reddit (OP)
Linux can do that with UMA/TTM. You can actually allocate 256GB or more of RAM to any AMD APU, even tiny cheap ones, dynamically allocated by Linux and otherwise used as system memory.
aguspiza@reddit
You need a special BIOS/UEFI for that; otherwise the Linux kernel will not be able to access GPU RAM *directly*. It can use it indirectly through TTM, i.e. mapping the "VRAM" to RAM, but there is some "copying" (a paging process) there that is not happening in macOS.
Inevitable_Grape_800@reddit
There is no copying, at least nothing that shows up in benchmarks. Strix Halo with 512MB VRAM and amd_iommu=off ttm.pages_limit=31457280 ttm.page_pool_size=31457280 is just as fast as 96 GB VRAM
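For reference, ttm.pages_limit is counted in 4 KiB pages, so the value quoted above corresponds to the ~120GB dynamic-VRAM figure mentioned earlier in the thread; a one-line Python check:

```python
# ttm.pages_limit is in 4 KiB pages; 31457280 pages = 120 GiB of
# dynamically allocatable GPU memory on a 128GB box.
pages = 31_457_280
print(pages * 4096 / 2**30)  # -> 120.0 (GiB)
```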
aguspiza@reddit
There is *copying* if you take into account *loading* the model. Once the memory has been allocated/mapped as VRAM and the model is loaded/COPIED in VRAM, of course there is no copying.
aguspiza@reddit
To all the stupid people downvoting my comment, check Asahi Linux ... the only *REAL* UMA.
1ncehost@reddit (OP)
My ASRock B650 mobo came with it from the factory. 🤷♂️
Eyelbee@reddit
If lenovo is involved this can be good
oxygen_addiction@reddit
Hey, OP. Can you pin the video from 2 days ago as well in your post? https://youtu.be/qL28fZ9s8h8
Thanks.
havnar-@reddit
If they had double that or perhaps 4x, then it would really start punching at the Mac Studio for LLMs at home.
IGZ0@reddit
I won't care about AMD hardware, until they get their shit together on the software front.
ROCm is a trash fire.
funding__secured@reddit
Meh
hurdurdur7@reddit
They are already too slow at 128gb of ram. What does this change?
MainFunctions@reddit
No CUDA obviously, enjoy the 9 tok/s. Like it or not NVIDIA has a monopoly on this sector.
762mm_Labradors@reddit
I pair my Asus Z13 395+ with an Asus 5090M egpu. Best of both worlds for the work that I do.
epSos-DE@reddit
AMD beating Apple!!!
Apple overslept!
AMD stock going to do well!
Signal_Ad657@reddit
I mean, I love AMD, but this is essentially just a re-announcement of an existing product. Or, maybe better said, a re-casing of an existing product. Thermals are a bottleneck on the GMKtecs, so I don't know why you'd go smaller, personally, as opposed to building out more like the Minisforum MS-S1 MAX. I don't think anyone was specifically clamoring for a smaller chassis on what is already, on average, a mini PC. Would love to hear if there's more to it.
fallingdowndizzyvr@reddit
I think they are timing this for the release of the refresh of Strix Halo, Gorgon Halo.
segmond@reddit
Make it up to 256GB, give it an extra x16 of lanes so we can add up to 4 x4 slots.
fallingdowndizzyvr@reddit
This is the weirdest thing. Normally companies release reference designs first, and then third parties make the machines. AMD is doing it backwards, third parties first and then it releases a reference design. It's almost like they didn't think it would be successful so they let the third parties get the arrows in the back.
1ncehost@reddit (OP)
My uneducated take is that they saw the success of the spark, and while scrambling to increase enterprise adoption, decided releasing a prosumer option like this was necessary to increase open source development.
MongoWithBongoss@reddit
This product is pointless unless it features a high-bandwidth, low-latency interface that allows for daisy-chaining multiple units.
Fit-Produce420@reddit
Even chaining them with the highest-throughput connection, like NVLink, still adds waaaay more latency than just making a system with 256GB or more, like Apple does.
Teslaaforever@reddit
It's time they had more RAM and two iGPUs inside one chip, and got rid of the NPU, as it's a joke.
jacek2023@reddit
But what's new here, is it somehow faster than existing similar solutions?
Expert_Bat4612@reddit
This seems very similar to hardware already on the market.
abnormal_human@reddit
Weak sauce that it's just a different skin on last year's product.
LagOps91@reddit
128gb isn't enough...
615wonky@reddit
I wish Tyan, Supermicro, or one of the other big server manufacturers would sell these, preferably in blade form.
I work in an academic HPC environment, and this would sell like hotcakes. We could give our users access to local AIs for stuff that can't be sent off-prem.
-deleled-@reddit
Exactly. That sweet mem bandwidth would make the best number cruncher, with AVX512 inside too. Researchers with smaller QoS allocations running GROMACS on those would be very happy.
cool_fox@reddit
At what price point tho
_VirtualCosmos_@reddit
I already got a Max+ 395; it's a great mini computer, not only for AI but in general. Very powerful CPU and a very capable iGPU too.
If they manage to drop its price under 1000 euros (plus RAM and the rest, of course), it could be great. I would buy another.
In terms of AI though, what I really miss is good support for training models.
geoffwolf98@reddit
But can it have more RAM?
Technical-Earth-3254@reddit
Waiting for a presentation of how he's shoving BF16 200B models into 128GB.
1ncehost@reddit (OP)
RDNA 3.5 has native FP8 support, and technically you can just barely fit that with UMA in Linux. Not sure what their claim is specifically though.
Slasher1738@reddit
What is the networking situation on this?
VoiceApprehensive893@reddit
200b in 128gb
Q3 quants ❤️ ❤️ ❤️
Innomen@reddit
Great and it'll only cost the same as 4 PCs. SSDD. That one Taiwan fab having a global monopoly is loathsome.
Ok-Measurement-1575@reddit
They must have excess inventory?
seamonn@reddit
Can we get the Gavin Belson Signature Edition of this Box?
freehuntx@reddit
And 128GB/s bandwidth... yay
1ncehost@reddit (OP)
Not that I disagree with the sentiment, but Strix Halo has 4-channel DDR5, so it's double that.
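The arithmetic behind that: peak bandwidth is bus width times transfer rate, so Strix Halo's 256-bit (quad-channel) bus at 8000 MT/s works out to the 256GB/s figure cited elsewhere in the thread:

```python
# Peak bandwidth = (bus width in bytes) x (transfers per second).
bus_bits = 256       # quad-channel, 256-bit bus
mt_per_s = 8000e6    # 8000 MT/s
print(bus_bits / 8 * mt_per_s / 1e9)  # -> 256.0 GB/s
```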
Current-Ticket4214@reddit
That’s an expensive paperweight
siete82@reddit
Price tag? Can you train with things like this, or is it only for inference?
1ncehost@reddit (OP)
They pitched it for openclaw, but these can do training, if slowly.
twack3r@reddit
Depends how small the model is and how much time you have.
misha1350@reddit
Too little, too late. They should get to work on Medusa Halo with 192GB memory.
Fusseldieb@reddit
Now THAT'S a cool product!
Make this more accessible in the future and cloud-like local LLMs might actually become a thing!
keyboardmonkewith@reddit
Nope.
PhotographerUSA@reddit
A toy box for the rich!