Finally China is entering the GPU market to break the unchallenged monopoly abuse. 96 GB VRAM GPUs under $2,000, meanwhile NVIDIA sells from $10,000+ (RTX 6000 PRO)
Posted by CeFurkan@reddit | LocalLLaMA | View on Reddit | 703 comments

Economy-Swimming-109@reddit
about time
atape_1@reddit
Do we have any software support for this? I love it, but I think we need to let it cook a bit more.
fallingdowndizzyvr@reddit
CANN has llama.cpp support.
https://github.com/ggml-org/llama.cpp/blob/master/docs/backend/CANN.md
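Per that doc, the build looks roughly like this (a sketch based on the linked page, assuming the Ascend CANN toolkit is already installed; the env-script path may differ per install):
```sh
# source the CANN environment, then build llama.cpp with the CANN backend
source /usr/local/Ascend/ascend-toolkit/set_env.sh   # path varies by install
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j
```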
ReadySetPunish@reddit
So does Intel SYCL, but it's still not nearly as optimized as CUDA, with graph optimizations being broken, for example. Support alone doesn't matter.
fallingdowndizzyvr@reddit
Yes, and as I have talked myself blue saying: Vulkan is almost as good as, or better than, CUDA and ROCm. There is no reason to run anything but Vulkan.
ReadySetPunish@reddit
I don't agree. Performance on NVIDIA GPUs is a lot better on CUDA than Vulkan, at least for llama.cpp. Besides, mainline PyTorch doesn't really support anything but CUDA.
fallingdowndizzyvr@reddit
You're wrong.
6 months ago, Vulkan got really close to CUDA.
https://www.reddit.com/r/LocalLLaMA/comments/1j1swtj/vulkan_is_getting_really_close_now_lets_ditch/
4 months ago, Vulkan got faster than CUDA in llama.cpp.
https://www.reddit.com/r/LocalLLaMA/comments/1kabje8/vulkan_is_faster_tan_cuda_currently_with_llamacpp/
Vulkan has gotten even faster since.
ReadySetPunish@reddit
The post you mentioned has the OP turn flash attention off. It's cherry-picking.
fallingdowndizzyvr@reddit
How's that? It's off for both CUDA and Vulkan. Vulkan supports FA too, so it's a level playing field.
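For what it's worth, this is easy to check yourself with llama-bench from llama.cpp (a sketch; the model path is an example, and the flag syntax may differ across llama.cpp versions):
```sh
# compare decode speed with flash attention off vs on
# (run once on a CUDA build and once on a Vulkan build of llama.cpp)
./llama-bench -m model.gguf -fa 0
./llama-bench -m model.gguf -fa 1
```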
hyperparasitism@reddit
To be fair, most people will have flash attention on. If CUDA outperforms Vulkan with flash attention on, then Vulkan still has some catching up to do
SplurtingInYourHands@reddit
Yeah exactly lol. You can import this and all that, but will you be able to figure out how to get it to gen your Stable Diffusion images or run your local LLMs? Sounds like a paperweight without the proper community support.
SGC-UNIT-555@reddit
Based on rumours that DeepSeek abandoned development on this hardware due to issues with the software stack, it seems it needs a while to mature.
JFHermes@reddit
They abandoned training DeepSeek models on some sort of chip, and I doubt it was this one tbh. Inference should be fine. By fine I mean that, from a hardware perspective, the card will probably hold up. Training requires a lot of power going into the card over a long period of time. I assume that's the problem with training runs that last for a number of months.
crantob@reddit
The report was that they had trouble getting the models to converge, not that they ran out of power.
Cergorach@reddit
This sounds similar to all the Raspberry Pi clones before supply ran out (during the pandemic): sh!t support out of the gate, assumptions of better support down the line, which never materialized... Honestly, you're better off buying a 128GB Framework Desktop for around the same price. AMD support isn't all that great either, but I suppose it's better than this...
Charl1eBr0wn@reddit
Difference being that the incentive to get this working, both for the company and for the country, is massively higher than for a BananaPi...
Cergorach@reddit
You would think that, but look at some of the Chinese projects out there. There are entire ghost cities where one would expect that China or the company is massively incentivized to make it work, but no.
v00d00_@reddit
"Ghost cities" are literally just a myth from 15 years ago that propagated because we Westerners apparently can't fathom economic planning. Read or watch any piece on them from that time, then look up the city's name and you'll see that it is in fact populated and still growing now.
Charl1eBr0wn@reddit
No ghost city is as important as getting their hands on domestic cutting edge chips to finally be free of the western chokehold. They're even intent on going to war for it (Taiwan).
Apprehensive-Mark241@reddit
Is there any way to get more than 128 GB into the Framework?
Cergorach@reddit
Nope, but in that case you should be looking at a Mac Studio M3 Ultra (256GB/512GB).
No_Efficiency_1144@reddit
Raspberry Pi clone supply ran out? Is that why they are no longer spam-advertised everywhere?
Pvt_Twinkietoes@reddit
Raspberry Pi prices shot up so much. You're better off buying mini PCs for small projects, unless you need the small form factor.
Eremita_Urbano_1655@reddit
People should buy a Raspberry Pi for its GPIO, not because it's a cheap desktop computer.
Worldly_Striker@reddit
Definitely not true. People bought them because they were a cheap computer. That was literally their whole thing. So every child could have a computer to tinker with.
I've owned raspberry pis for over a decade now.
Hell, the cheapest boards don't even come with GPIO pins, because they know they'll be used for embedded devices or for running Pi-hole.
Pvt_Twinkietoes@reddit
Fair. Lots of folks are just running pihole on them
Orolol@reddit
But this is a Huawei GPU, it doesn't come from a vaporware company.
DistanceSolar1449@reddit
Also these may very well be the same GPUs that Deepseek stopped using lol
demon_itizer@reddit
Maybe it didn't work well for the DeepSeek guys (training) but will work well for us (inference only). Llama.cpp works on Vulkan and that works alright. I have an AMD W6800 32GB that I got for ~$450 in local currency and it works great for inference; not so great for training, or even running ComfyUI for that matter.
fallingdowndizzyvr@reddit
No. That's fake news.
https://x.com/theinformation/status/1961417030436880773
emprahsFury@reddit
That has nothing to do with the purported difficulty training on Huawei Ascends, which allegedly broke R2's timeline and caused DeepSeek to switch back to Nvidia. And if we really think about it, DS wouldn't be switching to Huawei in August 2025 if they hadn't abandoned Huawei in May 2025.
RuthlessCriticismAll@reddit
In your world, DeepSeek switched to Huawei in like April or whatever, then abandoned it in May, and then switched back in August. This is obviously false.
BowlCutKing@reddit
Careful, talking so confidently.
https://arstechnica.com/ai/2025/08/deepseek-delays-next-ai-model-due-to-poor-performance-of-chinese-made-chips/
RuthlessCriticismAll@reddit
You think they switched back and forth every few months. That is completely idiotic. Try to use your head just a little.
BowlCutKing@reddit
You think they didn't try? The HW and SW were too shit, simple.
Don't make me laugh. SMIC 7nm DUV vs TSMC 4nm Blackwell and TSMC 3nm Rubin. Ask Intel how much fab nodes matter.
Thanks for answering my question; on semiconductors you are an idiot.
fallingdowndizzyvr@reddit
That has nothing to do with what the poster I responded to said, "Deepseek abandoned development on this hardware due to issues".
They have clearly not abandoned it as the report from yesterday shows.
Awkward-Candle-4977@reddit
They ditched it for training.
Multi-GPU over LAN is a very difficult thing.
Candid_Highlight_116@reddit
I looked it up a while ago and it seemed this line of chips was a cascaded, clustered version of surveillance-camera NPUs, and there were pieces of information hinting that performance doesn't scale well for large-grained tasks like huge LLMs. That's likely why this isn't destroying everything else. "Software" is perhaps one way to sugarcoat it when the FLOPS are there without good real-world performance figures.
shing3232@reddit
They delayed it, not abandoned it. I heard a rumor about training a smaller R2 model alongside the big one.
zchen27@reddit
I think this is the most important question for buying non-Nvidia hardware nowadays. Nvidia's key to its monopoly isn't just chip design, it's their power over the vast majority of the ecosystem.
Doesn't matter how powerful the hardware is if nobody bothered to write a half-good driver for it.
Massive-Question-550@reddit
Honestly, probably why AMD has made such headway now, as their software support and compatibility with CUDA keeps getting better and better.
Dihedralman@reddit
Honestly, they haven't made enough. With the exponential valuations Nvidia has been getting, AMD should have invested more into the software instead of stock buybacks. It likely would have had better returns immediately.
Ilovekittens345@reddit
About damn time. AMD has always had absolutely horrible software for controlling your graphics settings, and their drivers at times have been dog shit compared to how Nvidia gives their software and drivers so much priority.
I am glad AMD is finally starting to do things differently. They do support open source much better than Nvidia, so when it comes to running local models they could, if they wanted, give Nvidia some competition...
IrisColt@reddit
Y-yes, heh.
AttitudeImportant585@reddit
Eh, it's evident how big of a gap there is between AMD and Nvidia/Apple chips in terms of community engagement and support. It's been a while since I came across any issues/PRs for AMD chips.
iboughtarock@reddit
Tinygrad is trying to solve this exact problem.
am0x@reddit
Exactly. Even Huang says they are more of a software company than a hardware one.
It’s the code that makes these things work so well.
kmouratidis@reddit
Yes, but if you can find me 1 happy Nvidia user who uses more than two different-generation GPUs, I'll show you a liar.
ROOFisonFIRE_usa@reddit
Say it ain't so. I was hoping I wouldn't have issues pairing my 3090s with something newer when I had the funds.
michaelsoft__binbows@reddit
No idea what that guy is on about
Ilovekittens345@reddit
I have never had any issues with my Nvidia graphics cards. The latest one I got was a 3080 Ti. So far in my life I have had 12 cards, starting with a GeForce 2. I have also had 2 AMD cards, and their hardware is fine, but I always have issues with their drivers and the software that controls the settings. With Nvidia, if a new driver has issues you roll back and wait for the next one; they get released fairly often. With AMD... I don't even want to talk about it.
This has always been the main reason people pick Nvidia over AMD: all the extra tools and software you get access to with Nvidia, even when the hardware between the two is the same speed and costs the same. And the stability and support of their drivers.
BoeJonDaker@reddit
Maybe just talking about AI. I used Fermi, Kepler and Pascal for 3D rendering and they worked fine together.
a_beautiful_rhind@reddit
I used 3090/2080ti/P40 before. Obviously they don't support the same features. Maybe the complaint is in regards to that?
kmouratidis@reddit
Yes, and also to inconsistencies (not necessarily incompatibilities), and of course third-party frameworks and their CUDA version support, and feature support matrices.
But maybe ~90% of people here are only doing inference running llamacpp-based stuff on consumer hardware, so they never get to enjoy the "best" parts of Nvidia's stack. The moment you step away from that and start wanting to experiment with more advanced stuff, you're going to discover the joy. A good part of it is due to the frameworks though.
kmouratidis@reddit
If you're only doing inference on some llamacpp-based platform (and maybe tabby/exl) it should be fine. If you're using GPUs of the same size, you might be fine with a few extra things too. If you don't have very old and very new stuff, you might also be fine.
If you want to train on a 5090, 3090, and P40 that you have in the same PC, good luck.
Masterofironfist@reddit
If you knew what you really have, it wouldn't be a problem. Look at the Nvidia P40: Pascal doesn't support async compute like other DX12 GPUs due to a flaw in its architecture and has to emulate it for DX12 games; same for Maxwell-based GPUs. If you had at least a Turing-based GPU instead of that P40, I believe you could get it working on all 4 cards.
ROOFisonFIRE_usa@reddit
I see. This was what I was worried about. I figured unbatched, non-tensor-parallel should work, but it doesn't surprise me that training / tensor parallel aren't a cakewalk.
That's a bummer. Appreciate the heads up!
mrracerhacker@reddit
Hm, I run a Quadro M2000M on my laptop, plus 2 HBM2 V100s on carrier cards and my 3070 Ti, and I'm happy. Support is kinda hit and miss, especially on older cards, but that makes sense since the Quadro is from 2013. SXM2 got okay support but is also aging, which makes sense.
codsworth_2015@reddit
Had some 128MB Nvidia GPU when I was a kid, then a 550 Ti, GTX 960, 1070, 3070 Ti and now a 5090. I know Nvidia is constantly finding creative ways to screw consumers, but they still make the best hardware. When I bought the 1070, I compared it with the AMD 390; the 3070 Ti I compared with the 6800, but there was no stock and the 3070 Ti benched better overall. There is no competition for a 5090.
dibu28@reddit
Had an old Nvidia GeForce-something, then ATI (Radeon), then a 670 (this was a monster), 1070, 2060 12GB (still using it), 3070 and now a 5070.
troughtspace@reddit
Born in '78, got my first PC in '86. I remember 3MB cards 🔥
dibu28@reddit
My first PC had a Pentium 1 at 75MHz and an ATI (Rage or Mach, I don't remember exactly). Don't know if it accelerated anything in Windows 95. Before it I had a ZX Spectrum.
kmouratidis@reddit
Similar here. A GT630 was the first GPU I tried using for neural network training, then a 1080 (trained a few here), 3070 (Ti?), 4070 Ti, a 4x3090 server, 5070 Ti, 5060 Ti.
But every system either had only one GPU or one generation of GPUs, or wasn't used for anything more advanced than llamacpp-based inference.
No_Efficiency_1144@reddit
It's fine with the Nvidia Container Toolkit.
pmv143@reddit
Yeah, this is the crux of it. Nvidia’s real advantage isn’t just the silicon, it’s CUDA + the software ecosystem that makes the hardware actually usable. Without good runtimes/drivers, even the most powerful GPU ends up stranded compute.
gpt872323@reddit
There is misinformation as well. Nvidia is the go-to for training because you need as much horsepower as you can get. For inference, AMD has decent support now. If you have no budget restriction, that's a different league altogether, i.e. enterprises. For the average consumer, you can get decent speed with AMD or older Nvidia.
TheRealGentlefox@reddit
I have had 100% negative experiences with the software for my Chinese hardware, so we'll see if they buck the trend...
DevopsIGuess@reddit
I think the power of free market and open source will solve this fast. We’ll see
Ok_Run_101@reddit
The Chinese manufacturers just have to provide devs with a compatibility layer for CUDA or something, right? It's not easy, but if they can mass-produce GPUs then it shouldn't be rocket science for them.
Or am I missing something?
zchen27@reddit
It's not that simple. Emulating another software API, and in this case a hardware API, is a tricky business (see Apple Rosetta and Steam Proton).
Things don't always work, and when they do, they're not guaranteed to work well.
Ok_Run_101@reddit
Thanks, got it. Windows/Linux compatibility is a nightmare so I understand the pain.
It sucks that the AI GPU world is being cornered into a monopoly. Hopefully some innovations around CUDA compatibility come up in the future, in the software layer.
Coders_REACT_To_JS@reddit
Considering just how much money it could save/make, I bet this situation improves (or at least people try to improve it).
Ok_Run_101@reddit
knock on wood!
ChloeNow@reddit
100%
ATI (for youngin's: ATI was a separate company making the cards back in the day, before it merged into AMD) was a joke until they started getting their driver game on point.
They WERE always behind AF. Then their hardware started catching up and actually being faster in a lot of cases... it was just a matter of getting it to actually work. Meanwhile, Nvidia drivers just fucking worked, every time, flawless.
Now, though? There's a lot of complex shit going on that wasn't back in the day. Cards are more complex, and with that comes more potential for mess-ups, so as AMD moved towards better drivers, Nvidia was getting further from being able to make those 100% stable drivers.
The playing field almost leveled out until the AI race kicked up.
Funny enough... we've gone right back. AMD is faster than Nvidia in a lot of cases when it comes to AI... but good luck getting anything to run on it when all the major fundamental AI tools are heavily geared towards Nvidia CUDA specifically.
bitspace@reddit
"Does it CUDA?"
FinBenton@reddit
Idk what the support is right now, or whether this is even a real GPU you can buy, but considering 90% of the local models I use are Chinese, there will no doubt be support.
Initial-Swan6385@reddit
use claude code to program the software xd
Desperate_Echidna350@reddit
could it still be a viable option if you just want it for AI and maybe some light gaming?
keepthepace@reddit
Qwen is probably first in line; they already had CUDA-bypassing INT8 inference IIRC.
All the Chinese labs are going to be on it.
Pvt_Twinkietoes@reddit
R2 has been delayed because they want to train on Chinese chips right? Might be these.
Minato-Mirai-21@reddit
They have support for PyTorch, called torch-npu.
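Usage looks roughly like CUDA PyTorch with the device string swapped. A minimal sketch, assuming the CANN toolkit and the torch_npu package are installed:
```python
import torch
import torch_npu  # Huawei's PyTorch plugin for Ascend NPUs; registers the "npu" device

# minimal sanity check: run a half-precision matmul on the first Ascend device
x = torch.randn(1024, 1024, dtype=torch.float16).to("npu:0")
y = torch.randn(1024, 1024, dtype=torch.float16).to("npu:0")
z = x @ y          # computed on the NPU
print(z.device)    # npu:0
```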
No_Efficiency_1144@reddit
Wow, can you import it?
What FLOPS, though?
6uoz7fyybcec6h35@reddit
280 TOPS INT8 / 140 TFLOPS FP16
LPDDR4X 96GB / 48GB VRAM
No_Hornet_1227@reddit
So it's really shit.
6uoz7fyybcec6h35@reddit
LPDDR4X has a bandwidth limit, but for local LLMs VRAM capacity is much more essential.
hardcore_aebanise@reddit
why? that's a pretty big number considering the card's power draw.
No_Hornet_1227@reddit
The card does what, 200-300 TOPS? A 5090 does 3300. If the card used like 50W I would agree, but it doesn't.
LuciusCentauri@reddit
It's already on eBay for $4,000. Crazy how just importing doubled the price (not even sure if tax is included).
loyalekoinu88@reddit
On Alibaba it's around $1,240 with the sale. That's like a third of the imported price.
DistanceSolar1449@reddit
Here are the specs that everyone is interested in:
Huawei Atlas 300V Pro 48GB
https://e.huawei.com/cn/products/computing/ascend/atlas-300v-pro
48GB LPDDR4x at 204.8GB/s
140 TOPS INT8
70 TFLOPS FP16
Huawei Atlas 300i Duo 96GB
https://e.huawei.com/cn/products/computing/ascend/atlas-300i-duo
96GB or 48GB LPDDR4X at 408GB/s, supports ECC
280 TOPS INT8
140 TFLOPS FP16
PCIe Gen4.0 ×16 interface
Single PCIe slot (!)
150W
For reference the 3090 does 284 TOPS INT8, 71 TFLOPS FP16, and 936 GB/s memory bandwidth
Linux drivers:
https://support.huawei.com/enterprise/en/doc/EDOC1100349469/2645a51f/direct-installation-using-a-binary-file
https://support.huawei.com/enterprise/en/ascend-computing/ascend-hdk-pid-252764743/software
vLLM support seems slow, llama.cpp support seems better.
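As a back-of-envelope check of what that bandwidth means for inference (a rough sketch; the model size is an assumption, decode on big models is mostly memory-bandwidth-bound, and the two-chip aggregate may not behave like one pool):
```python
# rough upper bound on single-stream decode speed for a bandwidth-bound LLM:
# every generated token streams all active weights through memory once
bandwidth_gb_s = 408   # Atlas 300i Duo aggregate spec (2 x 204.8 GB/s)
weights_gb = 40        # e.g. a ~70B model at 4-bit quantization
print(bandwidth_gb_s / weights_gb, "tokens/s best case, before any overhead")
```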
helgur@reddit
Under half the memory bandwidth of the 3090. I wonder how this GPU stacks up against Metal GPUs on inference. It's going to be really interesting seeing tests come out with these.
Miserable-Dare5090@reddit
M Ultra chips have twice the bandwidth at ~800GB/s.
Immediate-Alfalfa409@reddit
Yeah exactly... the numbers only tell part of the story. Until there are real-world tests, it's hard to know if the extra VRAM actually makes up for the bandwidth gap. I'm just waiting to see how it holds up in practice.
Front_Eagle739@reddit
Yeah, should be interesting. It's in the ballpark of an M4 Max, I think, but with 5x the FP16 TOPS, so it should be better at prompt processing, which is the real weakness for most use cases. If the drivers and support are any good, I could see myself grabbing a couple of these.
helgur@reddit
That's a good point. Six of these cards is still half the price of an Apple M3 Mac Studio with 512GB of unified RAM. The Studio was, before this, the budget go-to for a lot of "VRAM" (in quotes because it's really unified RAM on the Mac) at a reasonabl(er) price. If the drivers for these are solid, it's really going to be an excellent contender for a lot of different usages.
orkutmuratyilmaz@reddit
I am a registered Apple enemy, and I'd like to add that the Mac Studio has some other hardware components, such as a CPU, RAM, storage, etc. We need to calculate the price difference more accurately.
Front_Eagle739@reddit
Yeah, if the assumptions above are accurate (which I doubt, but hey), you just need any old cheap server with a bunch of PCIe lanes and a riser to make a reasonably potent large-VRAM LLM box. Half the inference speed and 5x the prompt processing speed of an M3 Ultra for 7-8 grand ish would be much more usable for a lot of tasks than a fully specced M3 Ultra. And I say this as a guy with a 128GB M3 Max MacBook that I use all the time for local inference and think is great.
vancity-boi-in-tdot@reddit
And the post title hilariously compared this to the RTX Pro 6000...
Bandwidth: 1.6 TB/s, bus width: 512-bit, memory technology: GDDR7 SDRAM
LOL
And why not compare this to a 5090 instead of a 3090, which was released 5 years ago? Bandwidth: 1.7 TB/s
I give Huawei an A for effort. I give this post title and any Blackwell comparison an F.
sassydodo@reddit
Localllama is full of bitter haters man
DistanceSolar1449@reddit
Why are you comparing it to the 5090? This GPU was released in 2022.
https://support.huawei.com/enterprise/en/doc/EDOC1100285916?idPath=23710424%7C251366513%7C22892968%7C252309139%7C252823107
https://e.huawei.com/cn/products/computing/ascend/atlas-300i-duo
BetterEveryLeapYear@reddit
Then the post is even worse
vancity-boi-in-tdot@reddit
Yeah like the title "finally"
DistanceSolar1449@reddit
It's only hitting the used market now, that's why. You can't buy it new without an enterprise service contract.
jonydevidson@reddit
On current Macs the biggest problem is that when the context gets big, generation becomes stupidly slow.
redditorialy_retard@reddit
How hard is it to run one of these with a 3090?
DistanceSolar1449@reddit
Basically impossible. You can run one or more of them, but don't try to mix them with CUDA.
akierum@reddit
Why would it not work for Ollama?
DistanceSolar1449@reddit
Doesn’t have the correct backend installed
akierum@reddit
Then what about ramalama?
redditorialy_retard@reddit
welp 2x 3090s it is
o5mfiHTNsH748KVq@reddit
204.8GB/s ain't it
jrherita@reddit
I'm really curious how they achieved that FLOPS level with only 150W TDP. The 3090 is 350W, and granted it's on an older Samsung 8nm process, I don't think Huawei has access to a much better process than that for manufacturing.
I think the 3090 still draws about 250-300W to achieve its full TOPS rating. Some of that power is probably the faster memory bandwidth, but...
Django_McFly@reddit
Those specs don't seem bad. It's wild that the VRAM is a generation behind what you see in the AI Max 395 and Spark, but they got it running noticeably faster.
Suppe2000@reddit
I've only got a Strix Halo 395 Max. It is quite nice, but the driver support is hell. And the Huawei card costs the same as the mini PC and has double the bandwidth.
jonahatw@reddit
It's a good price, but know that you're buying surveillance with it too. I owned a Huawei phone during the Hong Kong democracy protests, and when I searched on the phone for where to donate, it gave me a bunch of pro-PRC links. When I searched on my desktop PC, the results matched my query much better. I now expect similar manipulation in anything Huawei outputs.
Important_Concept967@reddit
Can't be worse than Google lol.
pedroserapio@reddit
From your results and my searches on the Chinese Taobao, I can find second-hand 3090 24GB cards for a price very similar to a new Huawei Atlas, around 6000 RMB. This gives me mixed feelings.
Achrus@reddit
Is that 150W TDP correct?? That’s impressively low for those specs!
DistanceSolar1449@reddit
It's a single-slot PCIe card (pic), so yeah, I doubt they're stuffing 300W into it; that'd be impossible to cool.
terminoid_@reddit
oof, that bandwidth...happy to see any kind of competition tho
MargretTatchersParty@reddit
You're going to pay a really high tariff to get that into the country.
loyalekoinu88@reddit
Probably. I’m not advocating for buying it btw. Just showing that it is available for purchase and that it’s 1/3 the cost of eBay. Tariff costs also depend on when you purchase it since that number keeps going up and down these days.
Glittering-Call8746@reddit
Link ?
Puzzleheaded-Suit-67@reddit
You need to pay the import duty when it's from Alibaba.
loyalekoinu88@reddit
I don’t normally buy but don’t they usually show you the cost in the checkout?
Puzzleheaded-Suit-67@reddit
It depends. Normally on an e-commerce site you buy from a store, and they include it in the price because they have to pay it, since they are the importer. On Alibaba, if you buy directly from a manufacturer/supplier, you are the importer, so they don't pay the import duty, but it's not always the case.
kinja88@reddit
Alibaba link?
loyalekoinu88@reddit
https://www.alibaba.com/product-detail/New-Huaweis-Atlas-300I-DUO-96G_1601450236740.html
hmmqzaz@reddit
You think it’s anything like plug and play on a normal PC?
Girafferage@reddit
Lol definitely not. Getting drivers for it is going to be a massive nightmare.
layer4down@reddit
Early days.
LeBoulu777@reddit
I'm in Canada and ordering it from Alibaba is $2,050 CAD including shipping. 🙂✌️ God bless Canada! 🥳
Spectrum1523@reddit
only $2k CDN for a card with worse performance than a 3090, sweet
Shadowarchcat@reddit
Braindead comment. Performance of a 3090, yeah, but the memory of an A100, a €40,000 card.
Spectrum1523@reddit
I can buy 96GB of DDR for like $100, so what?
Shadowarchcat@reddit
Next braindead comment. Keep going bro, I need to see how dumb you can get.
Yes, you can buy RAM. But then you're on CPU, not GPU. If you want to run LLMs on CPU you won't need that 3090 either. And speed is no question anymore whatsoever at this point.
Spectrum1523@reddit
I guess I don't understand the use case for a card that inferences so slowly, with a bunch of slow RAM? It is more power-efficient than a PC, so that's good.
NoFunction4531@reddit
spectrum1523...yeah your nickname checks out .
Minute_Effect1807@reddit
96 GB VRAM. You can get a mac with this much memory, but it's going to be more expensive.
Classic-Sky5634@reddit
Please let us know how it performs. I would like to know if you can run Ollama or ComfyUI on it.
Amgadoz@reddit
Please do a lot of benchmarks and share the results!
Enjoy!
Necromancius@reddit
Should have bought a used 3090 instead...
Newtonip@reddit
There may be duty on top of that.
Yellow_The_White@reddit
Unrelated thought: I wonder how much I could get a second-hand narco sub for.
sersoniko@reddit
There are services where you pay more for shipping, but they re-route or re-package the item so that you avoid import fees.
Weary-Willow5126@reddit
Don't they open the packaging in the US to check if it's actually what's claimed?
loonygecko@reddit
The thing is, some dude at customs doesn't know the actual going cost of an electronic component, so you can label it as a very cheap chip and say it cost $10, and that guy is not going to know, nor have time to check. He'll see it's an electronic component as per the paperwork and pass it through.
Weary-Willow5126@reddit
Makes sense
I asked because that was also a thing (kinda still is tbh) here in Brazil, but the success rate is awful nowadays sadly; everything gets taxed.
Back in the day we could literally just ask the seller to write that it's under $50 on the package and it would never get taxed lol.
But they started opening everything and checking the prices online... Now you have to be very lucky, like a 1 in 10 chance it doesn't get taxed 😩
loonygecko@reddit
Sounds like Brazil has increased enforcement greatly; that sucks for you. However, IME US customs is mostly just looking for drugs or illegal items; they are not taking the time to really check on pricing issues. Until recently, we also had an $800 exemption from tariffs, so they are probably now super overworked trying to handle the tariffs on all those small purchases now that there is no exemption at all.
landon912@reddit
Yea, that’s called crime.
It doesn’t matter where you ship from. It’s the country of origin which matters
OfficialHashPanda@reddit
Nah, crime is called the president.
luciferxf@reddit
Probably because of tariffs. If they will even ship it. The USA is in the middle of a shipping embargo; more than 25 countries won't ship here. China is one of them.
loonygecko@reddit
That's the middleman for you: they buy it, double the price, and resell it.
markole@reddit
This is a common thing for a fair amount of us in Europe. So happy that these tariffs incentivized the development of cheaper and more capable domestic GPUs though. /s
cyrixlord@reddit
I think the Chinese tariff is over 20%, so...
agentzappo@reddit
~55% tariffs will continue to keep the price high by the time it's imported into the US.
meshreplacer@reddit
At $4K I will buy the Apple 128GB M4 certified Unix workstation, aka the Mac Studio, all day long. Free computer and OS included with every purchase.
farnoud@reddit
Thank Trump for that
HillTower160@reddit
So much winning.
_Sneaky_Bastard_@reddit
Glowing-Strelok-1986@reddit
Not really when you consider eBay's fees, international and domestic shipping, taxes and the fact that the seller wants enough profit to be worth the risk of being scammed.
rexum98@reddit
There are many Chinese forwarding services.
sourceholder@reddit
Oh how the tables have turned...
FaceDeer@reddit
The irony will be lovely as American companies try to smuggle mass quantities of Chinese GPUs into the country.
Recoil42@reddit
Already starting to happen with cars.
lawldoge@reddit
Must be a local thing. I drive 600-1000 miles/week and have yet to see a Chinese-branded car on the road.
Recoil42@reddit
Do you live in the United States? Chinese cars are effectively banned in the United States.
There's plenty everywhere else in the world though.
Barafu@reddit
Meanwhile, me in Russia, still thinking how to run an LLM on a bear.
loonygecko@reddit
I thought Russians could order from Alibaba and China.
Barafu@reddit
No, only AliExpress, and then not everything. Besides, it comes with explicitly no warranties: if I pay for a GPU, but receive a cinder block, it is totally my problem.
loonygecko@reddit
IDK about Russia, but for the USA the first return you want to make is free. Also, companies on there get reviews on their products and are motivated to keep their reviews high; check reviews before buying. It is similar to PayPal. Overall, I have had good experiences on AliExpress. However, ordering anything very expensive on the internet is always going to be a bit worrying.
arotaxOG@reddit
Strap a bear to a typewriter, comrade. Whenever a stinki westoid prompts a message, whatever the bear answers is right, because no one questions angry vodka bear.
Moist-Topic-370@reddit
LMAO, you have to be delusional.
FaceDeer@reddit
You're right, simpler to just move their operations out of America entirely.
Moist-Topic-370@reddit
Yes, of course. OpenAI, Google, etc. are going to leave the United States to use Chinese GPUs. You, sir or ma'am, have such a bright future in geopolitical strategy.
loonygecko@reddit
For now they are ahead, but the issue is that China is gaining ground rapidly and the USA continually fails at making chips in America. China is able to make products at a much lower cost point, so once they catch up in technology, that will be a huge problem for the USA. Even before that, many companies do not need the fastest chips, and if China can offer decent mid-speed chips at a much lower price point, that will take a lot of business from American companies. Also, the limits on global selling imposed on US companies are shrinking their global market share for the fastest chips, leaving the market wide open for the first alternative that can make a good-enough product.
It's like a horse race where the horse that used to be way in the back has caught up to the main pack and continues to gain on the frontrunner, and the frontrunner is starting to look a bit tired. Everyone is watching the closing gap between them and wondering if anything else will happen to change the situation; otherwise the frontrunner will change in the coming years.
FaceDeer@reddit
You think those are the only companies using AI?
3000LettersOfMarque@reddit
Huawei might be difficult to get in the US, given that in the first term they were banned: base stations, network equipment, and most phones at the time were barred from import for use in cellular networks, for national security purposes.
Given AI is different yet similar, the door might be shut again for similar reasons, or just straight-up corruption.
Siddharta95@reddit
National security = Apple earnings
apodicity@reddit
I dare say it's time to make some apple pie.
Swimming_Drink_6890@reddit
Don't you just love how car theft rings can swipe cars and ship them overseas in a day and nobody can do anything, but try to import a car (or GPU) illegally and the hammer of God comes down on you? Makes me think they could stop the thefts if they wanted, but don't.
crantob@reddit
Smugglers avoid compliance with regulations. Regulations which ought to be better named what they are: "government interference".
When someone says 'that needs to be regulated', I now ask: "To exactly whom do you want to give the plenary power to interfere with voluntary transactions?"
apodicity@reddit
And yeah, they are government interference, obviously, but that doesn't make the interference automagically bad, unfair, undesirable, etc. If there were no government, people would eventually agree on an authority which does more or less the same thing. The devil is in the details; you can't make some kinda a priori judgement about stuff like this (unless u wanna be a crank).
Regulatory failures exist. So do market failures.
apodicity@reddit
Don't say "voluntary transactions" in this context unless u want people to look at u askance. Just say "transactions". An "involuntary transaction" is referred to as a "forced sale".
"A forced sale is an involuntary transaction in which the sale is based upon legal and not economic factors, such as a decree, execution, or something different than mere inability to maintain the property. If the sale is made for purely economic reasons, it is considered voluntary."
See what I mean?
https://www.law.cornell.edu/wex/forced_sale
Imaginary-Hour3190@reddit
Uh oh, he is starting to wake up, SHUT IT DOWN
MelodicRecognition7@reddit
Now think about why drugs are illegal and what would change if, for example, coke was legal. Except a few govt officials losing a huge gesheft from smuggling it, of course.
PsychologicalOne752@reddit
Every illegal export is now sponsored by Bitcoin, and the POTUS is more invested in Bitcoin than in the US dollar, so why would he want to stop illegal exports?
Bakoro@reddit
They can't stop the thefts, but they could stop the illegal international exports if they wanted to, and don't.
No_Hornet_1227@reddit
They could ban GPU exports to Singapore and Hong Kong for their part in helping China evade sanctions, and tell Jensen that if Nvidia keeps obviously closing its eyes to obvious sanction-evading GPU sales, Jensen himself will go to prison for many years... but they won't, because in America the rule of law doesn't exist when money is involved.
Suitable-Bar3654@reddit
Why stop it if it makes money? Besides, this isn't theft, it's a purchase.
Swimming_Drink_6890@reddit
I'm not sure what you mean.
hackeristi@reddit
That makes you wonder if Nvidia had intel that they were going to do this and told the US to ban them haha.
3000LettersOfMarque@reddit
It was way back in May of 2019. While checking the date, I looked up whether this card would be included. It is; the whole company is under trade sanctions for its connections to the Chinese military.
For US citizens or entities, it's basically treason to do any business buying this card, even from a 3rd party; for non-US entities, it complicates any future business with the US or with entities that conduct business with the US.
AnduriII@reddit
Luckily I'm not in the US 🤗
brutal_cat_slayer@reddit
At least for the US market, I think importing these is illegal.
NoForm5443@reddit
Which laws and from which country do you think you would be breaking?
MedicalScore3474@reddit
https://www.huaweicentral.com/us-imposing-stricter-rules-on-huawei-ai-chips-usage-worldwide/
US laws, and if they're as strict as they were with Huawei Ascend processors, you won't even be able to use them anywhere in the world if you're a US citizen.
PraxicalExperience@reddit
US laws block the export of certain chips to China, and block certain government users from using Huawei products, but as far as I'm aware there's nothing in place that would stop Joe Blow from importing as many of these as he'd like.
a_beautiful_rhind@reddit
Sounds difficult to enforce. I know their products can't be used in any government/infrastructure in the US.
If you try to import one, it could get seized by customs and that would be that.
Yellow_The_White@reddit
Anyone big enough for the scale to matter would be too big to hide. It would probably prevent Amazon from setting up a datacenter in Canada or something.
robbievega@reddit
I think it's the other way around? Selling them to China was forbidden. Those tariffs ain't gonna help though.
Ansible32@reddit
Huawei is embargoed, for no publicly stated reason. Though it's implied US intelligence services know for a fact that Huawei puts backdoors in their hardware, there have been no explanations of exactly how.
chinese__investor@reddit
JustFinishedBSG@reddit
Huawei is under sanctions
loyalekoinu88@reddit
Huawei's New Atlas 300i Duo 96G DeepSeek AI GPU Server Inference Card with Fan Cooler, Made in China, on Alibaba.com
Antique_Bit_1049@reddit
GDDR4?
anotheruser323@reddit
LPDDR4x
From their official website:
LPDDR4X 96GB or 48GB, total bandwidth 408GB/s, support for ECC
boissez@reddit
Hardly better than Strix Halo then (96GB at around 273 GB/s).
StevenSamAI@reddit
I'd say 408GB/s is significantly better.
Puzzlesolver01@reddit
It's a dual card, so 2 x 204GB/s. Let's hope the layers split well across the 2 x 48GB and the interconnect between the two GPUs is fast enough. Benchmark first before bragging, not the other way around, please.
vancity-boi-in-tdot@reddit
Compared to the 1.7 TB/s of an RTX Pro 6000 (and without the CUDA cores), which is what the post title compared against? Hmm.
Caffdy@reddit
Be 49% faster.
Call it "hardly better". smh
michaelsoft__binbows@reddit
Damn what is the bit width on this thing!
xugik1@reddit
Should be 768 bits. (768 bits x 4266 MT/s / 8 ≈ 408 GB/s)
MelodicRecognition7@reddit
No, see specs above: each chip has 200 GB/s LPDDR4X speed (8-channel x 3200 MHz?); 400 GB/s is the Frankenstein card with two separate graphics chips, each with its own memory chips.
Dgamax@reddit
2x 512bits I suppose?
Dgamax@reddit
LPDDR4X? Why? 😑 This is sooo slow.
UsernameAvaylable@reddit
Cause large and fast memory costs money.
BlueSwordM@reddit
LPDDR4X has a massive production surplus because of the older phone flagships that used it, plus some older phones still using it.
Still, bandwidth is quite decent at 3733-4266 MT/s.
Due_Tank_6976@reddit
You get a lot of it for cheap, that's why.
loyalekoinu88@reddit
Looks to be the case
YouDontSeemRight@reddit
Bandwidth?
Legitimate-Novel4734@reddit
No.
krste1point0@reddit
"Memory options include 48 GB or 96 GB LPDDR4X total on the Duo, with an aggregate memory bandwidth listed around 408 GB/s (effectively 2 × ~204 GB/s across the two processors). "
Legitimate-Novel4734@reddit
8-channel LPDDR4X only hits 68.3GB/s total at 4266MT/s. You best check your math and not buy into China's shit too much.
krste1point0@reddit
Huawei uses a much wider aggregate memory interface per chip, effectively 384 bits (24 lanes x 16-bit equivalent), which at 4266 MT/s per pin computes to 204.8 GB/s per chip, or 408GB/s for this GPU since it has 2 chips.
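The arithmetic, for anyone checking (using the figures claimed above):
```python
# back-of-envelope check of the claimed bandwidth figures
bus_bits = 24 * 16      # 24 lanes x 16-bit = 384-bit interface per chip
mt_per_s = 4266e6       # LPDDR4X transfer rate per pin
per_chip = bus_bits * mt_per_s / 8 / 1e9
print(per_chip)         # ~204.8 GB/s per chip; x2 chips = ~409.6 GB/s total
```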
Legitimate-Novel4734@reddit
Huh, Huawei went all out on bus width; I guess it's impressive on paper. But it's still useless for most people. You're stuck with custom forked Chinese builds of TensorFlow or PyTorch, no CUDA support, and worse-than-AMD driver support. That's a giant question mark a year down the line.
Mediocre-Method782@reddit
You need to learn to be more embarrassed about posting cope online.
Legitimate-Novel4734@reddit
You should go reee here: https://www.reddit.com/r/LocalLLaMA/comments/1n4wo0y/the_huawei_gpu_is_not_equivalent_to_an_rtx_6000/
Mediocre-Method782@reddit
You should stop larping about the childish drama of gods larping in the heavens.
Legitimate-Novel4734@reddit
I don't have an instagram and don't work for a corporation (well not a typical one anyway) so I still have no idea what you are talking about.
Legitimate-Novel4734@reddit
wut? are you drunk?
Mediocre-Method782@reddit
Corporations, you simple-minded Instagram fuckwit.
Legitimate-Novel4734@reddit
I don't care about manufacturing; I care about software support and capability. I do kind of like that the drivers and firmware are on GitHub; Nvidia can suck it on that. However, it's still going to suffer unless full architecture details are released so the drivers can be made correctly, and I mean FULL details. We haven't really had LPDDR4X on much of anything since 2021. It's a 65W card; good luck, your training will get there in the end, sure... in about 5 months.
The low power, outdated VRAM, and Huawei seemingly already staging to rely on open-source development mean a few things.
The driver support is gonna be shit. (See previous.)
This card that is supposed to break the monopoly? It isn't even on AMD's or Nvidia's radar. The DGX Spark is going to kill any part of this card that comes close to looking good when it releases. I guarantee you China's big boys aren't even looking at this card; this is an enthusiast compute card at BEST, for people who just can't get the real hardware imported.
CreativeDimension@reddit
BandNarrow?
NickCanCode@reddit
~400GB/s
shaq992@reddit
LPDDR4X
https://e.huawei.com/cn/products/computing/ascend/atlas-300i-duo
ttkciar@reddit
Interesting... compute performance about halfway between an MI60 and an MI100, at half the bandwidth, but oodles more memory.
Seems like it might be a good fit for MoE?
Thanks for the link!
shaq992@reddit
Found an english version
https://support.huawei.com/enterprise/en/doc/EDOC1100285916/426cffd9/about-this-document
gK_aMb@reddit
LPDDR4x
Rich_Repeat_22@reddit
hmm so a 10% faster AMD 395 with almost double the bandwidth 🤔
firewire_9000@reddit
150W? Looks like a low-power card with a lot of RAM.
sheepyowl@reddit
It's probably made for students who need to run a local AI (like DeepSeek) for research. For these things it will run slowly on weak hardware, but it won't run at all if the GPU doesn't have enough RAM.
Apparently AI is big in Chinese universities/colleges.
They also modify Nvidia GPUs for this purpose (they add A LOT of VRAM). You will be able to see it in GN's video once the copyright strike troll expires.
Swimming_Drink_6890@reddit
Typically cards are undervolted when running inference.
Caffdy@reddit
Amen to that; it's about damn time we start to see low-power inference devices.
OsakaSeafoodConcrn@reddit
What drivers/etc would you use to get this working with oobabooga/etc?
loyalekoinu88@reddit
No idea. I looked up inference engines and it looks like it might only be supported in vLLM, and only for certain models.
TheTerrasque@reddit
I guess llama.cpp with its CANN backend might work. The Atlas 300I Duo is mentioned as supported.
jrherita@reddit
It has FLOPS, yes...
arglarg@reddit
Single-slot, half-height half-length PCIe card
AI computing power: 140 TOPS INT8, 70 TFLOPS FP16
Memory: LPDDR4X 48 GB, total bandwidth 204.8 GB/s, supports ECC
CPU: 8 cores @ 1.9 GHz
Codec capabilities: H.264/H.265 hardware decoding, 128 channels of 1080p 30 FPS (or 16 channels of 3840x2160 60 FPS); H.264/H.265 hardware encoding, 24 channels of 1080p 30 FPS (or 3 channels of 4K 60 FPS); JPEG decoding capability 4K FPS, encoding capability 4K 192 FPS, maximum resolution 8192x8192
PCIe interface: PCIe x16 Gen4.0
Maximum power consumption: 72W
Operating temperature: 0°C~55°C (32°F~131°F)
Dimensions: 169.5mm (length) x 18.45mm (width) x 68.9mm (height)
w3bCraw1er@reddit
I don't think it will flop. It may take time, but it could be pretty solid competition for NVDA.
No_Efficiency_1144@reddit
Haha, FLOPS is how you measure the speed of a graphics card.
cnydox@reddit
140 TOPS INT8 / 70 TFLOPS FP16
shing3232@reddit
280 TOPS INT8
TheAIPU-guy@reddit
This is not a GPU though. It's an NPU.
GreatGatsby00@reddit
Can I order that on Alibaba?
nemuro87@reddit
Fucking finally, someone has woken up.
Unfortunately it's Huawei, and it has a history of inbuilt hardware spyware.
Hope Intel or AMD wakes up too.
CeFurkan@reddit (OP)
More companies will come too
Due_Investigator3288@reddit
People try to compress these things down to one parameter and are then surprised when their cheap hardware doesn't perform. Memory is only a small part of the problem. Huawei cards can't do double- and float-precision math. At that point even a Raspberry Pi can beat this card in some tasks. Even for training you still want 32 bits for some tasks; afaik the Huawei cards can't do that.
Nexter92@reddit
If it's the same performance as RTX 4090 speed with 96GB, what a banger
GreatBigJerk@reddit
It's not. It's considerably slower, doesn't have CUDA, and you are entirely beholden to whatever sketchy drivers they have.
There are YouTubers who have bought other Chinese cards to test them out, and drivers are generally the big problem.
Chinese hardware manufacturers usually only target and test on the hardware/software configs available in China. They mostly use the same stuff, but with weird quirks due to Chinese ownership and modification of a lot of what enters their country. Huawei has its own (Linux-based) OS, for example.
TheThoccnessMonster@reddit
And power consumption is generally also dog shit.
PlasticAngle@reddit
China is one of the few countries that doesn't give a fuck about power consumption, because they produce so much that they don't care.
8P8OoBz@reddit
*buy from Russia
FyreKZ@reddit
China overwhelmingly produces its own power lol, 9.4 trillion kWh last year. They don't need Russian oil.
cockerspanielhere@reddit
Yeah, that explains why China is at all-time highs of coal imports.
FyreKZ@reddit
Yep, they burn a lot of coal, but unlike the US they are clearly progressing towards a combination of nuclear and renewables rather than regressing.
I don't particularly like China and don't like defending them, but at least they are genuinely trying with renewables and have a long-term plan, rather than the US's fascistic flip-flopping.
cockerspanielhere@reddit
If they produced their own power, why would they IMPORT RECORD AMOUNTS OF COAL?
You're mixing reality with morals and wishes.
FyreKZ@reddit
Because domestic output rates were reduced and domestic prices increased, so it was cheaper to import.
The reality is that China is reducing its dependency on coal and natural gas in favour of renewables whereas other countries fail.
cockerspanielhere@reddit
Saying they don't "need" Russian oil ignores the nuances of energy security and China's large-scale crude oil and LNG imports (China is a significant oil and gas importer for its overall energy needs, though less so for electricity generation). That's as delulu as it gets.
China is a net oil and LNG importer for fuels, not for power. If you think energy equals electricity, you should go back to school. Even for an electricity fanboy, 56% of it is generated with coal.
koeless-dev@reddit
I might be wrong, but I interpret 8P8OoBz's comment as playing on the words "power hungry". Yes, we're talking about resources, not people, so a bit of a stretch, but hm. Just giving what I believe they are referring to.
cockerspanielhere@reddit
You're getting downvoted by ignorants. China's energy balance is -400 billion USD per year.
apodicity@reddit
Huh? That makes no sense. Power consumption is dictated by market incentives.
chlebseby@reddit
Does this rule apply to computer equipment or to products in general?
I have a lot of Chinese devices and they seem to have typical power draw.
sheepyowl@reddit
If it's a product made for global competition it should be fine.
If it's a product made for China, think twice
emprahsFury@reddit
China is energy constrained.
twavisdegwet@reddit
https://fortune.com/2025/08/14/data-centers-china-grid-us-infrastructure/
stoppableDissolution@reddit
Which is the way.
shing3232@reddit
about 150W max
TheThoccnessMonster@reddit
I meant the Huawei data center cards.
LettuceElectronic995@reddit
this is huawei, not some shitty obscure brand.
GreatBigJerk@reddit
Sure, but they're not really known for consumer GPUs. It's like buying an oven made by Apple: it would probably be fine, but in no way competitive with the industry experts.
hydraulix989@reddit
They once said that about Apple's brand new phone at the time.
GreatBigJerk@reddit
Sure, and I hope Huawei does well. They probably will make a good consumer GPU eventually.
This isn't one due to the memory bandwidth though.
somepotato5@reddit
I mean, we did start buying phones from Apple 2 decades ago, and look what happened.
LettuceElectronic995@reddit
lets see
ChloeNow@reddit
I give it 6 months before the drivers are up to par, a year before it's basically the equivalent of AMD, and a year and a half before it's on par with Nvidia, AI support and all.
I would say longer for AI support, but I see a lot of open-source tools have started loosely supporting (or trying to support) AMD, and once all these apps have modularized the parts of the code dealing with specific hardware architectures, supporting a new card will be much faster than that initial battle.
pier4r@reddit
What blows my mind, or rather punctures the AI hype, is exactly the software advantage of some products.
Given the hype around LLMs, it feels like (large) companies should be able to create a user-friendly software stack in a few months (to a year) and close the SW gap to Nvidia.
CUDA's years of head start created a lot of tools, documentation and integrations (i.e. PyTorch and whatnot) that give Nvidia the advantage.
With LLMs (with the LLM hype, that is), one should in theory be able to close the gap a lot faster.
And yet the reality is that neither AMD nor others (who have spent even less time on the matter than AMD) can close that gap quickly. This while AMD and the Chinese firms aren't exactly lacking the resources to use LLMs. Hence LLMs are useful, but not yet that powerful.
Pruzter@reddit
lol, if LLMs could recreate something like CUDA we would be living in the golden age of humanity, a post-scarcity world. We are nowhere near this point.
LLMs struggle with maintaining contextual awareness for even a medium-sized project in a high-level programming language like Python or JS. They are great for helping write small portions of your program in lower-level languages, but the lower-level the language, the more complex and layered the interdependencies of the program become. This translates into requiring even more contextual awareness to program effectively. AKA, we are a long way off from LLMs being able to recreate something like CUDA without an absurd number of human engineering hours.
wuu73@reddit
ummmm... did you know that they literally solved protein folding? they've invented new medications, but i mean, go ahead and think that, more compute for me lol
Pruzter@reddit
I am like the most pro-LLM person there is; I burn millions of tokens a day using these things for programming. That's how I know exactly where the current leverage and pain points are.
pier4r@reddit
I am not saying they'd do it on their own, like AGI/ASI.
Rather that they could help devs so much that the devs speed up and narrow the gap. But that doesn't happen either. So LLMs are helpful, but not that powerful. As you put it well, as soon as the code becomes tangled in dependencies, they cannot handle it, even if it fits their context window.
AnExoticLlama@reddit
I believe they were referring to the LLM hype = using it to fund devs with the purpose of furthering something like Vulkan to match CUDA.
Lissanro@reddit
Current LLMs are helpful, but not quite there yet to help much with low-level work like writing drivers or other complex software, let alone hardware.
I work with LLMs daily, and know from experience that even the best models out there in both the thinking and non-thinking categories, like V3.1 or K2, make not just silly mistakes, but struggle to notice and overcome them even when pointed out. Even worse, when there are many mistakes that form a pattern, they are more likely to make more mistakes like that than to learn (through in-context learning) to avoid them, and, likely from overconfidence, they often cannot produce good feedback about their own mistakes, so an agentic approach cannot solve the problem either, even though it helps mitigate it to some extent.
The point is, current AI cannot yet easily "reduce the gap" in cases like this; it can improve productivity, though, if used right.
No_Hornet_1227@reddit
Yup, my brother works at a top AI company in Canada, and a ton of companies come to see them to install AI at their company... and basically all the clients are like: we can fire everyone, the AI is gonna do all the work! My bro is like: you guys are so very wrong, the AI we're installing that you want so much isn't even CLOSE to what you think it does... we've warned you about it... but you want it anyway, so... we're doing it, but you'll see.
Then a few weeks/months later, the companies come back and are like: yeah, these AIs are kinda useless, so we had to re-hire all the people we fired... My bro is like: no shit, we told you, but you wouldn't believe us!
A lot of rich assholes in control have watched The Matrix too many times and think that's what AI is right now... Microsoft, Google and all the big corporations firing thousands of employees to focus on AI? The same blowback is gonna happen to them.
Sabin_Stargem@reddit
Much as I like AI, it isn't fit for prime time. You would think that people wealthy enough to own a company would try out AI themselves and learn whether it is fit for purpose.
TheTerrasque@reddit
Yep. When AI became popular I really looked into providing local AI inference for businesses, but I realized that where the tech actually was and where people thought it was were too far apart, and it would be a catastrophe.
Sabin_Stargem@reddit
Hypothetical: having 2+ different AI models from different families might be able to correct each other, because they see different aspects of their output. That would require much stronger hardware to run multi-core...(multi-mind?) AI.
Hopefully we can see whether this works within a decade or so.
pier4r@reddit
And I am talking mostly about this. Surely AMD (and other) devs can use it productively and thus narrow the gap, yet it is not as fantastic as it is sold. That was my point.
TheTerrasque@reddit
What I've noticed is that the more technical the code, the more terrible the LLM. It's great and very strong when I'm writing something in a new language I'm learning, and it can explain things pretty well.
Getting it to help me debug something in languages I've had years of experience in? It's pretty useless.
I'm guessing "join hardware and software to replicate a cutting-edge, super complex system" with LLMs will at best be an exercise in frustration.
Ensiferum@reddit
"In a few months to a year"
My man has clearly never worked in enterprise level software development. Whatever you think the complexity of such a project is, multiply it by 5.
pier4r@reddit
I do. That's the point. If the hype were real, "augmented" devs under pressure from management could narrow the gap in a short time. Maybe a year is short; then two.
The point being, the hype is far from reality.
I thought it was clear that I was shrinking the needed time by a lot to match the hype and disprove it.
yogthos@reddit
there's work being done here already https://www.tomshardware.com/pc-components/gpus/chinas-moore-threads-polishes-homegrown-cuda-alternative-musa-supports-porting-cuda-code-using-musify-toolkit
BusRevolutionary9893@reddit
There are also Chinese hardware manufacturers like Bambu Lab who basically brought the iPhone equivalent of a 3D printer to the masses. Children can download and print whatever they want right from their phones. From hardware to software, it's an entirely seamless experience.
GreatBigJerk@reddit
That's a piece of consumer electronics, different from a GPU.
A GPU requires drivers that need to be tested on an obscene number of hardware combos to hammer out the bugs and performance issues.
Also, I have a Bambu printer that was dead for several months because of the heatbed recall, so it's not been completely smooth.
BusRevolutionary9893@reddit
Um... a consumer GPU is a consumer electronic. You're right that GPU drivers will take more testing on different hardware configurations, but there is also a ton more money to be made with GPUs than 3D printers.
I've never had an issue with my P1S and two AMSes. You're not giving them their due credit for changing the market from hobbyists who like tinkering with 3D printers to grandmothers with no technical experience being able to make crafts to sell at fairs.
GreatBigJerk@reddit
I didn't say the printer was bad... I love mine. That's why I waited for a replacement bed instead of getting a refund. It's just not flawless.
Anyway it seems like we agree on the GPU thing.
wektor420@reddit
Still having enough memory with shit support is better for running llms than nvidia card without enough vram
No_Hornet_1227@reddit
AMD is planning to give consumers a very strong APU with basically unlimited LPDDR5X/6 for AI next year or in 2027. Then everyone will be able to do AI at home for a good price.
But yeah, AMD/Intel should wreck Nvidia and release sub-$1500 48GB VRAM AI GPUs... whoever does that first will make a ton of money.
wektor420@reddit
And then you remember that AMD's CEO is a cousin of Nvidia's CEO.
And Intel is in deep trouble.
ifupred@reddit
Let them cook. Monopoly is horrible
simracerman@reddit
Care to share some sources?
GreatBigJerk@reddit
I don't have time to hunt around, but here's a Tech Tips video from a couple years ago: https://youtu.be/YGhfy3om9Ok
simracerman@reddit
LOL. Doesn't have time to hunt around, but pastes a 2-year-old video in an industry that transforms monthly.
Don't have time for bogus allegations too.
Suitable-Economy-346@reddit
Are they not open source?
Emergency-Author-744@reddit
Based on specs it looks more like a 3090 on the compute side, and about half the speed on bandwidth.
am0x@reddit
Hardware is one thing, but the software underneath is what brings it to life.
Uncle___Marty@reddit
And for less than $100. This seems too good to be true?
TechySpecky@reddit
? Doesn't it say 13500 yuan which is ~1900 USD
ennuiro@reddit
seen a few for 9500 RMB which is 1350USD or so on the 96gb model
Glittering-Call8746@reddit
Where ?
Uncle___Marty@reddit
Yep, you're right. For some stupid reason I got Yen and Yuan mixed up. Appreciate the correction.
Still, a 96 gig card for that much is so sweet. I'm just concerned about the initial reports from some of the Chinese labs using them that they're somewhat problematic. REALLY hope that gets sorted out, as Nvidia pwning the market is getting old and stale.
Sufficient-Past-9722@reddit
Fwiw it's the same word, like crowns & koruna, rupees and rupiah etc.
mintybadgerme@reddit
Er no it's not. Yen is Japanese. Yuan is Chinese. Two different countries, see? :)
Sufficient-Past-9722@reddit
Haha ok depending on the quant level, same token €:
mintybadgerme@reddit
Wow, slop.
Shiftiez@reddit
¥
Uncle___Marty@reddit
*learned something new today*
Cheers buddy :)
Ansible32@reddit
This isn't all that far from the Nvidia Jetson Thor which is only $3500.
TheRealMasonMac@reddit
Probably misread it as Yen.
smayonak@reddit
Am I reading your comment too literally or did I miss a meme or something? This is Chinese Yuan not Japanese yen, unfortunately. 13,500 Yuan is less than $2,000 USD, but importer fees will easily jack this up over $2,000.
Uncle___Marty@reddit
Nah, you read it correct. I'm wearing my stupid hat today apparently. Appreciate the good people of the sub correcting me :)
smayonak@reddit
Darn, I was hoping I was wrong or something. The Mi50 can be found for like $150 to $180, depending on the vendor and VRAM config, and those things have like 32GB. The Atlas card only uses 150 watts, so $100 isn't completely unrealistic if we only use VRAM as the benchmark for prices.
LatentSpaceLeaper@reddit
It's ¥13,500, so just below $1,900.
HarambeTenSei@reddit
it very likely isn't. Even the deepseek people couldn't get their models to train on these cards
Euphoric_Oneness@reddit
It's fake news.
ennuiro@reddit
Around 75% of a 4090 on Llama 3 8B, from what I've seen.
stoppableDissolution@reddit
Still better than CPU, and potentially better than a 3090. Probably a lot of driver/software inefficiencies too.
And tbh, 75% of a 4090 with 4x the VRAM in one PCIe slot is a great deal anyway. Hope there will be proper driver support.
fallingdowndizzyvr@reddit
It's not the same speed as the 4090. Why do you even think it is?
Hour_Bit_5183@reddit
They are smarter and bolder. I love China! The only reason 'murica is so mad is because we can't make shit without monopolizing and cucking... meanwhile they beat us in every category due to educating their people. There are like a billion trade classes you can take there, meanwhile we have BS institutes like Harvard and the like that prioritize sports, which is really just gambling! FAIL!!!! I wish I were born in Canada or anywhere else really.
apodicity@reddit
Huh? Look, your nationality says nothing about how intelligent you are or anything else. China does NOT beat "us" in every category. Wtf are you talking about? Compared to the US, their pharmaceutical development is pathetic, for instance. Prioritize sports? Right. Harvard has over 10x the value in research facilities as in athletic facilities.
https://finance.harvard.edu/files/fad/files/fy24_harvard_financial_report.pdf
Hour_Bit_5183@reddit
LOL you used big pharma as an example! You mean the one that destroys lives and causes more problems than it solves? LOLOL. They have way more schools for different things, and more in general. They do best us. Don't ever link me to scam universities ever again either! Don't even get me started about them.
fallingdowndizzyvr@reddit
Finally? The 300I has been available for a while. It even has llama.cpp support.
https://github.com/ggml-org/llama.cpp/blob/master/docs/backend/CANN.md
profcuck@reddit
Not with 96gb though, right? That's the interesting development here as I understand it.
fallingdowndizzyvr@reddit
In China, they were talking about the 96GB model a few months ago. Also, it's $1500 on AE, not $2000.
But even at $1500, you are better off getting a Max+ 395. For $1500 you can get a 96GB Max+ 395 with a modern CPU/GPU. That will perform much better. It's also a GPU, so you can play games on it. This is an NPU, and thus what you can do with it is much more limited.
Zyj@reddit
The AI Max+ 395 has half the memory bandwidth and with this card you can stick 6-7 of them onto a Threadripper Pro mainboard.
fallingdowndizzyvr@reddit
The Max+ 395 has more memory bandwidth. This is a dual card. It has 2 GPUs at 204GB/s each. Contrary to the spec sheet claim, that does not add up to 408GB/s. Also, the Max+ 395 has more compute.
And you can daisy chain a pile of Max+ 395s through USB4.
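For what it's worth, multi-box inference is already a solved problem in llama.cpp via its RPC backend, so a pile of networked machines can serve one model. A minimal sketch, assuming a build with -DGGML_RPC=ON; the IPs, port, and model path are made up:

    # on each worker machine
    ./build/bin/rpc-server --host 0.0.0.0 --port 50052
    # on the main machine, spreading layers across the workers
    ./build/bin/llama-cli -m model.gguf -ngl 99 --rpc 192.168.1.10:50052,192.168.1.11:50052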
Zyj@reddit
The Ryzen has 50 TOPS int8, this has 280 TOPS
fallingdowndizzyvr@reddit
INT8? What software are you running LLMs on that uses INT8? It uses FP16, FP8 and, on newer GPUs, FP4.
Also, did you typo and put a 0 onto the end of that? According to Huawei, the 310I has 22 TOPS INT8.
"up to 22 TOPS INT8"
https://support.huawei.com/enterprise/en/doc/EDOC1100079295/3656aeb1/performance
metallicamax@reddit
https://e.huawei.com/cn/products/computing/ascend/atlas-300i-duo
pjakma@reddit
So what is the software support for these? Are there open-source drivers and libraries for these GPUs?
ttkciar@reddit
Yes, PyTorch and llama.cpp support the Ascend NPU.
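As a rough sketch of what the PyTorch side looks like: Ascend support comes through Huawei's out-of-tree torch_npu plugin rather than stock PyTorch, so usage is roughly this (assuming the plugin and CANN toolkit are installed; the shapes are just illustrative):

    import torch
    import torch_npu  # Huawei's plugin; registers the "npu" device type

    x = torch.randn(2, 3).to("npu:0")  # move a tensor onto the Ascend card
    y = (x @ x.T).cpu()                # matmul runs on the NPU, result copied back
    print(y)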
pjakma@reddit
Interesting. What's the underlying driver? Which library supports the hardware? I.e., what is the programming spec for the hardware?
ttkciar@reddit
CANN
JeffDunham911@reddit
Companies from China really know how to humble companies that sell overpriced shit in the US, if anything.
apodicity@reddit
That depends on the product. It's that simple.
AncientProduce@reddit
Uh-huh, you honestly think this thing's legit? Or better than an existing card?
Do you swear by Palit reliability?
Low_Cow_6208@reddit
Fk China. The only thing I like less than our bastard monopolist is a totalitarian government monopoly that will betray you the second it achieves its political goals.
apodicity@reddit
It doesn't matter wtf they do domestically from an international perspective because other players are always free to enter the market. People will keep buying from China so long as it's cheaper.
lightningroood@reddit
meanwhile the chinese are busy smuggling nvidia gpus
Prinzmegaherz@reddit
Didn’t Trump need to lift those export controls after the lost trade war?
Purple_Errand@reddit
Since when did the USA lose a trade war? Lmao.
People really just read what's at the top but never actually search. I'm not from the USA, but stop riding the same waves, brother.
apodicity@reddit
Trade wars don't have winners.
__some__guy@reddit
It's apparently 2 GPUs with 204 GB/s memory bandwidth each.
Pretty terrible, and even Strix Halo is better, but it's a start.
Ilovekittens345@reddit
I remember the time when China would copy Western drone designs and all their drones sucked! Cheap bullshit that did not work. Complete ripoffs. Then 15 years later, after learning everything there was to learn, they lead the market and 95% of drone parts are made in China.
The same will eventually happen with GPUs, but it might take another 10 years.
purpledollar@reddit
I think we’re past the point where it’s gonna take china 10 years to catch up. In fact I think we’re gonna see the roles reverse soon. Just look at their auto industry.
Ilovekittens345@reddit
I am specifically talking about surpassing Nvidia, which probably means coming up with an ecosystem that does not run on CUDA. That will take some time. More than 5 years. Plus, if companies like ASML in the Netherlands decide to cut off China for some geopolitical reason, China will have a problem. A solvable problem, but a problem nonetheless.
apodicity@reddit
Yup. Ultimately, it's mostly a matter of time. Well, at least that's how this stuff has usually played out in the past. Like IBM in the early days of the PC market.
ChloeNow@reddit
You can also, in each industry, see a point where a cheap product improves to a "good enough" point that it becomes a bit ridiculous to buy the more expensive one given the price difference.
Pepeshpe@reddit
Good on them for not giving a crap about patents or any other bullshit.
Emergency_Beat8198@reddit
I feel Nvidia has captured the market because of CUDA, not because of the GPUs.
Tai9ch@reddit
CUDA is a wall, but the fact that nobody else has shipped competitive cards at a reasonable price in reasonable quantities is what's prevented anyone from fully knocking down that wall.
Today, llama.cpp (and some others) works well enough with Vulkan that if anyone can ship hardware that supports Vulkan with good price and availability in the >64GB VRAM segment, CUDA will stop mattering within a year or so.
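For anyone who wants to check that claim themselves, the Vulkan path in llama.cpp is a two-liner these days. A sketch, assuming a Vulkan-capable driver is installed and model.gguf is a placeholder:

    cmake -B build -DGGML_VULKAN=ON && cmake --build build --config Release
    ./build/bin/llama-cli -m model.gguf -ngl 99   # -ngl 99 offloads all layers to the GPU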
gmdtrn@reddit
Many would argue that AMD does already. Not this good, but indeed the issue is CUDA. That edge NVIDIA has now won't last forever, though. And I hope it's sooner rather than later that they have to start playing nice.
Tai9ch@reddit
For ML, AMD doesn't really have any competitive offerings until you get up to the new server stuff (e.g. Radeon Instinct 300 series).
I really want their stuff to be good. I've been looking for a decent new ML card that isn't Nvidia for the past year. But AMD just won't do it. They won't significantly beat Nvidia on RAM. They won't break from their "$50 cheaper than Nvidia" price curve. And they won't take compute drivers seriously for gaming cards.
And AMD has the further problem that they're really competing not with new Nvidia cards but with used Nvidia cards. Would you rather have an RTX 3090 for $800 or a Radeon 7900 XTX for $1000? They've got the same VRAM, and the cheaper one supports CUDA while the other one doesn't.
The server card market almost seems worse. AMD has several old cards that don't even seem to exist used, presumably because they never really existed or sold new.
Galactic_Neighbour@reddit
To be fair, AMD is sometimes cheaper or sometimes has more VRAM, but sometimes it's both. The RX 7900 XTX had 24 GB while its competitor at the time RTX 4080 only had 16 GB. This has been a thing for a few generations now, but I feel like hardly anyone ever notices. In the current generation RX 9070 has 16 GB, but RTX 5070 only has 12 GB (and it's also more expensive and uses more power). But yeah, the software support in ROCm is terrible sometimes and the progress is slow. The RX 9070 is probably still not fully supported.
You can probably buy a used RX 7900 XTX too and it has a similar performance to RTX 4080. In games that is. In AI I have no idea, since AMD doesn't publish any benchmarks as far as I know and the popular tech media aren't competent in this stuff. And that's a part of the problem. People think that AMD cards don't work in AI or that they are very bad. I run LLMs and diffusion models just fine, though.
wrongburger@reddit
For inference? Sure. But for training you'd need it to be supported by pytorch too no?
EricForce@reddit
99% of the time, a person getting into AI only wants inference. If you want to train, you either build a $100,000 cluster or you spend a week fine-tuning, where the bottleneck is the VRAM you already have, and I don't remember seeing any driver requirements for fine-tuning other than for the bleeding-edge methods. But someone can correct me if I'm wrong.
Tai9ch@reddit
If there were something like a PCIe AMD MI300 for $1700 but it only supported Vulkan we'd see Vulkan support for Pytorch real fast.
lodg1111@reddit
Yup, the AMD Instinct MI250 has 128GB of RAM. No one is paying attention to it.
ttkciar@reddit
Yes, because it requires an OAM bus interface. The MI210 is the last Instinct with a PCIe interface.
keepthepace@reddit
Yeah, no.
They have the fastest GPUs out there with the most VRAM. CUDA is a very shallow moat.
LLMs do not require a ton of complicated optimizations, especially if you are only targeting inference. There is so much need for it, FOSS will produce it in 2 weeks.
XeroVespasian@reddit
Hundreds of billions so far... not bad for a shallow moat...
keepthepace@reddit
You can spend billions on things that are not moats
Conscious_Nobody9571@reddit
I wrote this the other day and my comment got downvoted by losers
fallingdowndizzyvr@reddit
CUDA is just a software API. Without the fastest hardware GPU to back it up, it means nothing.
Khipu28@reddit
If it's "just" software then go build it yourself. It's not "just" the language; there is matching firmware, driver, runtime, libraries, debugger and profiler. And any one of those things will take time to develop.
fallingdowndizzyvr@reddit
And that stuff is developed. Why do you think CUDA is unique? There are plenty of APIs. In this case, there's CANN.
Again, it's the hardware GPU that made Nvidia what it is. Not CUDA. There are plenty of alternatives to CUDA.
Khipu28@reddit
If it's so easy to develop, then why does everyone struggle with it? Profilers, for example: AMD builds very potent hardware, in many aspects better than Nvidia's, but they fail to deliver in certain areas because there are no good profilers. AMD heavily relies on Sony to fill that gap for the games sector, but there is nothing comparable available for AI devs. Building that software ecosystem is hard and it takes years to catch up.
fallingdowndizzyvr@reddit
Who struggles? Again, there are plenty of APIs. You are mistaken in thinking that Nvidia is fast because of CUDA. Nvidia is fast because they make great hardware.
No. AMD, and Intel for that matter, have good hardware claims on paper. But in reality, they fail to realize them.
Ah... what? AMD, a hardware manufacturer, relies on Sony, a game maker, to make games? Ah... yeah....
Khipu28@reddit
Software is not written, it is debugged! And fast software is not fast because of the hardware; it is fast because it was profiled. There are only a handful of programmers on this planet who have experience writing profilers and debuggers at the level needed to compete. Sony has that talent because they attract a certain type of software engineer, and Nvidia's CUDA profiler is okay because they invested a lot of money over the years to get that talent.
fallingdowndizzyvr@reddit
Games are written by game developers. Not hardware makers. Sony is a game studio. AMD is not.
ComNguoi@reddit
I dont think you get what he is saying at all lmao
fallingdowndizzyvr@reddit
Then he's not saying it right.
Cold_Specialist_3656@reddit
Bingo.
AMD cards that cost 1/3 as much are just as powerful. But Nvidia has been perfecting CUDA for 20 years.
Awkward-Candle-4977@reddit
AI uses TensorRT on the tensor cores. Nvidia is able to make multi-GPU clusters (really) work over LAN.
night0x63@reddit
Lots of Apple MLX people with full support for AI stuff even though it's a small market lol. So there's hope. Specifically the $10k Apple 512GB machines.
knight_raider@reddit
Spot on, and that is why AMD could never put up a fight. The Chinese developers may find the cycles to optimize it for their use case. So let's see how this goes.
Salty_Flow7358@reddit
100%. But we shouldn't lose hope.
Phyzzx@reddit
It's good news that we have another player in the market like NBC.
recoverygarde@reddit
No point with M4 (current) and M5 macs about to drop
paul_tu@reddit
I wonder what software stacks it supports.
Need to check.
ttkciar@reddit
Pytorch and llama.cpp support it (Ascend NPU).
Laxarus@reddit
need to see the benchmarks
AdamScot_t@reddit
Good brand, Huawei... but I'm afraid it doesn't seem like it will stay cheaper; rather, they will make it just as expensive as Nvidia.
catjewsus@reddit
What's the performance like though... even if there's tons of VRAM, if the actual card is slow then it's not very competitive, is it...
rdnkjdi@reddit
How much of a monopoly does Nvidia have on inference right now? It seems like everyone but Grok and Meta has their own inference TPU-type tech, just not for training.
Which is insane with Nvidia beating earnings again
Weary-Wing-6806@reddit
These Huawei Atlas 300i Duo cards aren’t new tho. They’re 2022 datacenter pulls with 96GB LPDDR4x and low bandwidth. They’re fine for cheap inference where memory matters more than speed, but nowhere near a 4090 for performance or training. The bigger problem is software... CUDA dominates, while Huawei’s stack is still rough with driver issues and limited support. They look cheap on Alibaba, but imports get messy and prices double on eBay. Basically, lots of VRAM for little money, but you trade off speed and stability.
Unlikely_Ad1890@reddit
monopoly /mə-nŏp′ə-lē/ noun Exclusive control by one group of the means of producing or selling a commodity or service.
The GPU market is a group of companies that work together to line their pockets, they'd be considered an oligopoly
Western_Building_880@reddit
Watch the US put tariffs on them.
TheL0ckman@reddit
Hey, that way you'll also be able to complain later that it's just a rebadged older device.
ProjectPhysX@reddit
This is a dual-CPU card: 2x 16-core CPUs, each with 48GB of dog-slow LPDDR4X @ 204 GB/s, and some AI acceleration hardware. $2000 is still super overpriced for this.
Nvidia RTX Pro 6000 is a single GPU with 96GB GDDR7 @ 1.8 TB/s, a whole different ballpark.
PraxicalExperience@reddit
The RTX Pro 6000 is also 4x the price...
Pulselovve@reddit
What is the expected performance with gpt-oss?
rotatingphasor@reddit
What software stack does this work on? I imagine it'd be difficult to get it working on things like Pytorch.
Actually just checked
https://pytorch.org/blog/huawei-joins-pytorch/
Seems like it's been something they've been working on for a while.
rotatingphasor@reddit
Memory doesn't mean much if it's slow (and doesn't have great software). I'd be curious about performance.
Tenxlenx@reddit
Ok but does it run CUDA?😅
horendus@reddit
Nvidia's massive margins will be eroded over the next decade, that's for sure.
CeFurkan@reddit (OP)
Yep i agree
R_Duncan@reddit
Any performance comparison with 2x 3090 and 2x 5080/5090?
SaleAffectionate4314@reddit
Only for inferencing?
rail_hail@reddit
Besides PaddlePaddle, what else can you run on this?
FPham@reddit
I think I mentioned in my LoRA training book that at some point we will be smuggling GPUs from China, and here we are a month later.
Comprehensive_Ad5647@reddit
Wait for Trump to protect the Nvidia monopoly with 100x tariffs.
Metrox_a@reddit
Now they just need driver support, or it's useless.
AmIDumbOrSmart@reddit
Yup. Even if you get through 2 pages of command lines and install the driver on Linux, the CANN branch of llama.cpp doesn't even support GLM 4.5 yet (it has support up to 4, though). It also looks like there are some issues with that branch. I'll buy this when the support and software are actually there. Not holding the bag today. It would be cool if some people bought these to finetune on; even if they're bad for it, it's probably the best bang for the buck, and someone with more technical know-how may do well with them. Not me, though.
HugoCortell@reddit
They don't. It does not run on Windows, nor does llama.cpp support CANN.
This is literally like AMD's AI offering (same price, and with better specs, if I recall): it's cheap for the consumer because it's not really all that useful outside of Linux enterprise server racks.
NickCanCode@reddit
Of course they have driver support (in Chinese?). How long it takes to catch up and support new models is another question.
KaleidoscopeOk3416@reddit
at most 1 year
Ratiofarming@reddit
Yeah, but it's considerably slower. Not that I hate the development (more GPUs -> more better), but comparing this with Nvidia is just bs. An RTX 6000 Pro Blackwell eats this thing for breakfast when it comes to speed; it's not even close. Let alone software support, which is a huge issue with these things.
nickpsecurity@reddit
Don't forget Tenstorrent Blackhole cards. They claim A100 performance at $999. You can also put many in a machine.
pacificdivide@reddit
I've been closely following all of China's GPU strategies; check out russwilcoxdata.substack.com
Beginning-Art7858@reddit
What kind of llm could this run? If you wanted to stay local and didn't care if it was super fast?
Potential-Leg-639@reddit
But does it run Crysis?
burheisenberg@reddit
Nvidia has CUDA for GPU computing. Do these GPUs have libraries and support for usability? What about compatibility? IMO, it does not make sense to buy one of those.
Emergency-Author-744@reddit
Supported in llama.cpp via CANN: https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#cann
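Per those docs, building the CANN backend looks roughly like this (a sketch, assuming the Ascend driver and CANN toolkit are already installed; the model path is a placeholder):

    cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Release && cmake --build build
    ./build/bin/llama-cli -m model.gguf -ngl 32   # offload layers to the Ascend NPU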
burheisenberg@reddit
So it cannot be used for general DL tasks, but it can be dedicated to LLM projects. Better than nothing.
M3GaPrincess@reddit
Deepseek already publicly declared that these cards aren't good enough for them. https://www.artificialintelligence-news.com/news/deepseek-reverts-nvidia-r2-model-huawei-ai-chip-fails/
The Atlas uses 3 Ascend processors, which Deepseek says are useless.
Cuplike@reddit
They still use them for inference, which is what most people here would use them for as well, and a new report just came out stating they use them for training smaller models.
KefkaFollower@reddit
That's what I was wondering. Not good enough for inference, or not good enough for training?
Thanks for the clarification.
M3GaPrincess@reddit
"New report" = some propaganda. I guarantee you it won't run llama.cpp, tensorflor or pytorch.
It's crapware.
NickCanCode@reddit
Just tell me how these cards do compared to the AMD 128GB Ryzen AI Max, which is roughly the same price but comes as a complete PC with AMD's software stack.
Emergency-Author-744@reddit
Basic spec comparison, Atlas 300I Duo vs AMD Ryzen AI Max+ 395:
- Bandwidth: 408 GB/s vs 256 GB/s (60% better TPS, theoretical)
- Compute: 140 TFLOPS FP16 vs 59 TFLOPS FP16 (240% faster PP theoretically, though this is very software dependent)
- Memory: 96GB LPDDR4X vs 128GB LPDDR5X (96GB max VRAM allocation)
Both are supported by llama.cpp, AMD via Vulkan too; unsure if the Atlas works on Vulkan atm.
Ansible32@reddit
I'd also compare to the Jetson Thor Nvidia just announced. Which is $3500 but I think the extra cost is probably worth it.
tat_tvam_asshole@reddit
better bandwidth, that's about it. strix halo has decent support and you can run just about whatever you want, including comfyui
NickCanCode@reddit
It's a no for me then. The most I want to do is image generation and coding tasks. Seeing that this card doesn't have a proper heatsink, it probably lacks the processing power for image gen.
tat_tvam_asshole@reddit
It's fine for image gen I'm sure, just not as fast. You should only consider a card for the VRAM to the degree you will be running large models/workflows. For the price point, imo the new Strix Halo boards are a better deal, but they aren't a PCIe card, obviously. After that, maybe two B60s once they're out.
kei-ayanami@reddit
I think that is still a better value for now, import costs included at least.
Meiyo33@reddit
You also need the compute power and drivers.
They are not here for the moment.
iyarsius@reddit
Hope they are cooking enough to compete
JFHermes@reddit
This is China we're talking about. No more supply scarcity baybee
No-Underscore_s@reddit
B..b..but tarrifs
/s
PlasticAngle@reddit
Even with tariffs it should still cost less than half.
ChloeNow@reddit
AI is the top industry right now, with Nvidia at the top of the game. We have a racist president who's scared of China and just figured out who Jensen Huang was a few weeks ago.
Don't be surprised if a special tax is placed on these to make sure they're equally expensive to Nvidia's stuff. Market forces outside the US should still drive the price down, though.
Girafferage@reddit
And have abysmal support. Good luck getting drivers.
stumblinbear@reddit
If it's half the price for comparable performance, there will be support in the near future
Girafferage@reddit
I'll wait until the support exists but then yeah it would be awesome.
Present_Hawk5463@reddit
The US has monopolistic control over the semiconductor market; they control the upstream supply for manufacturing these GPUs.
nedockskull@reddit
Isn’t Huawei banned in the USA?
ChloeNow@reddit
Honestly thank FUCKING god.
Look, even if you're an America stan, a MAGA hat wearer, hate China, whatever... you gotta admit some actual competition would be good. The prices on modern GPUs (chips in general) have been absolutely insane, driven up by things that it sucks to share a space with (bitcoin mining ops previously, now more likely AI datacenters).
Educational_Belt_816@reddit
Why am I seeing this exact post, word for word, with the same image attached, reposted on multiple subs and multiple large X accounts? The botting is so glaringly obvious; this is like the 6th time I've seen this today.
wakigatameth@reddit
what happens if you render Tiananmen square on one of those babies
Designer-Ganache8097@reddit
What happens when Americans even figure out how to stop being conned by their government?
RedditJumpedTheShart@reddit
When will China make safe baby food?
Designer-Ganache8097@reddit
??
__BlueSkull__@reddit
I believe its marketed 96GB is not true. Internally, it has two 48GB NPUs connected through PCIe, so say goodbye to cross-chip memory bandwidth.
Educational_Smile131@reddit
Huawei nowadays is best known for overhyped and underspec'd products; it's amusing that so many people here fell for this propaganda lol.
MCH_2000@reddit
There is no monopoly abuse.
The Chinese GPU is far inferior in hardware. And it's 96 GB because it's using LPDDR4X.
FearThe15eard@reddit
Let's go, I can finally build my PC.
nenulenu@reddit
Not to drag politics in, but it seems necessary. This is just a sign that Trump is right and the US should have been careful with manufacturing high technology in China, which reverse engineered it and got its engineers educated in the US, importing more than enough skill. Now they are set up to produce practically anything at a fraction of the price. Should drive competition.
The question is going to be how much risk buyers are going to take with these products. Maybe they will be like the UK and take massive risks for cheaper pricing. Maybe they will be more careful in understanding the long-term risk.
But if I've learned anything, it's that we've all lost the ability to think long term. So, risk taking it is!!!
X2ytUniverse@reddit
I mean they can enter the market all they want, but sheer memory capacity means nothing if it runs on 10 year old tech that doesn't support modern features like CUDA, or if the general performance is abysmal.
HlddenDreck@reddit
So, what frameworks support this? None? Well, I guess it's useless at the moment. But in the future this might be interesting
ILikeQuantum@reddit
I don't see myself buying it but if it keeps Nvidia prices in line I'll be happy.
Ok_Warning2146@reddit
Don't put too much hope on Huawei. They just folded their LLM division due to fraud.
miki4242@reddit
Do you have any credible sources for this?
kaggleqrdl@reddit
The cope on this thread is legion. China is ALL IN. The only thing that will keep them back is import bans and blockades. Or maybe they will deny export because they don't want the US to catch up... lol
RedditJumpedTheShart@reddit
So you ordered one? Put your money where your mouth is lol
757DrDuck@reddit
How so? Or is it just what they all believe?
Better-Cricket-7883@reddit
all the world
MedicalScore3474@reddit
They've already done this with Huawei Ascend processors: https://www.huaweicentral.com/us-imposing-stricter-rules-on-huawei-ai-chips-usage-worldwide/
If they are considered to be produced in violation of US export controls like the Ascend processors, US citizens will not be allowed to use them or buy them anywhere in the world, else you will face criminal and civil penalties.
Alihzahn@reddit
Because there's so much free speech happening in the US currently. I'm no CCP shill, despise them even. But it's actually funny seeing people call out China when people are getting arrested left and right for free speech in the west. And the upcoming draconian spying laws.
sailee94@reddit
Actually, this card came out about three years ago. It’s essentially two chips on a single board, and they work together in a way that’s more efficient than Intel’s dual-chip approach. To use it properly, you need a specialized PCIe 5.0 motherboard that can split the port into two x8 lanes.
In terms of performance, it’s not necessarily faster than running inference on CPUs with AVX2, and it would almost certainly lose against CPUs with AVX512. Its main advantage is price, since it’s cheaper than many alternatives, but that comes with tradeoffs.
You can’t just load up a model like with Ollama and expect it to work. Models have to be specially prepared and rewritten using Huawei’s own tools before they’ll run. The problem is, after that kind of transformation, there’s no guarantee the model will behave exactly the same as the original.
If it could run CUDA then that would have been a totally different story btw..
xugik1@reddit
Is there a newer version of this card now?
CeFurkan@reddit (OP)
To answer all questions: CUDA is not a wall or a moat. AMD doesn't have CUDA, but their cloud GPUs run well on Linux. What AMD lacks is competence. They didn't sell 3x-VRAM GPUs at the same price; their GPUs are ridiculously priced the same. So what do Chinese GPU makers need?
They only need to submit pull requests so PyTorch natively supports their GPUs. That's it; they can do it with a software team. Moreover, add a CUDA wrapper like ZLUDA and you're ready to roll. The VRAM or GPU may be weak for now, but this is just the beginning.
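For what it's worth, PyTorch already has an extension point built for exactly this, so a vendor doesn't even need a mainline PR to get started. A minimal sketch of the PrivateUse1 mechanism; the backend name and library are hypothetical, and the vendor's compiled extension still has to supply the actual kernels:

    import torch

    # Claim the reserved PrivateUse1 dispatch key under a vendor-chosen name.
    torch.utils.rename_privateuse1_backend("atlas")  # "atlas" is made up

    # A real backend would now load its compiled kernel library, e.g.:
    # torch.ops.load_library("libatlas_torch.so")    # hypothetical .so
    # after which tensors can live on the new device:
    # x = torch.randn(1024, device="atlas:0")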
noiserr@reddit
AMD has been putting in a lot of effort into ROCm. And the Linux drivers.
CeFurkan@reddit (OP)
And AMD fails because they are still 100% targeting cloud service providers, which cuts off a revenue opportunity. Instead, they could sell gaming GPUs with 96 GB at the same price, with ROCm and full Windows support, and they would bloom across the entire ecosystem. Even cloud service providers would buy them in huge numbers.
noiserr@reddit
UDNA is coming.
Formal_Bat_3109@reddit
What other hardware do you need to run this?
artofprjwrld@reddit
u/CeFurkan, competition from China’s 96GB cards under $2k is huge for AI devs. Finally, u/NVIDIA’s monopoly faces real pressure, long term market shifts look inevitable.
CeFurkan@reddit (OP)
100% it is beginning
Zealousideal_Meat297@reddit
Do we have to worry about spyware on these cards?
Interesting-Law-8815@reddit
By the time the orange moron has put tariffs on then they’ll be $20,000 each
RahimahTanParwani@reddit
Yes, finally! Nvidia is a cutthroat company, no different from the Jewish Nazi companies of Meta, Apple, Google, Amazon, and Tesla. If you have NVDA stocks, sell them now and reap the profits. It will free fall in the coming weeks.
Suppe2000@reddit
Is this legit? In Taobao I find them for about 9000 RMB. That seems quite cheap to me. I just went to Shenzhen to find a good vendor for the 4090 48gb. But these Huawei cards are crazy
ThePi7on@reddit
Competition is always welcome
juggarjew@reddit
So what? It doesn't matter if it can't compete where it counts. The speed has to be usable. Might as well just get a refurb Mac for $3000 with 128GB of RAM.
thowaway123443211234@reddit
Everyone comparing this to the Strix misses the point of this card entirely, the two important things are:
Queasy_Comedian274@reddit
it isn't a gpu. it's purely neural processing
thowaway123443211234@reddit
Semantics
Queasy_Comedian274@reddit
no video encoder/decoder and no vulkan or opengl :/
Darlanio@reddit
Looking forward to testing this card out!
ProtolZero@reddit
That is not a GPU but an NPU card.
Sudden-Lingonberry-8@reddit
if drivers are open source, it's game over for nvidia overnight
CeFurkan@reddit (OP)
i hope they do that
pmttyji@reddit
Hope this brings prices down for AMD cards too (apart from NVIDIA, Intel).
zd0l0r@reddit
Time to sell my nvidia shares?
fantom1252@reddit
Good for us programmers and for customers... now, have you ever seen the page? "Atlas Center Inference Card 23.0.3 (and Later) NPU Driver and Firmware Installation Guide 08":
https://support.huawei.com/enterprise/en/doc/EDOC1100349483?idPath=23710424|251366513|22892968|252309139|252823107
Overview
This document describes how to install and uninstall software packages and provides FAQs and troubleshooting methods.
This document applies to:
Intended Audience
This document is intended for:
Symbol Convention
Symbols that may be found in this document are defined as follows:
Danger
I just wonder why it's written like that. What kind of material did they use there? It's interesting...
pmv143@reddit
This is pretty wild: Huawei is putting out a 96GB card for under $2K. Since it's built for inference, it makes sense the raw throughput isn't on par with a 4090, but it's a steal for memory-heavy workloads.
But the catch is the same as always: cheaper hardware doesn't fix cold starts or idle GPUs. If you can't get sub-2s load times and keep utilization high, you're still burning capacity. That's where the real bottleneck is.
trahloc@reddit
Huawei needs to sponsor the Zluda project for their cards.
kc858@reddit
currently in china, just bought two, bringing them back if anyone wants to buy the other one for 3750 lol
Professional_Mix2418@reddit
Huh, why is a card from 2022 being presented like it just entered the market? And why compare it to an RTX 6000 Pro? It's not even close to a 3090. What on earth is this about?
TheToi@reddit
408GB/s of bandwidth means 6 tokens per second on 70B models at Q8; it's too slow for that price, for me.
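That figure is easy to sanity-check as a theoretical ceiling, assuming every weight byte is streamed once per token and the full spec-sheet bandwidth is usable (it's really 2x 204GB/s, so practice will be worse):

    # back-of-envelope decode speed ceiling
    weights_gb = 70       # 70B params at Q8 is roughly 1 byte per param
    bandwidth_gbs = 408   # spec-sheet total; 204 per NPU in reality
    print(bandwidth_gbs / weights_gb)  # ~5.8 tok/s, matching the ~6 above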
Anthonyg5005@reddit
Well to be fair, it is only an npu so I don't expect to be able to get any graphical use out of it
-Hakuryu-@reddit
Might want to calm down first
-Hakuryu-@reddit
Last I checked they were DDR4...
Rahul_Albus@reddit
How well does it work with CUDA????
Abyss_Kraken@reddit
let’s fucking gooooooooooooooo
LARGEBBQMEATLOVERS@reddit
You'd have the same issues with AMD: the hardware might be fine, but the drivers...?
Toasted_Treant@reddit
They're still way behind, but they have momentum. 5 years and they will have stolen enough nvidia and amd tech to make a comparable product to nvidia's flagship.
smokeynick@reddit
Ok CCP. Good work 😉
IngwiePhoenix@reddit
So THAT was their big card. I looked at Atlas a while ago once I spotted it as supported in llama.cpp - this is super interesting stuff. Looking forward to benchmarks!
HighlandEvil@reddit
Does it run triton? or is triton just for AMD mostly?
sleepingsysadmin@reddit
linux kernel support? rocm/cuda compatible?
fallingdowndizzyvr@reddit
It runs CANN.
Careless_Wolf2997@reddit
what the fuck is that
fallingdowndizzyvr@reddit
LOL. Ask a LLM.
florinandrei@reddit
More reliable than social media users anyway.
remghoost7@reddit
Here's the llamacpp documentation on CANN as per another comment:
CANN (Compute Architecture for Neural Networks) is a heterogeneous computing architecture for AI scenarios, providing support for multiple AI frameworks on the top and serving AI processors and programming at the bottom. It plays a crucial role in bridging the gap between upper and lower layers, and is a key platform for improving the computing efficiency of Ascend AI processors. Meanwhile, it offers a highly efficient and easy-to-use programming interface for diverse application scenarios, allowing users to rapidly build AI applications and services based on the Ascend platform.
Seems as if it's a "CUDA-like" framework for NPUs.
Gold-Vehicle1428@reddit
The API is actually where GPUs fuck customers; look at ROCm.
sleepingsysadmin@reddit
What I'm hoping is that April 2026 will be a big deal for ROCm. Vulkan for now, I guess :(
Unlikely-Employee-89@reddit
Pls don't buy it. Chinese GPUs are not safe. Surely there must be some backdoor that allows the CCP to steal your data. Also, their technology is not mature enough. Let's punish those evils and let them scale up and lower the price for people like me who don't give a fuck. I need the USA to MAGA for the rest of the world. USA! USA! USA! 🙏
T-VIRUS999@reddit
Didn't take long for those to get scalped, they're already the price of a used car on eBay, like 3X MSRP, shameful
mrw981@reddit
Also has the added benefit of spying on you for the CCP.
howie521@reddit
Can this run together with Nvidia hardware on the same PC?
NebulousNitrate@reddit
China is blowing the US out of the water with their latest tech. It's one reason Intel is about to go belly up; the next generation of processors will see victory go to Chinese chip manufacturers. Intel/AMD will not be able to compete unless Chinese imports are blocked, and that's what the Trump administration is gearing up for as it takes a stake in Intel.
noiserr@reddit
It's a dual GPU solution, which already limits it for running large LLMs. And it has less bandwidth than Strix Halo for each NPU. It's also more expensive than Strix Halo, and it doesn't have anything close to ROCm.
phear_me@reddit
Some of you don’t know anything about technology and it shows
kierowniku@reddit
I hate monopolies, but I don't like China either, so I've got mixed feelings.
Jisamaniac@reddit
Doesn't have Tensor cores....
noiserr@reddit
Pretty sure it's all tensor cores, it doesn't have shaders. Tensor core is just a branding for matrix multiplication units and these processors are NPUs which usually have nothing but matrix multiplication units (or tensor cores).
AdventurousSwim1312@reddit
Yeah, the problem is that they are using LPDDR4X memory on these models; your bandwidth will be extremely low. It's more comparable to a Mac Studio than an Nvidia card.
Great buy for large MoE models with under 3B active parameters, though.
satireplusplus@reddit
From their official website: LPDDR4X 96GB or 48GB, total bandwidth 408GB/s Support for ECC.
That's not extremely low. It's on par with an Nvidia 5060, or 40% of a 4090.
I'm guessing driver / GPGPU API support will be the real problem initially.
AdventurousSwim1312@reddit
Yeah, but for a 96GB card, unless you want to deploy many instances of small LLMs, this will be very limiting (the 1.7 TB/s on the RTX 6000 Pro is already the most frustrating part; HBM at around 3-3.5 TB/s would be much better).
satireplusplus@reddit
It's still way faster than DDR4 or DDR5. You can read the entire 96GB four times per second, so you'd get at least 4 tok/s even if you need to read the entire thing for a really large model. But since many models are MoE now, there are fewer active parameters that need to be read per token. So you'd probably be pushing 10+ tok/s for inference.
This sits somewhere between a Mac Studio with the M4 Max (546 GB/s) and an M4 Pro (273 GB/s). But it probably has way more compute, so you could use it for training and fine-tuning as well. In theory at least; in practice, PyTorch support will be lacking a lot of features. Even Apple's MPS backend is still missing lots of ops that the CUDA backend has. So there's the uphill battle that any GPGPU contender faces.
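The MoE estimate follows from the same bandwidth arithmetic; a sketch with made-up numbers, since active parameter counts and quantization vary per model:

    # hypothetical MoE: only the active parameters are read per token
    active_params_b = 20   # e.g. ~20B active parameters (made up)
    bytes_per_param = 2    # FP16
    bandwidth_gbs = 408
    print(bandwidth_gbs / (active_params_b * bytes_per_param))  # ~10 tok/s ceiling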
sid_276@reddit
This. Imagine it: you get bigger highways, but the lanes are actually narrower. Traffic only congests more. For machine learning these are kind of garbage, NGL.
uti24@reddit
If true, it's almost half the bandwidth of a 3090, and about a third higher than a 3060's.
Star_king12@reddit
That's like DDR5 speeds, no? Regular DDR5 on dual channel I mean
uti24@reddit
fastest DDR5 on dual channel is like.. 125GB/s
Star_king12@reddit
And 3060 12 gig is 360 GB/s so I'm right on point.
TheDreamWoken@reddit
Then I guess it would run as fast as the Turing architecture? I use a Titan RTX 24GB and can max out at 30 tk/s on a 32B model.
shing3232@reddit
280 TOPS INT8, LPDDR4X 96GB or 48GB, total bandwidth 408GB/s
__some__guy@reddit
It's dual GPU with only 204 GB/s each.
shing3232@reddit
If it is connected on board, it should be fine for inference.
No_Hornet_1227@reddit
Really crap bandwidth + crap performance (300 TOPS at most)... yeah... if you could somehow use the VRAM on that card together with a 5090 GPU core + VRAM for the AI or something, that'd be freaking amazing.
So with a 5090 32GB + 1 of these you'd have 128GB of VRAM at 3700+ TOPS.
Tenzu9@reddit
Yes, you can test this speed yourself btw if you have a new android phone with that same memory or higher. Download Google's Edge app, install Gemma 3n from within it and watch that sucker blaze through it at 6 t/s
stoppableDissolution@reddit
That's actually damn impressive for a smartphone.
MMORPGnews@reddit
It is. I just hope to see a Gemma 3n 16B without vision (to reduce RAM usage). Small models are generally only useful with 4B+ params.
poli-cya@reddit
Doesn't that mean nothing without the number of channels? You could run a ton of channels of DDR3 and beat GDDR6, right?
Wolvenmoon@reddit
Ish and kind of. More channels means more chip and PCB complexity and higher power consumption. Compare a 16 core Threadripper to a 16 core consumer CPU and check the TDP difference, which is primarily due to the additional I/O, same difference w/ a GPU.
andy_a904guy_com@reddit
For people curious that means 10x slower in speed alone.
dltacube@reddit
Even against the M3 Ultra with 800GB/s memory? That's half of the 1700 on a 5090 card. Or is the scaling not linear?
slpreme@reddit
under 3b 😬
Comed_Ai_n@reddit
Problem is all these Chinese cards don’t work with CUDA acceleration.
simple123mind@reddit
And I'm sure your data doesn't go to China
critacle@reddit
Comes with free Salt Typhoon
Born_Highlight_5835@reddit
need reviews asap lol
vito0117@reddit
Someone tag Gamers Nexus lmao.
rizuxd@reddit
Finally monopoly will be broken
Alternative-Bobcat-5@reddit
96gb of VRAM?
Anyone in London wanna split a flight to China, go buy some server farms for ourselves?
End3rWi99in@reddit
Nice paperweight.
paul_tu@reddit
A quick search gave me a reason to wait for the Ascend 920 with HBM (and the memory speed it brings), as the Atlas 300I accelerators were released back in 2022.
No-Emu-396@reddit
Comes with spyware included at no additional charge? It was bad enough when Facebook bought Oculus, bloody hell!
UsualResult@reddit
Hoo hoo. This is the best piece of news I've seen in a while. If this is real and it has any software support at all, they are going to have to make Huawei illegal to sell in the USA. At 20% the price of NVIDIA you can get almost 0.5TB with 5 of these guys for the same price as 1 from NVIDIA. I think the gravy train is close to heading off the end of the bridge.
epiktet0s@reddit
comes with a free trojan
Popular_Brief335@reddit
Lol oh good Chinese gpu propaganda has arrived
minitoxin@reddit
Atlas 300V Pro: 48GB LPDDR4X, 204.8 GB/s, 140 TOPS INT8 / 70 TFLOPS FP16, 150W, single-slot PCIe
Musicheardworldwide@reddit
Really not versed on the subject, but isn't the whole thing with Nvidia the software? Isn't CUDA a factor in this because it's so widely used?
e79683074@reddit
Competition is healthy, but what assures me I'm not putting hardware spyware in my build? This thing literally has direct RAM access, and the country is well known for putting backdoors in things.
chinese__investor@reddit
How much spyware is in Nvidia and Apple?
Striking-Warning9533@reddit
Yeah, as a Chinese I am paranoid and will avoid this.
shibe5@reddit
As for the hardware: when you already have spyware in your CPU (AMD PSP) or motherboard (Intel ME), you might as well go with a bugged NPU.
As for the software, I'll wait for reviewed open source drivers and libraries.
xxPoLyGLoTxx@reddit
Just disable the GPU in your windows firewall /s
BlueArcherX@reddit
you guys are easy marks
m1013828@reddit
A for effort. Big RAM is useful for local AI, but the performance... I think I'd wait for a next gen with even more RAM on LPDDR5X and at least quadruple the TOPS. A noble first attempt.
EpicOfBrave@reddit
Good luck convincing US and European software companies, and especially the governments, to support Chinese hardware!
The USA accounts for 95% of global AI spending.
Chinese AI chips, just like Chinese cars and smartphones, can't grow past a 1-5% market share in the US, and will never be allowed to grow in the US and Europe.
Designer-Ganache8097@reddit
China will be happy to sell their products to everyone else. They don’t need the US. There will come a point where the US has to choose to be a part of the world rather than trying to be dominant.
snowbirdnerd@reddit
I'm always a little wary of the Chinese knockoffs. Specs on paper are one thing, but I'll wait until I see some performance reviews.
SnooRecipes3536@reddit
gentlemen, we are BACK
ZookeepergameOdd4599@reddit
At this point it is cheaper to just immigrate to Singapore
meshreplacer@reddit
I welcome China with open arms if they can crush this monopoly by introducing fairly priced cards. The big hold-up with AI at the consumer level is the lack of fairly priced hardware. The primary goal is to keep AI a pay service, like cable TV. It harkens back to the pre-PC era, where people had to spend big bucks on an ASR-33 terminal and pay by the minute for "time sharing" compute time.
Fluid-Pea7891@reddit
🐻🍯
Believe it or not , calls
Just-Health4907@reddit
what store is this?
Resident-Dust6718@reddit
I hope you can import these kinds of cards, because I'm thinking about designing a nasty workstation setup, and it's probably gonna have a nasty Intel CPU and a gnarly GPU like that.
tat_tvam_asshole@reddit
Radical, tubular, my dude, all I need are some tasty waves, a cool buzz, and I'm fine
munkiemagik@reddit
All of a sudden now I want to re-watch the original Point Break movie.
Ok_Top9254@reddit
I don't understand why people are blaming Nvidia here; this is business 101. Their GPUs keep flying off the shelves, so naturally the price increases until equilibrium.
The only thing that can tame prices is competition, which is non-existent, with AMD and Intel refusing to offer a significantly cheaper alternative or killer features, and Nvidia themselves aren't going to undercut their own enterprise product line with gaming GPUs.
AMD is literally doing the same in the CPU sector: HEDT platform prices quadrupled after AMD introduced Threadripper in 2017. You could find 8-memory-slot, 4x-PCIe-slot X99/X79 boards for under 250 bucks and CPUs around 350. Now the cheapest boards are $700 and the CPUs literally $1500. But somehow that's fine because it's AMD.
kaggleqrdl@reddit
Pretty sure if you follow the money the powers that be control both NVidia and AMD and don't want them to compete.
stumblinbear@reddit
AMD put Intel out of business but apparently they don't want to compete with Nvidia for Reasons™
Wild take
_bachrc@reddit
Didn't DeepSeek say they had issues with Huawei cards, and that it caused their multiple delays on R2?
farnoud@reddit
The entire software ecosystem is missing. Not a hardware problem.
Glad to see it but takes years to build the software ecosystem
QbitKrish@reddit
This is quite literally just a worse strix halo for all intents and purposes. Idk if I really get the hype here, especially if it has the classic Chinese firmware which is blown out of the water by CUDA.
Mundane-Light6394@reddit
How many of these strix halos can you put in a 4u chassis?
layer4down@reddit
In the end, our greed-optimized brand of capitalism will have defeated itself.
RG54415@reddit
Cheaper always wins.
Patrick_Atsushi@reddit
I wonder how these will do with gaming.
Mundane-Light6394@reddit
It's not made for gaming; it doesn't even have its own cooling. It's made for servers to run AI inference. But if cheaper alternatives become available for AI, fewer gaming-capable GPUs will be needed for AI. It's indirect competition for gaming GPUs, and it could push gaming GPU prices lower if it reduces AI demand for them.
AI is pushing up prices for gaming GPUs now like Ethereum mining did in 2017.
CeFurkan@reddit (OP)
This is also super important. These GPUs must run games on Windows PCs to become widespread.
Patrick_Atsushi@reddit
I'm not sure about this. The normal use case for a GPU with these specs is LLM training.
However, the price makes me wonder if some people will buy it for gaming.
I think the gaming market is almost negligible compared to the AI market. I might be wrong, though.
Upbeat_Parking_7794@reddit
Nice, US will tax them to hell, but the rest of the world will have cheap AI.
No_Hornet_1227@reddit
I've been saying for months: the first company (Nvidia, Intel or AMD) that gives consumers an AI GPU for like $1500 with 48-96GB of VRAM is gonna make a killing.
FFS, 8GB of GDDR6 VRAM chips cost like $5. They could easily take an existing GPU, triple the VRAM on it (costing them like $50 at most), sell it for like $150-300 more, and they would sell a shit ton of them.
Mango-Vibes@reddit
Except for the fact that Nvidia is the most efficient no questions asked and also is supported by virtually everything
tryingtolearn_1234@reddit
Intel should have done this. Instead a Chinese company will get that market.
WithoutReason1729@reddit
Your post is getting popular and we just featured it on our Discord! Come check it out!
You've also been given a special flair for your contribution. We appreciate your post!
I am a bot and this action was performed automatically.
RemoteHoney@reddit
DeepSeek switched from Nvidia to this
So now they are left behind in competition
CHEWTORIA@reddit
China has 1.4 billion people; that is a huge market.
India has 1.4 billion people; that also is a huge market.
It's all in Asia.
They will take over the whole Asian market. If you don't see this happening, it's coming.
Yes, it's not as good as CUDA yet, but that is mostly a software issue.
Give it 5 more years and this problem will be solved.
happy-occident@reddit
Doesn't the MindSpore issue get in the way of building locally? It doesn't play with Ollama, apparently?
Bestoftherest222@reddit
This is only the beginning; it's going to get better.
OriginalAd9933@reddit
Different?
Comfortable-Cry8165@reddit
Has anyone tested them with local AIs? Any guides or videos? How are they compatible with the existing hardware?
What about gaming?
spookyclever@reddit
All I want is for this to cause the nvidia price point to dip to retail levels 😄 I’ll buy one of these to inflate interest if it means in a couple weeks I can get a 5090 at retail.
Anyusername7294@reddit
If I had to guess, I'd say they are slower and far more problematic than DDR5 or even 4 with similar capacity
devshore@reddit
It's China, so they are probably 980 Tis glued together, or rebranded P40s.
Anyusername7294@reddit
So how are they making a profit on them?
_lindt_@reddit
Oh boy, get ready for another week of eagles and flag waving.
Wonder if they’ll start working on their own framework too.
xxPoLyGLoTxx@reddit
Hell yes! Is it wrong of me to be rooting for China to do this? I'm American but seriously nvidia pricing is outrageous. They've been unchecked for awhile and been abusing us all for far too long.
I hope China releases this and crushes nvidia and nvidia's only possible response is lower prices and more innovation. I mean, it's capitalism right? This is what we all want right?!
devshore@reddit
is that even slower than using a Mac Studio?
xxPoLyGLoTxx@reddit
It's certainly slower than an m3 ultra (I think that's around 800 GB/s). I think an M4 Max (what I use) is around 400-500 GB/s but I don't recall.
arcanemachined@reddit
Competition is always good for the consumer.
chlebseby@reddit
It's not wrong; the US needs competition for progress to keep going. Same with space exploration: things got stagnant after the USSR left the game.
MaggoVitakkaVicaro@reddit
Aren't these the chips which delayed DeepSeek's recent release, because the PRC forced them to try to use them for AI training?
Striking-Warning9533@reddit
It cannot do training, only inference.
fsactual@reddit
Amazing what you can do when shareholder value comes second instead of first.
DanielKramer_@reddit
huawei is also in the business of making money if you didn't know
you make money by making people want to give you money, that's how making money works
amazon sucks me off when i return things, not because jeff is my buddy, but because it keeps me in his world
nvidia will lower prices as soon as they have competent competition. right now you can moan all you want but you will still buy nvidia so they don't care yet. eventually we'll all be cheering 'good guy nvidia!' just like we are now rooting for intel after they caused a decade of quad core stagnation
prusswan@reddit
Huawei is also big into AI; gonna see how they advertise this, and whether they are even using their own products...
LMFuture@reddit
Glad to see that, but I'd be happier if it came from other Chinese/US companies, like Cambricon (寒武纪) or Google/Groq. Because Huawei lied to us with HarmonyOS and the Pangu models, I just hate them.
Conscious_Cut_6144@reddit
From the specs this is probably the reason we don't have Deepseek R2 yet :D
CryptographerCrazy61@reddit
lol I’d buy one today if I could get my hands on it
vulcan4d@reddit
One day it will be competitive and this is why Donnie loves Tariffs, he is protecting his buddies and their profits.
PathIntelligent7082@reddit
ppl say they're garbage
Fulcrous@reddit
It’s $2000 because it’s not competitive at all.
prusswan@reddit
From the specs it looks like a GPU with a lot of VRAM and performance below a Mac Studio... so maybe the Apple crowd will sweat? I'm actually thinking of this as a RAM substitute lol
nonofanyonebizness@reddit
Is that a single-slot construction? Hmm, I want 7 of them.
Illustrious-Dot-6888@reddit
https://i.redd.it/qb8tsl7be7mf1.gif
Real_Back8802@reddit
Oh good I have Nvidia stocks. Nvidia DO SOMETHING!
Good_Performance_134@reddit
Years-late performance along with no CUDA...
Right...
mummifiedclown@reddit
As someone who’s had to force engineers to access their Huawei servers headless because there were NO Linux video drivers for them, I find this hilarious.
2Gins_1Tonic@reddit
You get what you pay for…
WaffleTacoFrappucino@reddit
i got a quote for an rtx 6000 for $7700
clbgrg@reddit
Definitely doesn't have any spyware inside. Jackie Chan says so
Familiar_Text_6913@reddit
As with everything else cheap and Chinese: the hardware will be golden... but support, software, quality assurance, etc. will suck. I love Chinese tech for the price, but this has been the case for at least 20 years.
MrMnassri02@reddit
Hopefully it's open architecture. That will change things completely.
Khipu28@reddit
If they expose their entire software stack like Tenstorrent does and you are a tinkerer with a low-level software engineering background, then go for it! Otherwise stay away!
serendipity777321@reddit
Deepseek is very good. If it's efficient this is excellent. Nvidia should stop being greedy
LostMitosis@reddit
"GPU's from China threaten our national security". A headline coming soon from your favourite media and politician.
floridianfisher@reddit
They will catch up and it will crush NVIDIA
oodelay@reddit
Should I sell my Nvidia stocks?
Holyragumuffin@reddit
They have heat-dissipation issues and are built on a less advanced process node, I hear.
Defiant_Diet9085@reddit
1 slot, no fan.
4 GPUs on board.
Performance ~1/10 of a 3090.
88 TOPS INT8 and 44 TFLOPS FP16.
shing3232@reddit
That's the slow one. This one is:
280 TOPS INT8
140 TFLOPS FP16
LPDDR4X, 96 GB or 48 GB, total bandwidth 408 GB/s
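For scale, a quick back-of-envelope on what 408 GB/s buys you (a sketch, assuming single-batch decode is purely memory-bandwidth-bound, which it roughly is in practice; the model sizes are illustrative):

def max_tokens_per_s(bandwidth_gb_s: float, model_gb: float) -> float:
    # Each generated token has to stream the full weights from memory,
    # so tokens/s is capped at bandwidth divided by bytes read per token.
    return bandwidth_gb_s / model_gb

for name, size_gb in [("32B at Q4, ~18 GB", 18), ("70B at Q4, ~40 GB", 40)]:
    print(f"{name}: at most {max_tokens_per_s(408, size_gb):.1f} tok/s")
# 32B at Q4, ~18 GB: at most 22.7 tok/s
# 70B at Q4, ~40 GB: at most 10.2 tok/s

That ceiling is why people in this thread keep comparing it to a Mac Studio: an M3 Ultra's ~800 GB/s would roughly double those numbers.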
AppearanceHeavy6724@reddit
That is not 1/10 of a 3090.
TitoGrande1980@reddit
Nvidia can't export to China.
It's going to be the US that can't import enough from China.
wowsers7@reddit
A-hole Trump will slap 400% tariffs on them to prevent competition.
Interstate82@reddit
Blah, call me when it can run Crysis in max quality
CarsonWentzGOAT1@reddit
This has similar performance to a 4090, with 96GB of VRAM.
Anidamo@reddit
It has less than half the memory bandwidth of a 4090.
mxmumtuna@reddit
I don’t think that’s true. It uses LPDDR4X (~400GB/s), and also has no meaningful compute power. It would perform similarly to a Mac for inference.
Interstate82@reddit
Can it actually run games though? From my quick googling it seems the Huawei cards are made for running AI, with no apparent support for games...
fallingdowndizzyvr@reddit
No. This is an NPU, not a GPU.
fallingdowndizzyvr@reddit
No it doesn't. Not at all. Why do you say that?
untanglled@reddit
LPDDR4 RAM, lol. Strix Halo would be a much better option.
Hytht@reddit
The actual bandwidth and bus width matter more for AI than whether it's LPDDR or GDDR.
untanglled@reddit
LPDDR4 has limits; I believe the fastest LPDDR4X gets is around 4266 MT/s. And we don't know the number of channels, but I have no reason to believe this has more than quad-channel. Basically a worse version of Strix Halo.
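Actually, the quoted 408 GB/s can't come from a quad-channel setup at LPDDR4X speeds. A quick sanity check (a sketch, taking 4266 MT/s as LPDDR4X's per-pin ceiling and the 408 GB/s figure from upthread as the only inputs):

data_rate_mt_s = 4266                 # LPDDR4X max per-pin data rate
target_gb_s = 408                     # bandwidth quoted upthread
bus_bits = target_gb_s * 8 * 1000 / data_rate_mt_s
print(f"required bus width: ~{bus_bits:.0f} bits")               # ~765 bits
print(f"768-bit bus check: {4266e6 * 768 / 8 / 1e9:.0f} GB/s")   # ~410 GB/s

So the figure implies something like a 768-bit bus (24 x 32-bit channels): the width, not the per-pin speed, is doing the work.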
AlxHQ@reddit
Is it supported by llama.cpp?
fallingdowndizzyvr@reddit
Yes.
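For anyone wanting to try it, llama.cpp's CANN backend doc boils down to roughly this (a sketch; it assumes the Ascend CANN toolkit is already installed, and the model path and layer count are illustrative):

# build llama.cpp with the CANN backend enabled
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=release
cmake --build build --config release
# run a GGUF model, offloading layers to the Ascend NPU
./build/bin/llama-cli -m /path/to/model.gguf -ngl 32 -p "hello"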
SadWolverine24@reddit
Anyone have inference benchmarks?
fallingdowndizzyvr@reddit
The 300I is not new, contrary to the title of this thread. Go Baidu it and you'll find plenty of reviews.
zazzersmel@reddit
"unchallenged monopoly abuse"
Minato-Mirai-21@reddit
Don’t you know the orange pi ai studio pro? The problem is they are using lpddr4x.
Impressive_Half_2819@reddit
Well long time coming.
Rukelele_Dixit21@reddit
What about CUDA support? Can this be used to train models, or is it just for inference?
dissian@reddit
Huawei lasts over 3 hours too and collects information on all your queries. Nbd.
tat_tvam_asshole@reddit
The funny part is Nvidia can't sue Huawei if they fund ZLUDA or some other drop-in CUDA alternative for their hardware.
PotatoTrader1@reddit
Hey, stfu, it's a big portion of my portfolio.
Resolve_Neat@reddit
Let's hope it continues this way; maybe in 3 to 5 years we could get today's high-end consumer GPUs for a decent price! Because having to pay €700 to €1200 for an RTX 3090 that's been overused for crypto and AI is crazy...
CochainComplexKernel@reddit
Does anyone have experience using them under Linux? They also have cheaper, smaller cards.
o5mfiHTNsH748KVq@reddit
inb4 US government says they're backdoored
GoodRazzmatazz4539@reddit
It’s not about the hardware, it’s the software that makes the Monopol.
seppe0815@reddit
First the hardware... in the coming weeks sweet drivers will come... long live China!!!!
Imunoglobulin@reddit
What kind of website is this?
GreatBigJerk@reddit
An online store. What is weird about it?
TexasPudge@reddit
Looks like JD Inc., NASDAQ ticker: JD.
AFruitShopOwner@reddit
No CUDA
slpreme@reddit
no party
Used_Algae_1077@reddit
Damn China is cooking hard at the moment. First AI and now hardware. I hope they crush the ridiculous Nvidia GPU prices
HoboSomeRye@reddit
lessssgoooooooo
ismellthebacon@reddit
So, are they stealing Nvidia chips and rebranding them? Will they have working drivers for all the major AI frameworks?
krste1point0@reddit
It's Huawei. They make their own chips.
Zeikos@reddit
Damn, this might make me reconsider the R9700.
The main concern would be software support, but I'd be surprised if they don't manage ROCm or Vulkan support; hell, they might even make them CUDA-compatible.
Ok_Cow_8213@reddit
I hope it lowers demand for Nvidia and AMD GPUs so their prices come down.