Finally China is entering the GPU market to break the unchallenged monopoly abuse. 96 GB VRAM GPUs under $2,000, meanwhile NVIDIA sells from $10,000+ (RTX 6000 PRO)
Posted by CeFurkan@reddit | LocalLLaMA | View on Reddit | 703 comments

Economy-Swimming-109@reddit
about time
atape_1@reddit
Do we have any software support for this? I love it, but I think we need to let it cook a bit more.
fallingdowndizzyvr@reddit
CANN has llama.cpp support.
https://github.com/ggml-org/llama.cpp/blob/master/docs/backend/CANN.md
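Per that doc, the build looks roughly like this (a sketch based on the linked page, assuming the Ascend CANN toolkit is already installed; the env-script path may differ per install):
```sh
# source the CANN environment, then build llama.cpp with the CANN backend
source /usr/local/Ascend/ascend-toolkit/set_env.sh   # path varies by install
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j
```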
ReadySetPunish@reddit
So does Intel SYCL, but it's still not nearly as optimized as CUDA, with graph optimizations being broken, for example. Support alone doesn't matter.
fallingdowndizzyvr@reddit
Yes, and as I have talked myself blue saying: Vulkan is almost as good as, or better than, CUDA and ROCm. There is no reason to run anything but Vulkan.
ReadySetPunish@reddit
I don't agree. Performance on NVIDIA GPUs is a lot better on CUDA than Vulkan, at least for llama.cpp. Besides, mainline PyTorch doesn't really support anything but CUDA.
fallingdowndizzyvr@reddit
You're wrong.
6 months ago, Vulkan got really close to CUDA.
https://www.reddit.com/r/LocalLLaMA/comments/1j1swtj/vulkan_is_getting_really_close_now_lets_ditch/
4 months ago, Vulkan got faster than CUDA in llama.cpp.
https://www.reddit.com/r/LocalLLaMA/comments/1kabje8/vulkan_is_faster_tan_cuda_currently_with_llamacpp/
Vulkan has gotten even faster since.
ReadySetPunish@reddit
The post you mentioned has the OP turn flash attention off. It's cherry-picking.
fallingdowndizzyvr@reddit
How's that? It's off for both CUDA and Vulkan. Vulkan supports FA too, so it's a level playing field.
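For what it's worth, this is easy to check yourself with llama-bench from llama.cpp (a sketch; the model path is an example, and the flag syntax may differ across llama.cpp versions):
```sh
# compare decode speed with flash attention off vs on
# (run once on a CUDA build and once on a Vulkan build of llama.cpp)
./llama-bench -m model.gguf -fa 0
./llama-bench -m model.gguf -fa 1
```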
hyperparasitism@reddit
To be fair, most people will have flash attention on. If CUDA outperforms Vulkan with flash attention on, then Vulkan still has some catching up to do
SplurtingInYourHands@reddit
Yeah exactly lol. You can import this and all that, but will you be able to figure out how to get it to gen your Stable Diffusion images or run your local LLMs? Sounds like a paperweight without the proper community support.
SGC-UNIT-555@reddit
Based on rumours that DeepSeek abandoned development on this hardware due to issues with the software stack, it seems it needs a while to mature.
JFHermes@reddit
They abandoned training DeepSeek models on some sort of chip, and I doubt it was this one tbh. Inference should be fine. By fine I mean that, from a hardware perspective, the card will probably hold up. Training requires a lot of power going into the card over a long period of time. I assume that's the problem with training runs that last for a number of months.
crantob@reddit
The report was that they had trouble getting the models to converge, not that they ran out of power.
Cergorach@reddit
This sounds similar to all the Raspberry Pi clones before supply ran out (during the pandemic): sh!t support out of the gate, assumptions of better support down the line, which never materialized... Honestly, you're better off buying a 128GB Framework Desktop for around the same price. AMD support isn't all that great either, but I suppose it's better than this...
Charl1eBr0wn@reddit
Difference being that the incentive to get this working, both for the company and for the country, is massively higher than for a BananaPi...
Cergorach@reddit
You would think that, but look at some of the Chinese projects out there. There are entire ghost cities where one would expect that China or the company is massively incentivized to make it work, but no.
v00d00_@reddit
"Ghost cities" are literally just a myth from 15 years ago that propagated because we Westerners apparently can't fathom economic planning. Read or watch any piece on them from that time, then look up the city's name and you'll see that it is in fact populated and still growing now.
Charl1eBr0wn@reddit
No ghost city is as important as getting their hands on domestic cutting edge chips to finally be free of the western chokehold. They're even intent on going to war for it (Taiwan).
Apprehensive-Mark241@reddit
Is there any way to get more than 128 GB into the Framework?
Cergorach@reddit
Nope, but in that case you should be looking at a Mac Studio M3 Ultra (256GB/512GB).
No_Efficiency_1144@reddit
Raspberry Pi clone supply ran out? Is that why they are no longer spam-advertised everywhere?
Pvt_Twinkietoes@reddit
Raspberry Pi prices shot up so much. You're better off buying mini PCs for small projects, unless you need the small form factor.
Eremita_Urbano_1655@reddit
People should buy a Raspberry Pi for its GPIO, not because it's a cheap desktop computer.
Worldly_Striker@reddit
Definitely not true. People bought them because they were a cheap computer. That was literally their whole thing. So every child could have a computer to tinker with.
I've owned raspberry pis for over a decade now.
Hell, the cheapest boards don't even come with GPIO pins, because they know they'll be used for embedded devices or for running Pi-hole.
Pvt_Twinkietoes@reddit
Fair. Lots of folks are just running pihole on them
Orolol@reddit
But this is a Huawei GPU, it doesn't come from a vaporware company.
DistanceSolar1449@reddit
Also these may very well be the same GPUs that Deepseek stopped using lol
demon_itizer@reddit
Maybe it didn't work well for the DeepSeek guys (training) but will work well for us (inference only). Llama.cpp works on Vulkan and that works alright. I have an AMD W6800 32GB that I got for ~$450 in local currency and it works great for inference; not so great for training, or even running ComfyUI for that matter.
fallingdowndizzyvr@reddit
No. That's fake news.
https://x.com/theinformation/status/1961417030436880773
emprahsFury@reddit
That has nothing to do with the purported difficulty training on Huawei Ascends, which allegedly broke R2's timeline and caused DeepSeek to switch back to Nvidia. And if we really think about it, DS wouldn't be switching to Huawei in August 2025 if they hadn't abandoned Huawei in May 2025.
RuthlessCriticismAll@reddit
In your world, DeepSeek switched to Huawei in like April or whatever, then abandoned it in May, and then switched back in August. This is obviously false.
BowlCutKing@reddit
Careful, talking so confidently.
https://arstechnica.com/ai/2025/08/deepseek-delays-next-ai-model-due-to-poor-performance-of-chinese-made-chips/
RuthlessCriticismAll@reddit
You think they switched back and forth every few months. That is completely idiotic. Try to use your head just a little.
BowlCutKing@reddit
You think they didn't try? The HW and SW were too shit, simple.
Don't make me laugh. SMIC 7nm DUV vs TSMC 4nm Blackwell and TSMC 3nm Rubin. Ask Intel how much fab nodes matter.
Thanks for answering my question; on semiconductors you are an idiot.
fallingdowndizzyvr@reddit
That has nothing to do with what the poster I responded to said, "Deepseek abandoned development on this hardware due to issues".
They have clearly not abandoned it as the report from yesterday shows.
Awkward-Candle-4977@reddit
They ditched it for training.
Multi-GPU over LAN is a very difficult thing.
Candid_Highlight_116@reddit
I looked it up a while ago and it seemed this line of chips was a cascaded, clustered version of surveillance-camera NPUs, and there were pieces of information hinting that performance doesn't scale well for large-grained tasks like huge LLMs. That's likely why this isn't destroying everything else. "Software" is perhaps one way to sugarcoat it when the FLOPS are there without good real-world performance figures.
shing3232@reddit
They delayed it, not abandoned it. I heard a rumor about training a smaller R2 model alongside the big one.
zchen27@reddit
I think this is the most important question for buying non-Nvidia hardware nowadays. Nvidia's key to its monopoly isn't just chip design, it's their power over the vast majority of the ecosystem.
Doesn't matter how powerful the hardware is if nobody bothered to write a half-good driver for it.
Massive-Question-550@reddit
Honestly, probably why AMD has made such headway now, as their software support and compatibility with CUDA keeps getting better and better.
Dihedralman@reddit
Honestly, they haven't made enough. With the exponential valuations Nvidia has been getting, AMD should have invested more into the software instead of stock buybacks. It likely would have had better returns immediately.
Ilovekittens345@reddit
About damn time. AMD has always had absolutely horrible software for controlling your graphics settings, and their drivers at times have been dog shit compared to how Nvidia gives their software and drivers so much priority.
I am glad AMD is finally starting to do things differently. They do support open source much better than Nvidia, so when it comes to running local models they could, if they wanted, give Nvidia some competition...
IrisColt@reddit
Y-yes, heh.
AttitudeImportant585@reddit
Eh, it's evident how big of a gap there is between AMD and Nvidia/Apple chips in terms of community engagement and support. It's been a while since I came across any issues/PRs for AMD chips.
iboughtarock@reddit
Tinygrad is trying to solve this exact problem.
am0x@reddit
Exactly. Even Huang says they are more of a software company than a hardware one.
It’s the code that makes these things work so well.
kmouratidis@reddit
Yes, but if you can find me 1 happy Nvidia user who uses more than two different-generation GPUs, I'll show you a liar.
ROOFisonFIRE_usa@reddit
Say it ain't so. I was hoping I wouldn't have issues pairing my 3090s with something newer when I had the funds.
michaelsoft__binbows@reddit
No idea what that guy is on about
Ilovekittens345@reddit
I have never had any issues with my Nvidia graphics cards. The latest one I got was a 3080 Ti. So far in my life I have had 12 cards, starting with a GeForce 2. I have also had 2 AMD cards, and their hardware is fine, but I always have issues with their drivers and the software that controls the settings. With Nvidia, if a new driver has issues you roll back and wait for the next one; they get released fairly often. With AMD... I don't even want to talk about it.
This has always been the main reason people pick Nvidia over AMD: all the extra tools and software you get access to with Nvidia, even when the hardware between the two is the same speed and costs the same. And the stability and support of their drivers.
BoeJonDaker@reddit
Maybe just talking about AI. I used Fermi, Kepler and Pascal for 3D rendering and they worked fine together.
a_beautiful_rhind@reddit
I used 3090/2080ti/P40 before. Obviously they don't support the same features. Maybe the complaint is in regards to that?
kmouratidis@reddit
Yes, and also to inconsistencies (not necessarily incompatibilities), and of course third-party frameworks and their CUDA version support, and feature support matrices.
But maybe ~90% of people here are only doing inference running llamacpp-based stuff on consumer hardware, so they never get to enjoy the "best" parts of Nvidia's stack. The moment you step away from that and start wanting to experiment with more advanced stuff, you're going to discover the joy. A good part of it is due to the frameworks though.
kmouratidis@reddit
If you're only doing inference on some llamacpp-based platform (and maybe tabby/exl) it should be fine. If you're using GPUs of the same size, you might be fine with a few extra things too. If you don't have very old and very new stuff, you might also be fine.
If you want to train on a 5090, 3090, and P40 that you have in the same PC, good luck.
Masterofironfist@reddit
If you knew what you really have, it wouldn't be a problem. Look at the Nvidia P40: Pascal doesn't support async compute like other DX12 GPUs due to a flaw in its architecture and has to emulate it for DX12 games; same for Maxwell-based GPUs. If you had at least a Turing-based GPU instead of that P40, I believe you could get it working on all 4 cards.
ROOFisonFIRE_usa@reddit
I see. This was what I was worried about. I figured unbatched, non-tensor-parallel should work, but it doesn't surprise me that training / tensor parallel aren't a cakewalk.
That's a bummer. Appreciate the heads up!
mrracerhacker@reddit
Hm, I run a Quadro M2000M on my laptop, plus 2 HBM2 V100s on carrier cards and my 3070 Ti, and I'm happy. Support is kinda hit and miss, especially on older cards, but that makes sense since the Quadro is from 2013. SXM2 got okay support but is also aging, which makes sense.
codsworth_2015@reddit
Had some 128MB Nvidia GPU when I was a kid, then a 550 Ti, GTX 960, 1070, 3070 Ti and now a 5090. I know Nvidia is constantly finding creative ways to screw consumers, but they still make the best hardware. When I bought the 1070, I compared it with the AMD 390; the 3070 Ti I compared with the 6800, but there was no stock and the 3070 Ti benched better overall. There is no competition for a 5090.
dibu28@reddit
Had an old Nvidia GeForce-something, then ATI (Radeon), then a 670 (this was a monster), 1070, 2060 12GB (still using it), 3070 and now a 5070.
troughtspace@reddit
Born in '78, got my first PC in '86. I remember 3MB cards 🔥
dibu28@reddit
My first PC had a Pentium 1 at 75MHz and an ATI (Rage or Mach, I don't remember exactly). Don't know if it accelerated anything in Windows 95. Before it I had a ZX Spectrum.
kmouratidis@reddit
Similar here. A GT630 was the first GPU I tried using for neural network training, then a 1080 (trained a few here), 3070 (Ti?), 4070 Ti, a 4x3090 server, 5070 Ti, 5060 Ti.
But every system either had only one GPU or one generation of GPUs, or wasn't used for anything more advanced than llamacpp-based inference.
No_Efficiency_1144@reddit
It's fine with the Nvidia Container Toolkit.
pmv143@reddit
Yeah, this is the crux of it. Nvidia’s real advantage isn’t just the silicon, it’s CUDA + the software ecosystem that makes the hardware actually usable. Without good runtimes/drivers, even the most powerful GPU ends up stranded compute.
gpt872323@reddit
There is misinformation as well. Nvidia is the go-to for training because you need as much horsepower as you can get. For inference, AMD has decent support now. If you have no budget restriction, that's a different league altogether, i.e. enterprises. For the average consumer, you can get decent speed with AMD or older Nvidia.
TheRealGentlefox@reddit
I have had 100% negative experiences with the software for my Chinese hardware, so we'll see if they buck the trend...
DevopsIGuess@reddit
I think the power of free market and open source will solve this fast. We’ll see
Ok_Run_101@reddit
The Chinese manufacturers just have to provide devs with a compatibility layer for CUDA or something, right? It's not easy, but if they can mass-produce GPUs then it shouldn't be rocket science for them.
Or am I missing something?
zchen27@reddit
It's not that simple. Emulating another software API, and in this case a hardware API, is a tricky business (see Apple Rosetta and Steam Proton).
Things don't always work, and when they do, they're not guaranteed to work well.
Ok_Run_101@reddit
Thanks, got it. Windows/Linux compatibility is a nightmare so I understand the pain.
It sucks that the AI GPU world is being cornered into a monopoly. Hopefully some innovations around CUDA compatibility come up in the future, in the software layer.
Coders_REACT_To_JS@reddit
Considering just how much money it could save/make, I bet this situation improves (or at least people try to improve it).
Ok_Run_101@reddit
knock on wood!
ChloeNow@reddit
100%
ATI (for youngin's: ATI was a separate company making the cards back in the day, before it merged into AMD) was a joke until they started getting their driver game on point.
They WERE always behind AF. Then their hardware started catching up and actually being faster in a lot of cases... it was just a matter of getting it to actually work. Meanwhile, Nvidia drivers just fucking worked, every time, flawless.
Now, though? There's a lot of complex shit going on that wasn't back in the day. Cards are more complex, and with that comes more potential for mess-ups, so as AMD moved towards better drivers, Nvidia was getting further from being able to make those 100% stable drivers.
The playing field almost leveled out until the AI race kicked up.
Funny enough... we've gone right back. AMD is faster than Nvidia in a lot of cases when it comes to AI... but good luck getting anything to run on it when all the major fundamental AI tools are heavily geared towards Nvidia CUDA specifically.
bitspace@reddit
"Does it CUDA?"
FinBenton@reddit
Idk what the support is right now, or whether this is even a real GPU you can buy, but considering 90% of the local models I use are Chinese, there will no doubt be support.
Initial-Swan6385@reddit
use claude code to program the software xd
Desperate_Echidna350@reddit
could it still be a viable option if you just want it for AI and maybe some light gaming?
keepthepace@reddit
Qwen is probably first in line; they already had CUDA-bypassing INT8 inference IIRC.
All the Chinese labs are going to be on it.
Pvt_Twinkietoes@reddit
R2 has been delayed because they want to train on Chinese chips right? Might be these.
Minato-Mirai-21@reddit
They have support for PyTorch, called torch-npu.
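Usage looks roughly like CUDA PyTorch with the device string swapped. A minimal sketch, assuming the CANN toolkit and the torch_npu package are installed:
```python
import torch
import torch_npu  # Huawei's PyTorch plugin for Ascend NPUs; registers the "npu" device

# minimal sanity check: run a half-precision matmul on the first Ascend device
x = torch.randn(1024, 1024, dtype=torch.float16).to("npu:0")
y = torch.randn(1024, 1024, dtype=torch.float16).to("npu:0")
z = x @ y          # computed on the NPU
print(z.device)    # npu:0
```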
No_Efficiency_1144@reddit
Wow, can you import it?
What FLOPS, though?
6uoz7fyybcec6h35@reddit
280 TOPS INT8 / 140 TFLOPS FP16
LPDDR4X 96GB / 48GB VRAM
No_Hornet_1227@reddit
So it's really shit.
6uoz7fyybcec6h35@reddit
LPDDR4X has a bandwidth limit, but for local LLMs VRAM capacity is much more essential.
hardcore_aebanise@reddit
why? that's a pretty big number considering the card's power draw.
No_Hornet_1227@reddit
The card does what, 200-300 TOPS? A 5090 does 3300. If the card used like 50W I would agree, but it doesn't.
LuciusCentauri@reddit
It's already on eBay for $4,000. Crazy how just importing doubled the price (not even sure if tax is included).
loyalekoinu88@reddit
On Alibaba it's around $1,240 with the sale. That's like a third of the imported price.
DistanceSolar1449@reddit
Here are the specs that everyone is interested in:
Huawei Atlas 300V Pro 48GB
https://e.huawei.com/cn/products/computing/ascend/atlas-300v-pro
48GB LPDDR4x at 204.8GB/s
140 TOPS INT8
70 TFLOPS FP16
Huawei Atlas 300i Duo 96GB
https://e.huawei.com/cn/products/computing/ascend/atlas-300i-duo
96GB or 48GB LPDDR4X at 408GB/s, supports ECC
280 TOPS INT8
140 TFLOPS FP16
PCIe Gen4.0 ×16 interface
Single PCIe slot (!)
150W
For reference the 3090 does 284 TOPS INT8, 71 TFLOPS FP16, and 936 GB/s memory bandwidth
Linux drivers:
https://support.huawei.com/enterprise/en/doc/EDOC1100349469/2645a51f/direct-installation-using-a-binary-file
https://support.huawei.com/enterprise/en/ascend-computing/ascend-hdk-pid-252764743/software
vLLM support seems slow, llama.cpp support seems better.
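As a back-of-envelope check of what that bandwidth means for inference (a rough sketch; the model size is an assumption, decode on big models is mostly memory-bandwidth-bound, and the two-chip aggregate may not behave like one pool):
```python
# rough upper bound on single-stream decode speed for a bandwidth-bound LLM:
# every generated token streams all active weights through memory once
bandwidth_gb_s = 408   # Atlas 300i Duo aggregate spec (2 x 204.8 GB/s)
weights_gb = 40        # e.g. a ~70B model at 4-bit quantization
print(bandwidth_gb_s / weights_gb, "tokens/s best case, before any overhead")
```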
helgur@reddit
Under half the memory bandwidth of the 3090. I wonder how this GPU stacks up against Metal GPUs on inference. It's going to be really interesting seeing tests come out with these.
Miserable-Dare5090@reddit
M Ultra chips have twice the bandwidth at ~800GB/s.
Immediate-Alfalfa409@reddit
Yeah exactly... the numbers only tell part of the story. Until there are real-world tests, it's hard to know if the extra VRAM actually makes up for the bandwidth gap. I'm just waiting to see how it holds up in practice.
Front_Eagle739@reddit
Yeah, should be interesting. It's in the ballpark of an M4 Max, I think, but with 5x the FP16 TOPS, so it should be better at prompt processing, which is the real weakness for most use cases. If the drivers and support are any good, I could see myself grabbing a couple of these.
helgur@reddit
That's a good point. Six of these cards is still half the price of an Apple M3 Mac Studio with 512GB of unified RAM. The Studio was, before this, the budget go-to for a lot of "VRAM" (in quotes because it's really unified RAM on the Mac) at a reasonabl(er) price. If the drivers for these are solid, it's really going to be an excellent contender for a lot of different usages.
orkutmuratyilmaz@reddit
I am a registered Apple enemy, and I'd like to add that the Mac Studio has some other hardware components, such as a CPU, RAM, storage, etc. We need to calculate the price difference more accurately.
Front_Eagle739@reddit
Yeah, if the assumptions above are accurate (which I doubt, but hey), you just need any old cheap server with a bunch of PCIe lanes and a riser to make a reasonably potent large-VRAM LLM box. Half the inference speed and 5x the prompt processing speed of an M3 Ultra for 7-8 grand ish would be much more usable for a lot of tasks than a fully specced M3 Ultra. And I say this as a guy with a 128GB M3 Max MacBook that I use all the time for local inference and think is great.
vancity-boi-in-tdot@reddit
And the post title hilariously compared this to the RTX Pro 6000...
Bandwidth: 1.6 TB/s, bus width: 512-bit, memory technology: GDDR7 SDRAM
LOL
And why not compare this to a 5090 instead of a 3090, which was released 5 years ago? Bandwidth: 1.7 TB/s
I give Huawei an A for effort. I give this post title and any Blackwell comparison an F.
sassydodo@reddit
Localllama is full of bitter haters man
DistanceSolar1449@reddit
Why are you comparing it to the 5090? This GPU was released in 2022.
https://support.huawei.com/enterprise/en/doc/EDOC1100285916?idPath=23710424%7C251366513%7C22892968%7C252309139%7C252823107
https://e.huawei.com/cn/products/computing/ascend/atlas-300i-duo
BetterEveryLeapYear@reddit
Then the post is even worse
vancity-boi-in-tdot@reddit
Yeah like the title "finally"
DistanceSolar1449@reddit
It's only hitting the used market now, that's why. You can't buy it new without an enterprise service contract.
jonydevidson@reddit
On current Macs the biggest problem is that when the context gets big, generation becomes stupidly slow.
redditorialy_retard@reddit
How hard is it to run one of these with a 3090?
DistanceSolar1449@reddit
Basically impossible. You can run one or more of them, but don't try to mix them with CUDA.
akierum@reddit
Why would it not work for Ollama?
DistanceSolar1449@reddit
Doesn’t have the correct backend installed
akierum@reddit
Then what about ramalama?
redditorialy_retard@reddit
welp 2x 3090s it is
o5mfiHTNsH748KVq@reddit
204.8GB/s ain't it
jrherita@reddit
I'm really curious how they achieved that FLOPS level with only 150W TDP. The 3090 is 350W, and granted it's on an older Samsung 8nm process, I don't think Huawei has access to a much better process than that for manufacturing.
I think the 3090 still draws about 250-300W to achieve its full TOPS rating. Some of that power is probably the faster memory bandwidth, but...
Django_McFly@reddit
Those specs don't seem bad. It's wild that the VRAM is a generation behind what you see in the AI Max 395 and Spark, but they got it running noticeably faster.
Suppe2000@reddit
I've only got a Strix Halo 395 Max. It is quite nice, but the driver support is hell. And the Huawei card costs the same as the mini PC and has double the bandwidth.
jonahatw@reddit
It's a good price, but know that you're buying surveillance with it too. I owned a Huawei phone during the Hong Kong democracy protests, and when I searched on the phone for where to donate, it gave me a bunch of pro-PRC links. When I searched on my desktop PC, the results matched my query much better. I now expect similar manipulation in anything Huawei outputs.
Important_Concept967@reddit
Can't be worse than Google lol.
pedroserapio@reddit
From your results and my searches on the Chinese Taobao, I can find second-hand 3090 24GB cards for a price very similar to a new Huawei Atlas, around 6000 RMB. This gives me mixed feelings.
Achrus@reddit
Is that 150W TDP correct?? That’s impressively low for those specs!
DistanceSolar1449@reddit
It's a single-slot PCIe card (pic), so yeah, I doubt they're stuffing 300W into it; that'd be impossible to cool.
terminoid_@reddit
oof, that bandwidth...happy to see any kind of competition tho
MargretTatchersParty@reddit
You're going to pay a really high tariff to get that into the country.
loyalekoinu88@reddit
Probably. I’m not advocating for buying it btw. Just showing that it is available for purchase and that it’s 1/3 the cost of eBay. Tariff costs also depend on when you purchase it since that number keeps going up and down these days.
Glittering-Call8746@reddit
Link ?
Puzzleheaded-Suit-67@reddit
You need to pay the import duty when it's from Alibaba.
loyalekoinu88@reddit
I don’t normally buy but don’t they usually show you the cost in the checkout?
Puzzleheaded-Suit-67@reddit
It depends. Normally on an e-commerce site you buy from a store, and they include it in the price because they have to pay it, since they are the importer. On Alibaba, if you buy directly from a manufacturer/supplier, you are the importer, so they don't pay the import duty, but it's not always the case.
kinja88@reddit
Alibaba link?
loyalekoinu88@reddit
https://www.alibaba.com/product-detail/New-Huaweis-Atlas-300I-DUO-96G_1601450236740.html
hmmqzaz@reddit
You think it’s anything like plug and play on a normal PC?
Girafferage@reddit
Lol definitely not. Getting drivers for it is going to be a massive nightmare.
layer4down@reddit
Early days.
LeBoulu777@reddit
I'm in Canada and ordering it from Alibaba is $2,050 CAD including shipping. 🙂✌️ God bless Canada! 🥳
Spectrum1523@reddit
only $2k CDN for a card with worse performance than a 3090, sweet
Shadowarchcat@reddit
Braindead comment. Performance of a 3090, yeah, but the memory of an A100, a €40,000 card.
Spectrum1523@reddit
I can buy 96GB of DDR for like $100, so what?
Shadowarchcat@reddit
Next braindead comment. Keep going bro, I need to see how dumb you can get.
Yes, you can buy RAM. But then you're on CPU, not GPU. If you want to run LLMs on CPU you won't need that 3090 either. And speed is no question anymore whatsoever at this point.
Spectrum1523@reddit
I guess I don't understand the use case for a card that inferences so slowly, with a bunch of slow RAM? It is more power-efficient than a PC, so that's good.
NoFunction4531@reddit
spectrum1523...yeah your nickname checks out .
Minute_Effect1807@reddit
96 GB VRAM. You can get a mac with this much memory, but it's going to be more expensive.
Classic-Sky5634@reddit
Please let us know how it performs. I would like to know if you can run Ollama or ComfyUI on it.
Amgadoz@reddit
Please do a lot of benchmarks and share the results!
Enjoy!
Necromancius@reddit
Should have bought a used 3090 instead...
Newtonip@reddit
There may be duty on top of that.
Yellow_The_White@reddit
Unrelated thought: I wonder how much I could get a second-hand narco sub for.
sersoniko@reddit
There are services where you pay more for shipping, but they re-route or re-package the item so that you avoid import fees.
Weary-Willow5126@reddit
Don't they open the packaging in the US to check if it's actually what's claimed?
loonygecko@reddit
The thing is, some dude at customs doesn't know the actual going cost of an electronic component, so you can label it as a very cheap chip and say it cost $10, and that guy is not going to know, nor have time to check. He'll see it's an electronic component as per the paperwork and pass it through.
Weary-Willow5126@reddit
Makes sense
I asked because that was also a thing (kinda still is tbh) here in Brazil, but the success rate is awful nowadays sadly; everything gets taxed.
Back in the day we could literally just ask the seller to write that it's under $50 on the package and it would never get taxed lol.
But they started opening everything and checking the prices online... Now you have to be very lucky, like a 1 in 10 chance it doesn't get taxed 😩
loonygecko@reddit
Sounds like Brazil has increased enforcement greatly; that sucks for you. However, IME US customs is mostly just looking for drugs or illegal items; they are not taking the time to really check on pricing issues. Until recently, we also had an $800 exemption from tariffs, so they are probably now super overworked trying to handle the tariffs on all those small purchases now that there is no exemption at all.
landon912@reddit
Yea, that’s called crime.
It doesn’t matter where you ship from. It’s the country of origin which matters
OfficialHashPanda@reddit
Nah, crime is called the president.
luciferxf@reddit
Probably because of tariffs. If they will even ship it. The USA is in the middle of a shipping embargo; more than 25 countries won't ship here. China is one of them.
loonygecko@reddit
That's the middleman for you: they buy it, double the price, and resell it.
markole@reddit
This is a common thing for a fair amount of us in Europe. So happy that these tariffs incentivized the development of cheaper and more capable domestic GPUs though. /s
cyrixlord@reddit
I think the Chinese tariff is over 20%, so...
agentzappo@reddit
~55% tariffs will continue to keep the price high by the time it's imported into the US.
meshreplacer@reddit
At $4K I will buy the Apple 128GB M4 certified Unix workstation, aka the Mac Studio, all day long. Free computer and OS included with every purchase.
farnoud@reddit
Thank Trump for that
HillTower160@reddit
So much winning.
_Sneaky_Bastard_@reddit
Glowing-Strelok-1986@reddit
Not really when you consider eBay's fees, international and domestic shipping, taxes and the fact that the seller wants enough profit to be worth the risk of being scammed.
rexum98@reddit
There are many Chinese forwarding services.
sourceholder@reddit
Oh how the tables have turned...
FaceDeer@reddit
The irony will be lovely as American companies try to smuggle mass quantities of Chinese GPUs into the country.
Recoil42@reddit
Already starting to happen with cars.
lawldoge@reddit
Must be a local thing. I drive 600-1000 miles/week and have yet to see a Chinese-branded car on the road.
Recoil42@reddit
Do you live in the United States? Chinese cars are effectively banned in the United States.
There's plenty everywhere else in the world though.
Barafu@reddit
Meanwhile, me in Russia, still thinking how to run an LLM on a bear.
loonygecko@reddit
I thought Russians could order from Alibaba and China.
Barafu@reddit
No, only AliExpress, and then not everything. Besides, it comes with explicitly no warranties: if I pay for a GPU, but receive a cinder block, it is totally my problem.
loonygecko@reddit
IDK about Russia, but for the USA the first return you want to make is free. Also, companies on there get reviews on their products and are motivated to keep their reviews high; check reviews before buying. It is similar to PayPal. Overall, I have had good experiences on AliExpress. However, ordering anything very expensive on the internet is always going to be a bit worrying.
arotaxOG@reddit
Strap a bear to a typewriter, comrade. Whenever a stinki westoid prompts a message, whatever the bear answers is right, because no one questions angry vodka bear.
Moist-Topic-370@reddit
LMAO, you have to be delusional.
FaceDeer@reddit
You're right, simpler to just move their operations out of America entirely.
Moist-Topic-370@reddit
Yes, of course. OpenAI, Google, etc. are going to leave the United States to use Chinese GPUs. You, sir or ma'am, have such a bright future in geopolitical strategy.
loonygecko@reddit
For now they are ahead, but the issue is that China is gaining ground rapidly and the USA continually fails at making chips in America. China is able to make products at a much lower cost point, so once they catch up in technology, that will be a huge problem for the USA. Even before that, many companies do not need the fastest chips, and if China can offer decent mid-speed chips at a much lower price point, that will take a lot of business from American companies. Also, the limits on global selling imposed on US companies are shrinking their global market share for the fastest chips, leaving the market wide open for the first alternative that can make a good-enough product.
It's like a horse race where the horse that used to be way in the back has caught up to the main pack and continues to gain on the frontrunner, and the frontrunner is starting to look a bit tired. Everyone is watching the closing gap between them and wondering if anything else will happen to change the situation; otherwise the frontrunner will change in the coming years.
FaceDeer@reddit
You think those are the only companies using AI?
3000LettersOfMarque@reddit
Huawei might be difficult to get in the US, given that in the first term they were banned: base stations, network equipment, and most phones at the time were barred from import for use in cellular networks, for national security purposes.
Given AI is different yet similar, the door might be shut again for similar reasons, or just straight-up corruption.
Siddharta95@reddit
National security = Apple earnings
apodicity@reddit
I dare say it's time to make some apple pie.
Swimming_Drink_6890@reddit
Don't you just love how car theft rings can swipe cars and ship them overseas in a day and nobody can do anything, but try to import a car (or GPU) illegally and the hammer of God comes down on you? Makes me think they could stop the thefts if they wanted, but don't.
crantob@reddit
Smugglers avoid compliance with regulations. Regulations which ought to be better named what they are: "government interference".
When someone says 'that needs to be regulated', I now ask: "To exactly whom do you want to give the plenary power to interfere with voluntary transactions?"
apodicity@reddit
And yeah, they are government interference, obviously, but that doesn't make the interference automagically bad, unfair, undesirable, etc. If there were no government, people would eventually agree on an authority which does more or less the same thing. The devil is in the details; you can't make some kinda a priori judgement about stuff like this (unless u wanna be a crank).
Regulatory failures exist. So do market failures.
apodicity@reddit
Don't say "voluntary transactions" in this context unless u want people to look at u askance. Just say "transactions". An "involuntary transaction" is referred to as a "forced sale".
"A forced sale is an involuntary transaction in which the sale is based upon legal and not economic factors, such as a decree, execution, or something different than mere inability to maintain the property. If the sale is made for purely economic reasons, it is considered voluntary."
See what I mean?
https://www.law.cornell.edu/wex/forced_sale
Imaginary-Hour3190@reddit
Uh oh, he is starting to wake up, SHUT IT DOWN
MelodicRecognition7@reddit
Now think about why drugs are illegal and what would change if, for example, coke was legal. Except a few govt officials losing a huge gesheft from smuggling it, of course.
PsychologicalOne752@reddit
Every illegal export is now sponsored by Bitcoin, and the POTUS is more invested in Bitcoin than in the US dollar, so why would he want to stop illegal exports?
Bakoro@reddit
They can't stop the thefts, but they could stop the illegal international exports if they wanted to, and don't.
No_Hornet_1227@reddit
They could ban GPU exports to Singapore and Hong Kong for their part in helping China evade sanctions, and tell Jensen that if Nvidia keeps obviously closing its eyes to obvious sanction-evading GPU sales, Jensen himself will go to prison for many years... but they won't, because in America the rule of law doesn't exist when money is involved.
Suitable-Bar3654@reddit
Why stop it if it makes money? Besides, this isn't theft, it's a purchase.
Swimming_Drink_6890@reddit
I'm not sure what you mean.
hackeristi@reddit
That makes you wonder if Nvidia had intel that they were going to do this and told the US to ban them haha.
3000LettersOfMarque@reddit
It was way back in May of 2019. While checking the date, I looked up whether this card would be included. It is; the whole company is under trade sanctions for its connections to the Chinese military.
For US citizens or entities, it's basically treason to do any business buying this card, even from a 3rd party; for non-US entities, it complicates any future business with the US or with entities that conduct business with the US.
AnduriII@reddit
Luckily I'm not in the US 🤗
brutal_cat_slayer@reddit
At least for the US market, I think importing these is illegal.
NoForm5443@reddit
Which laws and from which country do you think you would be breaking?
MedicalScore3474@reddit
https://www.huaweicentral.com/us-imposing-stricter-rules-on-huawei-ai-chips-usage-worldwide/
US laws, and if they're as strict as they were with Huawei Ascend processors, you won't even be able to use them anywhere in the world if you're a US citizen.
PraxicalExperience@reddit
US laws block the export of certain chips to China, and block certain government users from using Huawei products, but as far as I'm aware there's nothing in place that would stop Joe Blow from importing as many of these as he'd like.
a_beautiful_rhind@reddit
Sounds difficult to enforce. I know their products can't be used in any government/infrastructure in the US.
If you try to import one, it could get seized by customs and that would be that.
Yellow_The_White@reddit
Anyone big enough for the scale to matter would be too big to hide. It would probably prevent Amazon from setting up a datacenter in Canada or something.
robbievega@reddit
I think it's the other way around? Selling them to China was forbidden. Those tariffs ain't gonna help though.
Ansible32@reddit
Huawei is embargoed, for no publicly stated reason. Though it's implied US intelligence services know for a fact that Huawei puts backdoors in their hardware, there have been no explanations of exactly how.
chinese__investor@reddit
JustFinishedBSG@reddit
Huawei is under sanctions
loyalekoinu88@reddit
Huawei's New Atlas 300i Duo 96G DeepSeek AI GPU Server Inference Card with Fan Cooler, Made in China, on Alibaba.com
Antique_Bit_1049@reddit
GDDR4?
anotheruser323@reddit
LPDDR4x
From their official website:
LPDDR4X 96GB or 48GB, total bandwidth 408GB/s, support for ECC
boissez@reddit
Hardly better than Strix Halo then (96GB at around 273 GB/s).
StevenSamAI@reddit
I'd say 408GB/s is significantly better.
Puzzlesolver01@reddit
It's a dual card, so 2 x 204GB/s. Let's hope the layers split well across the 2 x 48GB and the interconnect between the two GPUs is fast enough. Benchmark first before bragging, not the other way around, please.
vancity-boi-in-tdot@reddit
Compared to the 1.7 TB/s of an RTX Pro 6000 (and without the CUDA cores), which is what the post title compared against? Hmm.
Caffdy@reddit
Be 49% faster.
Call it "hardly better". smh
michaelsoft__binbows@reddit
Damn what is the bit width on this thing!
xugik1@reddit
Should be 768 bits. (768 bits x 4266 MT/s / 8 ≈ 408 GB/s)
MelodicRecognition7@reddit
No, see specs above: each chip has 200 GB/s LPDDR4X speed (8-channel x 3200 MHz?); 400 GB/s is the Frankenstein card with two separate graphics chips, each with its own memory chips.
Dgamax@reddit
2x 512bits I suppose?
Dgamax@reddit
LPDDR4X? Why? 😑 This is sooo slow.
UsernameAvaylable@reddit
Cause large and fast memory costs money.
BlueSwordM@reddit
LPDDR4X has a massive production surplus because of the older phone flagships that used it, plus some older phones still using it.
Still, bandwidth is quite decent at 3733-4266 MT/s.
Due_Tank_6976@reddit
You get a lot of it for cheap, that's why.
loyalekoinu88@reddit
Looks to be the case
YouDontSeemRight@reddit
Bandwidth?
Legitimate-Novel4734@reddit
No.
krste1point0@reddit
"Memory options include 48 GB or 96 GB LPDDR4X total on the Duo, with an aggregate memory bandwidth listed around 408 GB/s (effectively 2 × ~204 GB/s across the two processors). "
Legitimate-Novel4734@reddit
8-channel LPDDR4X only hits 68.3GB/s total at 4266MT/s. You best check your math and not buy into China's shit too much.
krste1point0@reddit
Huawei uses a much wider aggregate memory interface per chip, effectively 384 bits (24 lanes x 16-bit equivalent), which at 4266 MT/s per pin computes to 204.8 GB/s per chip, or 408GB/s for this GPU since it has 2 chips.
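The arithmetic, for anyone checking (using the figures claimed above):
```python
# back-of-envelope check of the claimed bandwidth figures
bus_bits = 24 * 16      # 24 lanes x 16-bit = 384-bit interface per chip
mt_per_s = 4266e6       # LPDDR4X transfer rate per pin
per_chip = bus_bits * mt_per_s / 8 / 1e9
print(per_chip)         # ~204.8 GB/s per chip; x2 chips = ~409.6 GB/s total
```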
Legitimate-Novel4734@reddit
Huh, Huawei went all out on bus width; I guess it's impressive on paper. But it's still useless for most people. You're stuck with custom forked Chinese builds of TensorFlow or PyTorch, no CUDA support, and worse-than-AMD driver support. That's a giant question mark a year down the line.
Mediocre-Method782@reddit
You need to learn to be more embarrassed about posting cope online.
Legitimate-Novel4734@reddit
You should go reee here: https://www.reddit.com/r/LocalLLaMA/comments/1n4wo0y/the_huawei_gpu_is_not_equivalent_to_an_rtx_6000/
Mediocre-Method782@reddit
You should stop larping about the childish drama of gods larping in the heavens.
Legitimate-Novel4734@reddit
I don't have an instagram and don't work for a corporation (well not a typical one anyway) so I still have no idea what you are talking about.
Legitimate-Novel4734@reddit
wut? are you drunk?
Mediocre-Method782@reddit
Corporations, you simple-minded Instagram fuckwit.
Legitimate-Novel4734@reddit
I don't care about manufacturing; I care about software support and capability. I do kind of like that the drivers and firmware are on GitHub; Nvidia can suck it on that. However, it's still going to suffer unless full architecture details are released so the drivers can be made correctly, and I mean FULL details. We haven't really had LPDDR4X on much of anything since 2021. It's a 65W card; good luck, your training will get there in the end, sure... in about 5 months.
The low power, outdated VRAM, and Huawei seemingly already staging to rely on open-source development mean a few things.
The driver support is gonna be shit. (See previous.)
This card that is supposed to break the monopoly? It isn't even on AMD's or Nvidia's radar. The DGX Spark is going to kill any part of this card that comes close to looking good when it releases. I guarantee you China's big boys aren't even looking at this card; this is an enthusiast compute card at BEST, for people who just can't get the real hardware imported.
CreativeDimension@reddit
BandNarrow?
NickCanCode@reddit
~400GB/s
shaq992@reddit
LPDDR4X
https://e.huawei.com/cn/products/computing/ascend/atlas-300i-duo
ttkciar@reddit
Interesting... compute performance about halfway between an MI60 and an MI100, at half the bandwidth, but oodles more memory.
Seems like it might be a good fit for MoE?
Thanks for the link!
shaq992@reddit
Found an english version
https://support.huawei.com/enterprise/en/doc/EDOC1100285916/426cffd9/about-this-document
gK_aMb@reddit
LPDDR4x
Rich_Repeat_22@reddit
hmm so a 10% faster AMD 395 with almost double the bandwidth 🤔
firewire_9000@reddit
150W? Looks like a low-power card with a lot of RAM.
sheepyowl@reddit
It's probably made for students who need to run a local AI (like DeepSeek) for research. For these things it will run slowly on weak hardware, but it won't run at all if the GPU doesn't have enough RAM.
Apparently AI is big in Chinese universities/colleges.
They also modify Nvidia GPUs for this purpose (they add A LOT of VRAM). You will be able to see it in GN's video once the copyright strike troll expires.
Swimming_Drink_6890@reddit
Typically cards are undervolted when running inference.
Caffdy@reddit
Amen to that; it's about damn time we start to see low-power inference devices.
OsakaSeafoodConcrn@reddit
What drivers/etc would you use to get this working with oobabooga/etc?
loyalekoinu88@reddit
No idea. I looked up inference engines and it looks like it might only be supported in vLLM, and only for certain models.
TheTerrasque@reddit
I guess llama.cpp with its CANN backend might work. The Atlas 300I Duo is mentioned as supported.
jrherita@reddit
It has FLOPS, yes...
arglarg@reddit
Single-slot, half-height half-length PCIe card
AI computing power: 140 TOPS INT8, 70 TFLOPS FP16
Memory: LPDDR4X 48 GB, total bandwidth 204.8 GB/s, supports ECC
CPU: 8 cores @ 1.9 GHz
Codec capabilities: H.264/H.265 hardware decoding, 128 channels of 1080p 30 FPS (or 16 channels of 3840x2160 60 FPS); H.264/H.265 hardware encoding, 24 channels of 1080p 30 FPS (or 3 channels of 4K 60 FPS); JPEG decoding capability 4K FPS, encoding capability 4K 192 FPS, maximum resolution 8192x8192
PCIe interface: PCIe x16 Gen4.0
Maximum power consumption: 72W
Operating temperature: 0°C~55°C (32°F~131°F)
Dimensions: 169.5mm (length) x 18.45mm (width) x 68.9mm (height)
w3bCraw1er@reddit
I don't think it will flop. It may take time, but it could be pretty solid competition for NVDA.
No_Efficiency_1144@reddit
Haha, FLOPS is how you measure the speed of a graphics card.
cnydox@reddit
140 TOPS INT8 / 70 TFLOPS FP16
shing3232@reddit
280 TOPS INT8
TheAIPU-guy@reddit
This is not a GPU though. It's an NPU.
GreatGatsby00@reddit
Can I order that on Alibaba?
nemuro87@reddit
Fucking finally, someone has woken up.
Unfortunately it's Huawei, and it has a history of inbuilt hardware spyware.
Hope Intel or AMD wakes up too.
CeFurkan@reddit (OP)
More companies will come too
Due_Investigator3288@reddit
People try to compress these things down to one parameter and are then surprised when their cheap hardware doesn't perform. Memory is only a small part of the problem. Huawei cards can't do double- and float-precision math. At that point even a Raspberry Pi can beat this card in some tasks. Even for training you still want 32 bits for some tasks; afaik the Huawei cards can't do that.
Nexter92@reddit
If it's the same performance as RTX 4090 speed with 96GB, what a banger
GreatBigJerk@reddit
It's not. It's considerably slower, doesn't have CUDA, and you are entirely beholden to whatever sketchy drivers they have.
There are YouTubers who have bought other Chinese cards to test them out, and drivers are generally the big problem.
Chinese hardware manufacturers usually only target and test on the hardware/software configs available in China. They mostly use the same stuff, but with weird quirks due to Chinese ownership and modification of a lot of what enters their country. Huawei has its own (Linux-based) OS, for example.
TheThoccnessMonster@reddit
And power consumption is generally also dog shit.
PlasticAngle@reddit
China is one of the few countries that doesn't give a fuck about power consumption, because they produce so much that they don't care.
8P8OoBz@reddit
*buy from Russia
FyreKZ@reddit
China overwhelmingly produces its own power lol, 9.4 trillion kWh last year. They don't need Russian oil.
cockerspanielhere@reddit
Yeah, that explains why China is at all-time highs of coal imports.
FyreKZ@reddit
Yep, they burn a lot of coal, but unlike the US they are clearly progressing towards a combination of nuclear and renewables rather than regressing.
I don't particularly like China and don't like defending them, but at least they are genuinely trying with renewables and have a long-term plan, rather than the US's fascistic flip-flopping.
cockerspanielhere@reddit
If they produced their own power, why would they IMPORT RECORD AMOUNTS OF COAL?
You're mixing reality with morals and wishes.
FyreKZ@reddit
Because domestic output rates were reduced and domestic prices increased, so it was cheaper to import.
The reality is that China is reducing its dependency on coal and natural gas in favour of renewables whereas other countries fail.
cockerspanielhere@reddit
Saying they don't "need" Russian oil ignores the nuances of energy security and China's large-scale crude oil and LNG imports (China is a significant oil and gas importer for its overall energy needs, though less so for electricity generation). That's as delulu as it gets.
China is a net oil and LNG importer for fuels, not for power. If you think energy equals electricity, you should go back to school. Even for an electricity fanboy, 56% of it is generated with coal.
koeless-dev@reddit
I might be wrong, but I interpret 8P8OoBz's comment as playing on the words "power hungry". Yes, we're talking about resources, not people, so a bit of a stretch, but hm. Just giving what I believe they are referring to.
cockerspanielhere@reddit
You're getting downvoted by ignorants. China's energy balance is -400 billion USD per year.
apodicity@reddit
Huh? That makes no sense. Power consumption is dictated by market incentives.
chlebseby@reddit
Does this rule apply to computer equipment or to products in general?
I have a lot of Chinese devices and they seem to have typical power draw.
sheepyowl@reddit
If it's a product made for global competition it should be fine.
If it's a product made for China, think twice
emprahsFury@reddit
China is energy constrained.
twavisdegwet@reddit
https://fortune.com/2025/08/14/data-centers-china-grid-us-infrastructure/
stoppableDissolution@reddit
Which is the way.
shing3232@reddit
about 150W max
TheThoccnessMonster@reddit
I meant the Huawei data center cards.
LettuceElectronic995@reddit
this is huawei, not some shitty obscure brand.
GreatBigJerk@reddit
Sure, but they're not really known for consumer GPUs. It's like buying an oven made by Apple: it would probably be fine, but in no way competitive with the industry experts.
hydraulix989@reddit
They once said that about Apple's brand new phone at the time.
GreatBigJerk@reddit
Sure, and I hope Huawei does well. They probably will make a good consumer GPU eventually.
This isn't one due to the memory bandwidth though.
somepotato5@reddit
I mean, we did start buying phones from Apple 2 decades ago, and look what happened.
LettuceElectronic995@reddit
lets see
ChloeNow@reddit
I give it 6 months before the drivers are up to par, a year before it's basically the equivalent of AMD, and a year and a half before it's on par with Nvidia, AI support and all.
I would say longer for AI support, but I see a lot of open-source tools have started loosely supporting (or trying to support) AMD, and once all these apps have modularized the parts of the code dealing with specific hardware architectures, supporting a new card will be much faster than that initial battle.
pier4r@reddit
What blows my mind, or rather punctures the AI hype, is exactly the software advantage of some products.
Given the hype around LLMs, it feels like (large) companies should be able to create a user-friendly software stack in a few months (to a year) and close the SW gap to Nvidia.
CUDA's years of head start created a lot of tools, documentation and integrations (i.e. PyTorch and whatnot) that give Nvidia the advantage.
With LLMs (with the LLM hype, that is), one should in theory be able to close the gap a lot faster.
And yet the reality is that neither AMD nor others (who have spent even less time on the matter than AMD) can close that gap quickly. This while AMD and the Chinese firms aren't exactly lacking the resources to use LLMs. Hence LLMs are useful, but not yet that powerful.
Pruzter@reddit
lol, if LLMs could recreate something like CUDA we would be living in the golden age of humanity, a post-scarcity world. We are nowhere near this point.
LLMs struggle with maintaining contextual awareness for even a medium-sized project in a high-level programming language like Python or JS. They are great for helping write small portions of your program in lower-level languages, but the lower-level the language, the more complex and layered the interdependencies of the program become. This translates into requiring even more contextual awareness to program effectively. AKA, we are a long way off from LLMs being able to recreate something like CUDA without an absurd number of human engineering hours.
wuu73@reddit
ummmm... did you know that they literally solved protein folding? they've invented new medications, but i mean, go ahead and think that, more compute for me lol
Pruzter@reddit
I am like the most pro-LLM person there is; I burn millions of tokens a day using these things for programming. That's how I know exactly where the current leverage and pain points are.
pier4r@reddit
I am not saying they'd do it on their own, like AGI/ASI.
Rather that they could help devs so much that the devs speed up and narrow the gap. But that doesn't happen either. So LLMs are helpful, but not that powerful. As you put it well, as soon as the code becomes tangled in dependencies, they cannot handle it, even if it fits their context window.
AnExoticLlama@reddit
I believe they were referring to the LLM hype = using it to fund devs with the purpose of furthering something like Vulkan to match CUDA.
Lissanro@reddit
Current LLMs are helpful, but not quite there yet to help much with low-level work like writing drivers or other complex software, let alone hardware.
I work with LLMs daily, and know from experience that even the best models out there in both the thinking and non-thinking categories, like V3.1 or K2, make not just silly mistakes, but struggle to notice and overcome them even when pointed out. Even worse, when there are many mistakes that form a pattern, they are more likely to make more mistakes like that than to learn (through in-context learning) to avoid them, and, likely from overconfidence, they often cannot produce good feedback about their own mistakes, so an agentic approach cannot solve the problem either, even though it helps mitigate it to some extent.
The point is, current AI cannot yet easily "reduce the gap" in cases like this; it can improve productivity, though, if used right.
No_Hornet_1227@reddit
Yup, my brother works at a top AI company in Canada, and a ton of companies come to see them to install AI at their company... and basically all the clients are like: we can fire everyone, the AI is gonna do all the work! My bro is like: you guys are so very wrong, the AI we're installing that you want so much isn't even CLOSE to what you think it does... we've warned you about it... but you want it anyway, so... we're doing it, but you'll see.
Then a few weeks/months later, the companies come back and are like: yeah, these AIs are kinda useless, so we had to re-hire all the people we fired... My bro is like: no shit, we told you, but you wouldn't believe us!
A lot of rich assholes in control have watched The Matrix too many times and think that's what AI is right now... Microsoft, Google and all the big corporations firing thousands of employees to focus on AI? The same blowback is gonna happen to them.
Sabin_Stargem@reddit
Much as I like AI, it isn't fit for prime time. You would think that people wealthy enough to own a company would try out AI themselves and learn whether it is fit for purpose.
TheTerrasque@reddit
Yep. When AI became popular I really looked into providing local AI inference for businesses, but I realized that where the tech actually was and where people thought it was were too far apart, and it would be a catastrophe.
Sabin_Stargem@reddit
Hypothetical: having 2+ different AI models from different families might be able to correct each other, because they see different aspects of their output. That would require much stronger hardware to run multi-core...(multi-mind?) AI.
Hopefully we can see whether this works within a decade or so.
pier4r@reddit
And I am talking mostly about this. Surely AMD (and other) devs can use it productively and thus narrow the gap, yet it is not as fantastic as it is sold. That was my point.
TheTerrasque@reddit
What I've noticed is that the more technical the code, the more terrible the LLM. It's great and very strong when I'm writing something in a new language I'm learning, and it can explain things pretty well.
Getting it to help me debug something in languages I've had years of experience in? It's pretty useless.
I'm guessing "join hardware and software to replicate a cutting-edge, super complex system" with LLMs will at best be an exercise in frustration.
Ensiferum@reddit
"In a few months to a year"
My man has clearly never worked in enterprise level software development. Whatever you think the complexity of such a project is, multiply it by 5.
pier4r@reddit
I do. That's the point. If the hype were real, "augmented" devs under pressure from management could narrow the gap in a short time. Maybe a year is short; then two.
The point being, the hype is far from reality.
I thought it was clear that I was shrinking the needed time by a lot to match the hype and disprove it.
yogthos@reddit
there's work being done here already https://www.tomshardware.com/pc-components/gpus/chinas-moore-threads-polishes-homegrown-cuda-alternative-musa-supports-porting-cuda-code-using-musify-toolkit
BusRevolutionary9893@reddit
There are also Chinese hardware manufacturers like Bambu Lab who basically brought the iPhone equivalent of a 3D printer to the masses. Children can download and print whatever they want right from their phones. From hardware to software, it's an entirely seamless experience.
GreatBigJerk@reddit
That's a piece of consumer electronics, different from a GPU.
A GPU requires drivers that need to be tested on an obscene number of hardware combos to hammer out the bugs and performance issues.
Also, I have a Bambu printer that was dead for several months because of the heatbed recall, so it's not been completely smooth.
BusRevolutionary9893@reddit
Um... a consumer GPU is a consumer electronic. You're right that GPU drivers will take more testing on different hardware configurations, but there is also a ton more money to be made with GPUs than 3D printers.
I've never had an issue with my P1S and two AMSes. You're not giving them their due credit for changing the market from hobbyists who like tinkering with 3D printers to grandmothers with no technical experience being able to make crafts to sell at fairs.
GreatBigJerk@reddit
I didn't say the printer was bad... I love mine. That's why I waited for a replacement bed instead of getting a refund. It's just not flawless.
Anyway it seems like we agree on the GPU thing.
wektor420@reddit
Still having enough memory with shit support is better for running llms than nvidia card without enough vram
No_Hornet_1227@reddit
AMD is planning to give consumers a very strong APU with basically unlimited LPDDR5X/6 for AI next year or in 2027. Then everyone will be able to do AI at home for a good price.
But yeah, AMD/Intel should wreck Nvidia and release sub-$1500 48GB VRAM AI GPUs... whoever does that first will make a ton of money.
wektor420@reddit
And then you remember that AMD's CEO is a cousin of Nvidia's CEO.
And Intel is in deep trouble.
ifupred@reddit
Let them cook. Monopoly is horrible
simracerman@reddit
Care to share some sources?
GreatBigJerk@reddit
I don't have time to hunt around, but here's a Tech Tips video from a couple years ago: https://youtu.be/YGhfy3om9Ok
simracerman@reddit
LOL. Doesn't have time to hunt around, but pastes a 2-year-old video in an industry that transforms monthly.
Don't have time for bogus allegations too.
Suitable-Economy-346@reddit
Are they not open source?
Emergency-Author-744@reddit
Based on specs it looks more like a 3090 on the compute side, and about half the speed on bandwidth.
am0x@reddit
Hardware is one thing, but the software underneath is what brings it to life.
Uncle___Marty@reddit
And for less than $100. This seems too good to be true?
TechySpecky@reddit
? Doesn't it say 13500 yuan which is ~1900 USD
ennuiro@reddit
seen a few for 9500 RMB which is 1350USD or so on the 96gb model
Glittering-Call8746@reddit
Where ?
Uncle___Marty@reddit
Yep, you're right. For some stupid reason I got Yen and Yuan mixed up. Appreciate the correction.
Still, a 96 gig card for that much is so sweet. I'm just concerned about the initial reports from some of the Chinese labs using them that they're somewhat problematic. REALLY hope that gets sorted out, as Nvidia pwning the market is getting old and stale.
Sufficient-Past-9722@reddit
Fwiw it's the same word, like crowns & koruna, rupees and rupiah etc.
mintybadgerme@reddit
Er no it's not. Yen is Japanese. Yuan is Chinese. Two different countries, see? :)
Sufficient-Past-9722@reddit
Haha ok depending on the quant level, same token €:
mintybadgerme@reddit
Wow, slop.
Shiftiez@reddit
¥
Uncle___Marty@reddit
*learned something new today*
Cheers buddy :)
Ansible32@reddit
This isn't all that far from the Nvidia Jetson Thor which is only $3500.
TheRealMasonMac@reddit
Probably misread it as Yen.
smayonak@reddit
Am I reading your comment too literally or did I miss a meme or something? This is Chinese Yuan not Japanese yen, unfortunately. 13,500 Yuan is less than $2,000 USD, but importer fees will easily jack this up over $2,000.
Uncle___Marty@reddit
Nah, you read it correct. I'm wearing my stupid hat today apparently. Appreciate the good people of the sub correcting me :)
smayonak@reddit
Darn, I was hoping I was wrong or something. The Mi50 can be found for like $150 to $180, depending on the vendor and VRAM config, and those things have like 32GB. The Atlas card only uses 150 watts, so $100 isn't completely unrealistic if we only use VRAM as the benchmark for prices.
LatentSpaceLeaper@reddit
It's ¥13,500, so just below $1,900.
HarambeTenSei@reddit
it very likely isn't. Even the deepseek people couldn't get their models to train on these cards
Euphoric_Oneness@reddit
It's fake news.
ennuiro@reddit
Around 75% of a 4090 on Llama 3 8B, from what I've seen.
stoppableDissolution@reddit
Still better than CPU, and potentially better than a 3090. Probably a lot of driver/software inefficiencies too.
And tbh, 75% of a 4090 with 4x the VRAM in one PCIe slot is a great deal anyway. Hope there will be proper driver support.
fallingdowndizzyvr@reddit
It's not the same speed as the 4090. Why do you even think it is?
Hour_Bit_5183@reddit
They are smarter and bolder. I love China! The only reason 'murica is so mad is because we can't make shit without monopolizing and cucking... meanwhile they beat us in every category due to educating their people. There are like a billion trade classes you can take there, meanwhile we have BS institutes like Harvard and the like that prioritize sports, which is really just gambling! FAIL!!!! I wish I were born in Canada or anywhere else really.
apodicity@reddit
Huh? Look, your nationality says nothing about how intelligent you are or anything else. China does NOT beat "us" in every category. Wtf are you talking about? Compared to the US, their pharmaceutical development is pathetic, for instance. Prioritize sports? Right. Harvard has over 10x the value in research facilities as in athletic facilities.
https://finance.harvard.edu/files/fad/files/fy24_harvard_financial_report.pdf
Hour_Bit_5183@reddit
LOL you used big pharma as an example! You mean the one that destroys lives and causes more problems than it solves? LOLOL. They have way more schools for different things, and more in general. They do best us. Don't ever link me to scam universities ever again either! Don't even get me started about them.
fallingdowndizzyvr@reddit
Finally? The 300I has been available for a while. It even has llama.cpp support.
https://github.com/ggml-org/llama.cpp/blob/master/docs/backend/CANN.md
profcuck@reddit
Not with 96gb though, right? That's the interesting development here as I understand it.
fallingdowndizzyvr@reddit
In China, they were talking about the 96GB model a few months ago. Also, it's $1500 on AE, not $2000.
But even at $1500, you are better off getting a Max+ 395. For $1500 you can get a 96GB Max+ 395 with a modern CPU/GPU. That will perform much better. It's also a GPU, so you can play games on it. This is an NPU, and thus what you can do with it is much more limited.
Zyj@reddit
The AI Max+ 395 has half the memory bandwidth and with this card you can stick 6-7 of them onto a Threadripper Pro mainboard.
fallingdowndizzyvr@reddit
The Max+ 395 has more memory bandwidth. This is a dual card. It has 2 GPUs at 204GB/s each. Contrary to the spec sheet claim, that does not add up to 408GB/s. Also, the Max+ 395 has more compute.
And you can daisy chain a pile of Max+ 395s through USB4.
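For what it's worth, multi-box inference is already a solved problem in llama.cpp via its RPC backend, so a pile of networked machines can serve one model. A minimal sketch, assuming a build with -DGGML_RPC=ON; the IPs, port, and model path are made up:

    # on each worker machine
    ./build/bin/rpc-server --host 0.0.0.0 --port 50052
    # on the main machine, spreading layers across the workers
    ./build/bin/llama-cli -m model.gguf -ngl 99 --rpc 192.168.1.10:50052,192.168.1.11:50052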
Zyj@reddit
The Ryzen has 50 TOPS int8, this has 280 TOPS
fallingdowndizzyvr@reddit
INT8? What software are you running LLMs on that uses INT8? It uses FP16, FP8 and, on newer GPUs, FP4.
Also, did you typo and put a 0 onto the end of that? According to Huawei, the 310I has 22 TOPS INT8.
"up to 22 TOPS INT8"
https://support.huawei.com/enterprise/en/doc/EDOC1100079295/3656aeb1/performance
metallicamax@reddit
https://e.huawei.com/cn/products/computing/ascend/atlas-300i-duo
pjakma@reddit
So what is the software support for these? Are there open-source drivers and libraries for these GPUs?
ttkciar@reddit
Yes, PyTorch and llama.cpp support the Ascend NPU.
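As a rough sketch of what the PyTorch side looks like: Ascend support comes through Huawei's out-of-tree torch_npu plugin rather than stock PyTorch, so usage is roughly this (assuming the plugin and CANN toolkit are installed; the shapes are just illustrative):

    import torch
    import torch_npu  # Huawei's plugin; registers the "npu" device type

    x = torch.randn(2, 3).to("npu:0")  # move a tensor onto the Ascend card
    y = (x @ x.T).cpu()                # matmul runs on the NPU, result copied back
    print(y)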
pjakma@reddit
Interesting. What's the underlying driver? Which library supports the hardware? I.e., what is the programming spec for the hardware?
ttkciar@reddit
CANN
JeffDunham911@reddit
Companies from China really know how to humble companies that sell overpriced shit in the US, if anything.
apodicity@reddit
That depends on the product. It's that simple.
AncientProduce@reddit
Uh-huh, you honestly think this thing's legit? Or better than an existing card?
Do you swear by Palit reliability?
Low_Cow_6208@reddit
Fk China. The only thing I like less than our bastard monopolist is a totalitarian government monopoly that will betray you the second it achieves its political goals.
apodicity@reddit
It doesn't matter wtf they do domestically from an international perspective because other players are always free to enter the market. People will keep buying from China so long as it's cheaper.
lightningroood@reddit
meanwhile the chinese are busy smuggling nvidia gpus
Prinzmegaherz@reddit
Didn’t Trump need to lift those export controls after the lost trade war?
Purple_Errand@reddit
Since when did the USA lose a trade war? Lmao.
People really just read what's at the top but never actually search. I'm not from the USA, but stop riding the same waves, brother.
apodicity@reddit
Trade wars don't have winners.
__some__guy@reddit
It's apparently 2 GPUs with 204 GB/s memory bandwidth each.
Pretty terrible, and even Strix Halo is better, but it's a start.
Ilovekittens345@reddit
I remember the time when China would copy Western drone designs and all their drones sucked! Cheap bullshit that did not work. Complete ripoffs. Then 15 years later, after learning everything there was to learn, they lead the market and 95% of drone parts are made in China.
The same will eventually happen with GPUs, but it might take another 10 years.
purpledollar@reddit
I think we’re past the point where it’s gonna take china 10 years to catch up. In fact I think we’re gonna see the roles reverse soon. Just look at their auto industry.
Ilovekittens345@reddit
I am specifically talking about surpassing Nvidia, which probably means coming up with an ecosystem that does not run on CUDA. That will take some time. More than 5 years. Plus, if companies like ASML in the Netherlands decide to cut off China for some geopolitical reason, China will have a problem. A solvable problem, but a problem nonetheless.
apodicity@reddit
Yup. Ultimately, it's mostly a matter of time. Well, at least that's how this stuff has usually played out in the past. Like IBM in the early days of the PC market.
ChloeNow@reddit
You can also, in each industry, see a point where a cheap product improves to a "good enough" point that it becomes a bit ridiculous to buy the more expensive one given the price difference.
Pepeshpe@reddit
Good on them for not giving a crap about patents or any other bullshit.
Emergency_Beat8198@reddit
I feel Nvidia has captured the market because of CUDA, not because of the GPUs.
Tai9ch@reddit
CUDA is a wall, but the fact that nobody else has shipped competitive cards at a reasonable price in reasonable quantities is what's prevented anyone from fully knocking down that wall.
Today, llama.cpp (and some others) works well enough with Vulkan that if anyone can ship hardware that supports Vulkan with good price and availability in the >64GB VRAM segment, CUDA will stop mattering within a year or so.
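For anyone who wants to check that claim themselves, the Vulkan path in llama.cpp is a two-liner these days. A sketch, assuming a Vulkan-capable driver is installed and model.gguf is a placeholder:

    cmake -B build -DGGML_VULKAN=ON && cmake --build build --config Release
    ./build/bin/llama-cli -m model.gguf -ngl 99   # -ngl 99 offloads all layers to the GPU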
gmdtrn@reddit
Many would argue that AMD does already. Not this good, but indeed the issue is CUDA. That edge NVIDIA has now won't last forever, though. And I hope it's sooner rather than later that they have to start playing nice.
Tai9ch@reddit
For ML, AMD doesn't really have any competitive offerings until you get up to the new server stuff (e.g. Radeon Instinct 300 series).
I really want their stuff to be good. I've been looking for a decent new ML card that isn't Nvidia for the past year. But AMD just won't do it. They won't significantly beat Nvidia on RAM. They won't break from their "$50 cheaper than Nvidia" price curve. And they won't take compute drivers seriously for gaming cards.
And AMD has the further problem that they're really competing not with new Nvidia cards but with used Nvidia cards. Would you rather have an RTX 3090 for $800 or a Radeon 7900 XTX for $1000? They've got the same VRAM, and the cheaper one supports CUDA while the other one doesn't.
The server card market almost seems worse. AMD has several old cards that don't even seem to exist used, presumably because they never really existed or sold new.
Galactic_Neighbour@reddit
To be fair, AMD is sometimes cheaper or sometimes has more VRAM, but sometimes it's both. The RX 7900 XTX had 24 GB while its competitor at the time RTX 4080 only had 16 GB. This has been a thing for a few generations now, but I feel like hardly anyone ever notices. In the current generation RX 9070 has 16 GB, but RTX 5070 only has 12 GB (and it's also more expensive and uses more power). But yeah, the software support in ROCm is terrible sometimes and the progress is slow. The RX 9070 is probably still not fully supported.
You can probably buy a used RX 7900 XTX too and it has a similar performance to RTX 4080. In games that is. In AI I have no idea, since AMD doesn't publish any benchmarks as far as I know and the popular tech media aren't competent in this stuff. And that's a part of the problem. People think that AMD cards don't work in AI or that they are very bad. I run LLMs and diffusion models just fine, though.
wrongburger@reddit
For inference? Sure. But for training you'd need it to be supported by pytorch too no?
EricForce@reddit
99% of the time, a person getting into AI only wants inference. If you want to train, you either build a $100,000 cluster or you spend a week fine-tuning, where the bottleneck is the VRAM you already have, and I don't remember seeing any driver requirements for fine-tuning other than for the bleeding-edge methods. But someone can correct me if I'm wrong.
Tai9ch@reddit
If there were something like a PCIe AMD MI300 for $1700 but it only supported Vulkan we'd see Vulkan support for Pytorch real fast.
lodg1111@reddit
Yup, the AMD Instinct MI250 has 128GB of RAM. No one is paying attention to it.
ttkciar@reddit
Yes, because it requires an OAM bus interface. The MI210 is the last Instinct with a PCIe interface.
keepthepace@reddit
Yeah, no.
They have the fastest GPUs out there with the most VRAM. CUDA is a very shallow moat.
LLMs do not require a ton of complicated optimizations, especially if you are only targeting inference. There is so much need for it, FOSS will produce it in 2 weeks.
XeroVespasian@reddit
Hundreds of billions so far... not bad for a shallow moat...
keepthepace@reddit
You can spend billions on things that are not moats
Conscious_Nobody9571@reddit
I wrote this the other day and my comment got downvoted by losers
fallingdowndizzyvr@reddit
CUDA is just a software API. Without the fastest hardware GPU to back it up, it means nothing.
Khipu28@reddit
If it's "just" software then go build it yourself. It's not "just" the language; there is matching firmware, driver, runtime, libraries, debugger and profiler. And any one of those things will take time to develop.
fallingdowndizzyvr@reddit
And that stuff is developed. Why do you think CUDA is unique? There are plenty of APIs. In this case, there's CANN.
Again, it's the hardware GPU that made Nvidia what it is. Not CUDA. There are plenty of alternatives to CUDA.
Khipu28@reddit
If it's so easy to develop, then why does everyone struggle with it? Profilers, for example: AMD builds very potent hardware, in many aspects better than Nvidia's, but they fail to deliver in certain areas because there are no good profilers. AMD heavily relies on Sony to fill that gap for the games sector, but there is nothing comparable available for AI devs. Building that software ecosystem is hard and it takes years to catch up.
fallingdowndizzyvr@reddit
Who struggles? Again, there are plenty of APIs. You are mistaken in thinking that Nvidia is fast because of CUDA. Nvidia is fast because they make great hardware.
No. AMD, and Intel for that matter, have good hardware claims on paper. But in reality, they fail to realize them.
Ah... what? AMD, a hardware manufacturer, relies on Sony, a game maker, to make games? Ah... yeah....
Khipu28@reddit
Software is not written, it is debugged! And fast software is not fast because of the hardware; it is fast because it was profiled. There are only a handful of programmers on this planet who have experience writing profilers and debuggers at the level needed to compete. Sony has that talent because they attract a certain type of software engineer, and Nvidia's CUDA profiler is okay because they invested a lot of money over the years to get that talent.
fallingdowndizzyvr@reddit
Games are written by game developers. Not hardware makers. Sony is a game studio. AMD is not.
ComNguoi@reddit
I dont think you get what he is saying at all lmao
fallingdowndizzyvr@reddit
Then he's not saying it right.
Cold_Specialist_3656@reddit
Bingo.
AMD cards that cost 1/3 as much are just as powerful. But Nvidia has been perfecting CUDA for 20 years.
Awkward-Candle-4977@reddit
AI uses TensorRT on the tensor cores. Nvidia is able to make multi-GPU clusters (really) work over LAN.
night0x63@reddit
Lots of Apple MLX people with full support for AI stuff even though it's a small market lol. So there's hope. Specifically the $10k Apple 512GB machines.
knight_raider@reddit
Spot on, and that is why AMD could never put up a fight. The Chinese developers may find the cycles to optimize it for their use case. So let's see how this goes.
Salty_Flow7358@reddit
100%. But we shouldn't lose hope.
Phyzzx@reddit
It's good news that we have another player in the market like NBC.
recoverygarde@reddit
No point with M4 (current) and M5 macs about to drop
paul_tu@reddit
I wonder what software stacks it supports.
Need to check.
ttkciar@reddit
Pytorch and llama.cpp support it (Ascend NPU).
Laxarus@reddit
need to see the benchmarks
AdamScot_t@reddit
Good brand, Huawei... but I'm afraid it doesn't seem like it will stay cheaper; rather, they will make it just as expensive as Nvidia.
catjewsus@reddit
What's the performance like though... even if there's tons of VRAM, if the actual card is slow then it's not very competitive, is it...
rdnkjdi@reddit
How much of a monopoly does Nvidia have on inference right now? It seems like everyone but Grok and Meta has their own inference TPU-type tech, just not for training.
Which is insane with Nvidia beating earnings again
Weary-Wing-6806@reddit
These Huawei Atlas 300i Duo cards aren’t new tho. They’re 2022 datacenter pulls with 96GB LPDDR4x and low bandwidth. They’re fine for cheap inference where memory matters more than speed, but nowhere near a 4090 for performance or training. The bigger problem is software... CUDA dominates, while Huawei’s stack is still rough with driver issues and limited support. They look cheap on Alibaba, but imports get messy and prices double on eBay. Basically, lots of VRAM for little money, but you trade off speed and stability.
Unlikely_Ad1890@reddit
monopoly /mə-nŏp′ə-lē/ noun Exclusive control by one group of the means of producing or selling a commodity or service.
The GPU market is a group of companies that work together to line their pockets, they'd be considered an oligopoly
Western_Building_880@reddit
Watch the US put tariffs on them.
TheL0ckman@reddit
Hey, that way you'll also be able to complain later that it's just a rebadged older device.
ProjectPhysX@reddit
This is a dual-CPU card: 2x 16-core CPUs, each with 48GB of dog-slow LPDDR4X @ 204 GB/s, and some AI acceleration hardware. $2000 is still super overpriced for this.
Nvidia RTX Pro 6000 is a single GPU with 96GB GDDR7 @ 1.8 TB/s, a whole different ballpark.
PraxicalExperience@reddit
The RTX Pro 6000 is also 4x the price...
Pulselovve@reddit
What is the expected performance with gpt-oss?
rotatingphasor@reddit
What software stack does this work on? I imagine it'd be difficult to get it working on things like Pytorch.
Actually just checked
https://pytorch.org/blog/huawei-joins-pytorch/
Seems like it's been something they've been working on for a while.
rotatingphasor@reddit
Memory doesn't mean much if it's slow (and doesn't have great software). I'd be curious about performance.
Tenxlenx@reddit
Ok but does it run CUDA?😅
horendus@reddit
Nvidia's massive margins will be eroded over the next decade, that's for sure.
CeFurkan@reddit (OP)
Yep i agree
R_Duncan@reddit
Any performance comparison with 2x 3090 and 2x 5080/5090?
SaleAffectionate4314@reddit
Only for inferencing?
rail_hail@reddit
Besides PaddlePaddle, what else can you run on this?
FPham@reddit
I think I mentioned in my LoRA training book that at some point we will be smuggling GPUs from China, and here we are a month later.
Comprehensive_Ad5647@reddit
Wait for Trump to protect the Nvidia monopoly with 100x tariffs.
Metrox_a@reddit
Now they just need driver support, or it's useless.
AmIDumbOrSmart@reddit
Yup. Even if you get through 2 pages of command lines and install the driver on Linux, the CANN branch of llama.cpp doesn't even support GLM 4.5 yet (it has support up to 4, though). It also looks like there are some issues with that branch. I'll buy this when the support and software are actually there. Not holding the bag today. It would be cool if some people bought these to finetune on; even if they're bad for it, it's probably the best bang for the buck, and someone with more technical know-how may do well with them. Not me, though.
HugoCortell@reddit
They don't. It does not run on Windows, nor does llama.cpp support CANN.
This is literally like AMD's AI offering (same price, and with better specs, if I recall): it's cheap for the consumer because it's not really all that useful outside of Linux enterprise server racks.
NickCanCode@reddit
Of course they have driver support (in Chinese?). How long it takes to catch up and support new models is another question.
KaleidoscopeOk3416@reddit
at most 1 year
Ratiofarming@reddit
Yeah, but it's considerably slower. Not that I hate the development (more GPUs -> more better), but comparing this with Nvidia is just bs. An RTX 6000 Pro Blackwell eats this thing for breakfast when it comes to speed; it's not even close. Let alone software support, which is a huge issue with these things.
nickpsecurity@reddit
Don't forget Tenstorrent Blackhole cards. They claim A100 performance at $999. You can also put many in a machine.
pacificdivide@reddit
I've been closely following all of China's GPU strategies; check out russwilcoxdata.substack.com
Beginning-Art7858@reddit
What kind of llm could this run? If you wanted to stay local and didn't care if it was super fast?
Potential-Leg-639@reddit
But does it run Crysis?
burheisenberg@reddit
Nvidia has CUDA for GPU computing. Do these GPUs have libraries and support for usability? What about compatibility? IMO, it does not make sense to buy one of those.
Emergency-Author-744@reddit
Supported in llama.cpp via CANN: https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#cann
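Per those docs, building the CANN backend looks roughly like this (a sketch, assuming the Ascend driver and CANN toolkit are already installed; the model path is a placeholder):

    cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Release && cmake --build build
    ./build/bin/llama-cli -m model.gguf -ngl 32   # offload layers to the Ascend NPU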
burheisenberg@reddit
So it cannot be used for general DL tasks, but it can be dedicated to LLM projects. Better than nothing.
M3GaPrincess@reddit
Deepseek already publicly declared that these cards aren't good enough for them. https://www.artificialintelligence-news.com/news/deepseek-reverts-nvidia-r2-model-huawei-ai-chip-fails/
The Atlas uses 3 Ascend processors, which Deepseek says are useless.
Cuplike@reddit
They still use them for inference, which is what most people here would use them for as well, and a new report just came out stating they use them for training smaller models.
KefkaFollower@reddit
That's what I was wondering. Not good enough for inference, or not good enough for training?
Thanks for the clarification.
M3GaPrincess@reddit
"New report" = some propaganda. I guarantee you it won't run llama.cpp, tensorflor or pytorch.
It's crapware.
NickCanCode@reddit
Just tell me how these cards do compared to the AMD 128GB Ryzen AI Max, which is roughly the same price but comes as a complete PC with AMD's software stack.
Emergency-Author-744@reddit
Basic spec comparison, Atlas 300I Duo vs AMD Ryzen AI Max+ 395:
- Bandwidth: 408 GB/s vs 256 GB/s (60% better TPS, theoretical)
- Compute: 140 TFLOPS FP16 vs 59 TFLOPS FP16 (240% faster PP theoretically, though this is very software dependent)
- Memory: 96GB LPDDR4X vs 128GB LPDDR5X (96GB max VRAM allocation)
Both are supported by llama.cpp, AMD via Vulkan too; unsure if the Atlas works on Vulkan atm.
Ansible32@reddit
I'd also compare to the Jetson Thor Nvidia just announced. Which is $3500 but I think the extra cost is probably worth it.
tat_tvam_asshole@reddit
better bandwidth, that's about it. strix halo has decent support and you can run just about whatever you want, including comfyui
NickCanCode@reddit
It's a no for me then. The most I want to do is image generation and coding tasks. Seeing that this card doesn't have a proper heatsink, it probably lacks the processing power for image gen.
tat_tvam_asshole@reddit
It's fine for image gen I'm sure, just not as fast. You should only consider a card for the VRAM to the degree you will be running large models/workflows. For the price point, imo the new Strix Halo boards are a better deal, but they aren't a PCIe card, obviously. After that, maybe two B60s once they're out.
kei-ayanami@reddit
I think that is still a better value for now, import costs included at least.
Meiyo33@reddit
You also need the compute power and drivers.
They are not here for the moment.
iyarsius@reddit
Hope they are cooking enough to compete
JFHermes@reddit
This is China we're talking about. No more supply scarcity baybee
No-Underscore_s@reddit
B..b..but tarrifs
/s
PlasticAngle@reddit
Even with tariffs it should still cost less than half.
ChloeNow@reddit
AI is the top industry right now, with Nvidia at the top of the game. We have a racist president who's scared of China and just figured out who Jensen Huang was a few weeks ago.
Don't be surprised if a special tax is placed on these to make sure they're equally expensive to Nvidia's stuff. Market forces outside the US should still drive the price down, though.
Girafferage@reddit
And have abysmal support. Good luck getting drivers.
stumblinbear@reddit
If it's half the price for comparable performance, there will be support in the near future
Girafferage@reddit
I'll wait until the support exists but then yeah it would be awesome.
Present_Hawk5463@reddit
The US has monopolistic control over the semiconductor market; they control the upstream supply for manufacturing these GPUs.
nedockskull@reddit
Isn’t Huawei banned in the USA?
ChloeNow@reddit
Honestly thank FUCKING god.
Look, even if you're an America stan, a MAGA hat wearer, hate China, whatever... you gotta admit some actual competition would be good. The prices on modern GPUs (chips in general) have been absolutely insane, driven up by things that it sucks to share a space with (bitcoin mining ops previously, now more likely AI datacenters).
Educational_Belt_816@reddit
Why am I seeing this exact post, word for word, with the same image attached, reposted on multiple subs and multiple large X accounts? The botting is so glaringly obvious; this is like the 6th time I've seen this today.
wakigatameth@reddit
what happens if you render Tiananmen square on one of those babies
Designer-Ganache8097@reddit
What happens when Americans even figure out how to stop being conned by their government?
RedditJumpedTheShart@reddit
When will China make safe baby food?
Designer-Ganache8097@reddit
??
__BlueSkull__@reddit
I believe its marketed 96GB is not true. Internally, it has two 48GB NPUs connected through PCIe, so say goodbye to cross-chip memory bandwidth.
Educational_Smile131@reddit
Huawei nowadays is best known for overhyped and underspec'd products; it's amusing that so many people here fell for this propaganda lol.
MCH_2000@reddit
There is no monopoly abuse.
The Chinese GPU is far inferior in hardware. And it's 96 GB because it's using LPDDR4X.
FearThe15eard@reddit
Let's go, I can finally build my PC.
nenulenu@reddit
Not to drag politics in, but it seems necessary. This is just a sign that Trump is right and the US should have been careful with manufacturing high technology in China, which reverse engineered it and got its engineers educated in the US, importing more than enough skill. Now they are set up to produce practically anything at a fraction of the price. Should drive competition.
The question is going to be how much risk buyers are going to take with these products. Maybe they will be like the UK and take massive risks for cheaper pricing. Maybe they will be more careful in understanding the long-term risk.
But if I've learned anything, it's that we've all lost the ability to think long term. So, risk taking it is!!!
X2ytUniverse@reddit
I mean they can enter the market all they want, but sheer memory capacity means nothing if it runs on 10 year old tech that doesn't support modern features like CUDA, or if the general performance is abysmal.
HlddenDreck@reddit
So, what frameworks support this? None? Well, I guess it's useless at the moment. But in the future this might be interesting
ILikeQuantum@reddit
I don't see myself buying it but if it keeps Nvidia prices in line I'll be happy.
Ok_Warning2146@reddit
Don't put too much hope on Huawei. They just folded their LLM division due to fraud.
miki4242@reddit
Do you have any credible sources for this?
kaggleqrdl@reddit
The cope on this thread is legion. China is ALL IN. The only thing that will keep them back is import bans and blockades. Or maybe they will deny export because they don't want the US to catch up... lol
RedditJumpedTheShart@reddit
So you ordered one? Put your money where your mouth is lol
757DrDuck@reddit
How so? Or is it just what they all believe?
Better-Cricket-7883@reddit
all the world
MedicalScore3474@reddit
They've already done this with Huawei Ascend processors: https://www.huaweicentral.com/us-imposing-stricter-rules-on-huawei-ai-chips-usage-worldwide/
If they are considered to be produced in violation of US export controls like the Ascend processors, US citizens will not be allowed to use them or buy them anywhere in the world, else you will face criminal and civil penalties.
Alihzahn@reddit
Because there's so much free speech happening in the US currently. I'm no CCP shill, despise them even. But it's actually funny seeing people call out China when people are getting arrested left and right for free speech in the west. And the upcoming draconian spying laws.
sailee94@reddit
Actually, this card came out about three years ago. It’s essentially two chips on a single board, and they work together in a way that’s more efficient than Intel’s dual-chip approach. To use it properly, you need a specialized PCIe 5.0 motherboard that can split the port into two x8 lanes.
In terms of performance, it’s not necessarily faster than running inference on CPUs with AVX2, and it would almost certainly lose against CPUs with AVX512. Its main advantage is price, since it’s cheaper than many alternatives, but that comes with tradeoffs.
You can’t just load up a model like with Ollama and expect it to work. Models have to be specially prepared and rewritten using Huawei’s own tools before they’ll run. The problem is, after that kind of transformation, there’s no guarantee the model will behave exactly the same as the original.
If it could run CUDA then that would have been a totally different story btw..
xugik1@reddit
Is there a newer version of this card now?
CeFurkan@reddit (OP)
To answer all questions: CUDA is not a wall or a moat. AMD doesn't have CUDA, but their cloud GPUs run well on Linux. What AMD lacks is competence. They didn't sell 3x-VRAM GPUs at the same price; their GPUs are ridiculously priced the same. So what do Chinese GPU makers need?
They only need to submit pull requests so PyTorch natively supports their GPUs. That's it; they can do it with a software team. Moreover, add a CUDA wrapper like ZLUDA and you're ready to roll. The VRAM or GPU may be weak for now, but this is just the beginning.
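For what it's worth, PyTorch already has an extension point built for exactly this, so a vendor doesn't even need a mainline PR to get started. A minimal sketch of the PrivateUse1 mechanism; the backend name and library are hypothetical, and the vendor's compiled extension still has to supply the actual kernels:

    import torch

    # Claim the reserved PrivateUse1 dispatch key under a vendor-chosen name.
    torch.utils.rename_privateuse1_backend("atlas")  # "atlas" is made up

    # A real backend would now load its compiled kernel library, e.g.:
    # torch.ops.load_library("libatlas_torch.so")    # hypothetical .so
    # after which tensors can live on the new device:
    # x = torch.randn(1024, device="atlas:0")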
noiserr@reddit
AMD has been putting in a lot of effort into ROCm. And the Linux drivers.
CeFurkan@reddit (OP)
And AMD fails because they are still 100% targeting cloud service providers, which cuts off a revenue opportunity. Instead, they could sell gaming GPUs with 96 GB at the same price, with ROCm and full Windows support, and they would bloom across the entire ecosystem. Even cloud service providers would buy them in huge numbers.
noiserr@reddit
UDNA is coming.
Formal_Bat_3109@reddit
What other hardware do you need to run this?
artofprjwrld@reddit
u/CeFurkan, competition from China’s 96GB cards under $2k is huge for AI devs. Finally, u/NVIDIA’s monopoly faces real pressure, long term market shifts look inevitable.
CeFurkan@reddit (OP)
100% it is beginning
Zealousideal_Meat297@reddit
Do we have to worry about spyware on these cards?
Interesting-Law-8815@reddit
By the time the orange moron has put tariffs on then they’ll be $20,000 each
RahimahTanParwani@reddit
Yes, finally! Nvidia is a cutthroat company, no different from the Jewish Nazi companies of Meta, Apple, Google, Amazon, and Tesla. If you have NVDA stocks, sell them now and reap the profits. It will free fall in the coming weeks.
Suppe2000@reddit
Is this legit? In Taobao I find them for about 9000 RMB. That seems quite cheap to me. I just went to Shenzhen to find a good vendor for the 4090 48gb. But these Huawei cards are crazy
ThePi7on@reddit
Competition is always welcome
juggarjew@reddit
So what? It doesn't matter if it can't compete where it counts. The speed has to be usable. Might as well just get a refurb Mac for $3000 with 128GB of RAM.
thowaway123443211234@reddit
Everyone comparing this to the Strix misses the point of this card entirely, the two important things are:
Queasy_Comedian274@reddit
it isn't a gpu. it's purely neural processing
thowaway123443211234@reddit
Semantics
Queasy_Comedian274@reddit
no video encoder/decoder and no vulkan or opengl :/
Darlanio@reddit
Looking forward to testing this card out!
ProtolZero@reddit
That is not a GPU but an NPU card.
Sudden-Lingonberry-8@reddit
if drivers are open source, it's game over for nvidia overnight
CeFurkan@reddit (OP)
i hope they do that
pmttyji@reddit
Hope this brings prices down for AMD cards too (apart from NVIDIA, Intel).
zd0l0r@reddit
Time to sell my nvidia shares?
fantom1252@reddit
Good for us programmers and for customers... now, have you ever seen the page? "Atlas Center Inference Card 23.0.3 (and Later) NPU Driver and Firmware Installation Guide 08":
https://support.huawei.com/enterprise/en/doc/EDOC1100349483?idPath=23710424|251366513|22892968|252309139|252823107
Overview
This document describes how to install and uninstall software packages and provides FAQs and troubleshooting methods.
This document applies to:
Intended Audience
This document is intended for:
Symbol Convention
Symbols that may be found in this document are defined as follows:
Danger
I just wonder why it's written like that. What kind of material did they use there? It's interesting...
pmv143@reddit
This is pretty wild: Huawei is putting out a 96GB card for under $2K. Since it's built for inference, it makes sense the raw throughput isn't on par with a 4090, but it's a steal for memory-heavy workloads.
But the catch is the same as always: cheaper hardware doesn't fix cold starts or idle GPUs. If you can't get sub-2s load times and keep utilization high, you're still burning capacity. That's where the real bottleneck is.
trahloc@reddit
Huawei needs to sponsor the Zluda project for their cards.
kc858@reddit
currently in china, just bought two, bringing them back if anyone wants to buy the other one for 3750 lol
Professional_Mix2418@reddit
Huh, why is a card from 2022 being presented like it just entered the market? And why compare it to an RTX 6000 Pro? It's not even close to a 3090. What on earth is this about?
TheToi@reddit
408GB/s of bandwidth means 6 tokens per second on 70B models at Q8; it's too slow for that price, for me.
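That figure is easy to sanity-check as a theoretical ceiling, assuming every weight byte is streamed once per token and the full spec-sheet bandwidth is usable (it's really 2x 204GB/s, so practice will be worse):

    # back-of-envelope decode speed ceiling
    weights_gb = 70       # 70B params at Q8 is roughly 1 byte per param
    bandwidth_gbs = 408   # spec-sheet total; 204 per NPU in reality
    print(bandwidth_gbs / weights_gb)  # ~5.8 tok/s, matching the ~6 above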
Anthonyg5005@reddit
Well to be fair, it is only an npu so I don't expect to be able to get any graphical use out of it
-Hakuryu-@reddit
Might want to calm down first
-Hakuryu-@reddit
Last I checked they were DDR4...
Rahul_Albus@reddit
How well does it work with CUDA????
Abyss_Kraken@reddit
let’s fucking gooooooooooooooo
LARGEBBQMEATLOVERS@reddit
You'd have the same issues with AMD: the hardware might be fine, but the drivers...?
Toasted_Treant@reddit
They're still way behind, but they have momentum. 5 years and they will have stolen enough nvidia and amd tech to make a comparable product to nvidia's flagship.
smokeynick@reddit
Ok CCP. Good work 😉
IngwiePhoenix@reddit
So THAT was their big card. I looked at Atlas a while ago once I spotted it as supported in llama.cpp - this is super interesting stuff. Looking forward to benchmarks!
HighlandEvil@reddit
Does it run triton? or is triton just for AMD mostly?
sleepingsysadmin@reddit
linux kernel support? rocm/cuda compatible?
fallingdowndizzyvr@reddit
It runs CANN.
Careless_Wolf2997@reddit
what the fuck is that
fallingdowndizzyvr@reddit
LOL. Ask a LLM.
florinandrei@reddit
More reliable than social media users anyway.
remghoost7@reddit
Here's the llamacpp documentation on CANN as per another comment:
CANN (Compute Architecture for Neural Networks) is a heterogeneous computing architecture for AI scenarios, providing support for multiple AI frameworks on the top and serving AI processors and programming at the bottom. It plays a crucial role in bridging the gap between upper and lower layers, and is a key platform for improving the computing efficiency of Ascend AI processors. Meanwhile, it offers a highly efficient and easy-to-use programming interface for diverse application scenarios, allowing users to rapidly build AI applications and services based on the Ascend platform.
Seems as if it's a "CUDA-like" framework for NPUs.
Gold-Vehicle1428@reddit
The API is actually where GPUs fuck customers; look at ROCm.
sleepingsysadmin@reddit
What I'm hoping is that April 2026 will be a big deal for ROCm. Vulkan for now, I guess :(
Unlikely-Employee-89@reddit
Pls don't buy it. Chinese GPUs are not safe. Surely there must be some backdoor that allows the CCP to steal your data. Also, their technology is not mature enough. Let's punish those evils and let them scale up and lower the price for people like me who don't give a fuck. I need the USA to MAGA for the rest of the world. USA! USA! USA! 🙏
T-VIRUS999@reddit
Didn't take long for those to get scalped, they're already the price of a used car on eBay, like 3X MSRP, shameful
mrw981@reddit
Also has the added benefit of spying on you for the CCP.
howie521@reddit
Can this run together with Nvidia hardware on the same PC?
NebulousNitrate@reddit
China is blowing the US out of the water with their latest tech. It's one reason Intel is about to go belly up; the next generation of processors will see victory go to Chinese chip manufacturers. Intel/AMD will not be able to compete unless Chinese imports are blocked, and that's what the Trump administration is gearing up for as it takes a stake in Intel.
noiserr@reddit
It's a dual GPU solution, which already limits it for running large LLMs. And it has less bandwidth than Strix Halo for each NPU. It's also more expensive than Strix Halo, and it doesn't have anything close to ROCm.
phear_me@reddit
Some of you don’t know anything about technology and it shows
kierowniku@reddit
I hate monopolies, but I don't like China either, so I've got mixed feelings.
Jisamaniac@reddit
Doesn't have Tensor cores....
noiserr@reddit
Pretty sure it's all tensor cores, it doesn't have shaders. Tensor core is just a branding for matrix multiplication units and these processors are NPUs which usually have nothing but matrix multiplication units (or tensor cores).
AdventurousSwim1312@reddit
Yeah, the problem is that they are using LPDDR4X memory on these models; your bandwidth will be extremely low. It's more comparable to a Mac Studio than an Nvidia card.
Great buy for large MoE models with under 3B active parameters, though.
satireplusplus@reddit
From their official website: LPDDR4X 96GB or 48GB, total bandwidth 408GB/s Support for ECC.
That's not extremely low. It's on par with an Nvidia 5060, or 40% of a 4090.
I'm guessing driver / GPGPU API support will be the real problem initially.
AdventurousSwim1312@reddit
Yeah, but for a 96GB card, unless you want to deploy many instances of small LLMs, this will be very limiting (the 1.7 TB/s on the RTX 6000 Pro is already the most frustrating part; HBM at around 3-3.5 TB/s would be much better).
satireplusplus@reddit
It's still way faster than DDR4 or DDR5. You can read the entire 96GB four times per second, so you'd get at least 4 tok/s even if you need to read the entire thing for a really large model. But since many models are MoE now, there are fewer active parameters that need to be read per token. So you'd probably be pushing 10+ tok/s for inference.
This sits somewhere between a Mac Studio with the M4 Max (546 GB/s) and an M4 Pro (273 GB/s). But it probably has way more compute, so you could use it for training and fine-tuning as well. In theory at least; in practice, PyTorch support will be lacking a lot of features. Even Apple's MPS backend is still missing lots of ops that the CUDA backend has. So there's the uphill battle that any GPGPU contender faces.
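The MoE estimate follows from the same bandwidth arithmetic; a sketch with made-up numbers, since active parameter counts and quantization vary per model:

    # hypothetical MoE: only the active parameters are read per token
    active_params_b = 20   # e.g. ~20B active parameters (made up)
    bytes_per_param = 2    # FP16
    bandwidth_gbs = 408
    print(bandwidth_gbs / (active_params_b * bytes_per_param))  # ~10 tok/s ceiling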
sid_276@reddit
This. Imagine it: you get bigger highways, but the lanes are actually narrower. Traffic only congests more. For machine learning these are kind of garbage, NGL.
uti24@reddit
If true, it's almost half the bandwidth of a 3090, and about a third higher than a 3060's.
Star_king12@reddit
That's like DDR5 speeds, no? Regular DDR5 on dual channel I mean
uti24@reddit
fastest DDR5 on dual channel is like.. 125GB/s
Star_king12@reddit
And 3060 12 gig is 360 GB/s so I'm right on point.
TheDreamWoken@reddit
Then I guess it would run as fast as the Turing architecture? I use a Titan RTX 24GB and can max out at 30 tk/s on a 32B model.
shing3232@reddit
280 TOPS INT8, LPDDR4X 96GB or 48GB, total bandwidth 408GB/s
__some__guy@reddit
It's dual GPU with only 204 GB/s each.
shing3232@reddit
If it is connected on board, it should be fine for inference.
No_Hornet_1227@reddit
Really crap bandwidth + crap performance (300 TOPS at most)... yeah... if you could somehow use the VRAM on that card together with a 5090 GPU core + VRAM for the AI or something, that'd be freaking amazing.
So with a 5090 32GB + 1 of these you'd have 128GB of VRAM at 3700+ TOPS.
Tenzu9@reddit
Yes, you can test this speed yourself btw if you have a new android phone with that same memory or higher. Download Google's Edge app, install Gemma 3n from within it and watch that sucker blaze through it at 6 t/s
stoppableDissolution@reddit
That's actually damn impressive for a smartphone.
MMORPGnews@reddit
It is. I just hope to see a Gemma 3n 16B without vision (to reduce RAM usage). Small models are generally only useful with 4B+ params.
poli-cya@reddit
Doesn't that mean nothing without the number of channels? You could run a ton of channels of DDR3 and beat GDDR6, right?
Wolvenmoon@reddit
Ish and kind of. More channels means more chip and PCB complexity and higher power consumption. Compare a 16 core Threadripper to a 16 core consumer CPU and check the TDP difference, which is primarily due to the additional I/O, same difference w/ a GPU.
andy_a904guy_com@reddit
For people curious that means 10x slower in speed alone.
dltacube@reddit
Even against the M3 Ultra with 800GB/s memory? That's half of the 1700 on a 5090 card. Or is the scaling not linear?
slpreme@reddit
under 3b 😬
Comed_Ai_n@reddit
Problem is all these Chinese cards don’t work with CUDA acceleration.
simple123mind@reddit
And I'm sure your data doesn't go to China
critacle@reddit
Comes with free Salt Typhoon
Born_Highlight_5835@reddit
need reviews asap lol
vito0117@reddit
Someone tag Gamers Nexus lmao.
rizuxd@reddit
Finally monopoly will be broken
Alternative-Bobcat-5@reddit
96gb of VRAM?
Anyone in London wanna split a flight to China, go buy some server farms for ourselves?
End3rWi99in@reddit
Nice paperweight.
paul_tu@reddit
A quick search gave me a reason to wait for the Ascend 920 with HBM (and the memory speed it brings), as the Atlas 300I accelerators were released back in 2022.
No-Emu-396@reddit
Comes with spyware included at no additional charge? It was bad enough when Facebook bought Oculus, bloody hell!
UsualResult@reddit
Hoo hoo. This is the best piece of news I've seen in a while. If this is real and it has any software support at all, they are going to have to make Huawei illegal to sell in the USA. At 20% the price of NVIDIA you can get almost 0.5TB with 5 of these guys for the same price as 1 from NVIDIA. I think the gravy train is close to heading off the end of the bridge.
epiktet0s@reddit
comes with a free trojan
Popular_Brief335@reddit
Lol oh good Chinese gpu propaganda has arrived
minitoxin@reddit
Atlas 300V Pro: 48GB LPDDR4X, 204.8 GB/s, 140 TOPS INT8 / 70 TFLOPS FP16, 150W, single-slot PCIe
Musicheardworldwide@reddit
Really not versed on the subject, but isn't the whole thing with Nvidia the software? Isn't CUDA a factor in this because it's so widely used?
e79683074@reddit
Competition is healthy, but what assures me I'm not putting hardware spyware in my build? This thing literally has direct RAM access, and the country is well known for putting backdoors in things.
chinese__investor@reddit
How much spyware is in Nvidia and Apple?
Striking-Warning9533@reddit
Yeah, as a Chinese I am paranoid and will avoid this.
shibe5@reddit
As for the hardware: when you already have spyware in your CPU (AMD PSP) or motherboard (Intel ME), you might as well go with a bugged NPU.
As for the software, I'll wait for reviewed open source drivers and libraries.
xxPoLyGLoTxx@reddit
Just disable the GPU in your windows firewall /s
BlueArcherX@reddit
you guys are easy marks
m1013828@reddit
A for effort. Big RAM is useful for local AI, but the performance... I think I'd wait for a next gen with even more RAM on LPDDR5X and at least quadruple the TOPS. A noble first attempt.
EpicOfBrave@reddit
Good luck convincing US and European software companies, and especially the governments, to support Chinese hardware!
The USA accounts for 95% of global AI spending.
Chinese AI chips, just like Chinese cars and smartphones, can't grow past a 1-5% market share in the US, and will never be allowed to grow in the US and Europe.
Designer-Ganache8097@reddit
China will be happy to sell their products to everyone else. They don’t need the US. There will come a point where the US has to choose to be a part of the world rather than trying to be dominant.
snowbirdnerd@reddit
I'm always a little wary of the Chinese knockoffs. Specs on paper are one thing, but I'll wait until I see some performance reviews.
SnooRecipes3536@reddit
gentlemen, we are BACK
ZookeepergameOdd4599@reddit
At this point it is cheaper to just immigrate to Singapore
meshreplacer@reddit
I welcome China with open arms if they can crush this monopoly by introducing fairly priced cards. The big hold-up with AI at the consumer level is the lack of fairly priced hardware. The primary goal is to keep AI a pay service, like cable TV. It harkens back to the pre-PC era, where people had to spend big bucks on an ASR-33 terminal and pay by the minute for "time sharing" compute time.
Fluid-Pea7891@reddit
🐻🍯
Believe it or not , calls
Just-Health4907@reddit
what store is this?
Resident-Dust6718@reddit
I hope you can import these kinds of cards, because I'm thinking about designing a nasty workstation setup, and it's probably gonna have a nasty Intel CPU and a gnarly GPU like that.
tat_tvam_asshole@reddit
Radical, tubular, my dude, all I need are some tasty waves, a cool buzz, and I'm fine
munkiemagik@reddit
All of a sudden now I want to re-watch the original Point Break movie.
Ok_Top9254@reddit
I don't understand why people are blaming Nvidia here; this is business 101. Their GPUs keep flying off the shelves, so naturally the price increases until equilibrium.
The only thing that can tame prices is competition, which is non-existent, with AMD and Intel refusing to offer a significantly cheaper alternative or killer features, and Nvidia themselves aren't going to undercut their own enterprise product line with gaming GPUs.
AMD is literally doing the same in the CPU sector: HEDT platform prices quadrupled after AMD introduced Threadripper in 2017. You could find 8-memory-slot, 4x-PCIe-slot X99/X79 boards for under 250 bucks and CPUs around 350. Now the cheapest boards are $700 and the CPUs literally $1500. But somehow that's fine because it's AMD.
kaggleqrdl@reddit
Pretty sure if you follow the money the powers that be control both NVidia and AMD and don't want them to compete.
stumblinbear@reddit
AMD put Intel out of business but apparently they don't want to compete with Nvidia for Reasons™
Wild take
_bachrc@reddit
Didn't DeepSeek say they had issues with Huawei cards, and that it caused their multiple delays on R2?
farnoud@reddit
The entire software ecosystem is missing. Not a hardware problem.
Glad to see it but takes years to build the software ecosystem
QbitKrish@reddit
This is quite literally just a worse strix halo for all intents and purposes. Idk if I really get the hype here, especially if it has the classic Chinese firmware which is blown out of the water by CUDA.
Mundane-Light6394@reddit
How many of these strix halos can you put in a 4u chassis?
layer4down@reddit
In the end, our greed-optimized brand of capitalism will have defeated itself.
RG54415@reddit
Cheaper always wins.
Patrick_Atsushi@reddit
I wonder how these will do with gaming.
Mundane-Light6394@reddit
It's not made for gaming; it doesn't even have its own cooling. It's made for servers to run AI inference. But if cheaper alternatives become available for AI, fewer gaming-capable GPUs will be needed for AI. It's indirect competition for gaming GPUs, and it could push gaming GPU prices lower if it reduces AI demand for them.
AI is pushing up prices for gaming GPUs now like Ethereum mining did in 2017.
CeFurkan@reddit (OP)
This is also super important. These GPUs must run games on Windows PCs to become widespread.
Patrick_Atsushi@reddit
I'm not sure about this. The normal use case for a GPU with these specs is LLM training.
However, the price makes me wonder if some people will buy it for gaming.
I think the gaming market is almost negligible compared to the AI market. I might be wrong, though.
Upbeat_Parking_7794@reddit
Nice, US will tax them to hell, but the rest of the world will have cheap AI.
No_Hornet_1227@reddit
I've been saying for months: the first company (Nvidia, Intel or AMD) that gives consumers an AI GPU for like $1500 with 48-96GB of VRAM is gonna make a killing.
FFS, 8GB of GDDR6 VRAM chips cost like $5. They could easily take an existing GPU, triple the VRAM on it (costing them like $50 at most), sell it for like $150-300 more, and they would sell a shit ton of them.
Mango-Vibes@reddit
Except for the fact that Nvidia is the most efficient no questions asked and also is supported by virtually everything
tryingtolearn_1234@reddit
Intel should have done this. Instead a Chinese company will get that market.
WithoutReason1729@reddit
Your post is getting popular and we just featured it on our Discord! Come check it out!
You've also been given a special flair for your contribution. We appreciate your post!
I am a bot and this action was performed automatically.
RemoteHoney@reddit
DeepSeek switched from Nvidia to this
So now they are left behind in competition
CHEWTORIA@reddit
China has 1.4 billion people; that is a huge market.
India has 1.4 billion people; that also is a huge market.
It's all in Asia.
They will take over the whole Asian market. If you don't see this happening, it's coming.
Yes, it's not as good as CUDA yet, but that is mostly a software issue.
Give it 5 more years and this problem will be solved.
happy-occident@reddit
Doesn't the MindSpore issue get in the way of building locally? It doesn't play with Ollama, apparently?
Bestoftherest222@reddit
This is only the beginning; it's going to get better.
OriginalAd9933@reddit
Different?
Comfortable-Cry8165@reddit
Has anyone tested them with local AIs? Any guides or videos? How are they compatible with the existing hardware?
What about gaming?
spookyclever@reddit
All I want is for this to cause the nvidia price point to dip to retail levels 😄 I’ll buy one of these to inflate interest if it means in a couple weeks I can get a 5090 at retail.
Anyusername7294@reddit
If I had to guess, I'd say they are slower and far more problematic than DDR5 or even 4 with similar capacity
devshore@reddit
It's China, so they are probably 980 Tis glued together, or rebranded P40s.
Anyusername7294@reddit
So how are they making a profit on them?
_lindt_@reddit
Oh boy, get ready for another week of eagles and flag waving.
Wonder if they’ll start working on their own framework too.
xxPoLyGLoTxx@reddit
Hell yes! Is it wrong of me to be rooting for China to do this? I'm American but seriously nvidia pricing is outrageous. They've been unchecked for awhile and been abusing us all for far too long.
I hope China releases this and crushes nvidia and nvidia's only possible response is lower prices and more innovation. I mean, it's capitalism right? This is what we all want right?!
devshore@reddit
is that even slower than using a Mac Studio?
xxPoLyGLoTxx@reddit
It's certainly slower than an m3 ultra (I think that's around 800 GB/s). I think an M4 Max (what I use) is around 400-500 GB/s but I don't recall.
arcanemachined@reddit
Competition is always good for the consumer.
chlebseby@reddit
It's not wrong; the US needs competition for progress to keep going. Same with space exploration: things got stagnant after the USSR left the game.
MaggoVitakkaVicaro@reddit
Aren't these the chips which delayed DeepSeek's recent release, because the PRC forced them to try to use them for AI training?
Striking-Warning9533@reddit
It cannot do training, only inference.
fsactual@reddit
Amazing what you can do when shareholder value comes second instead of first.
DanielKramer_@reddit
huawei is also in the business of making money if you didn't know
you make money by making people want to give you money, that's how making money works
amazon sucks me off when i return things, not because jeff is my buddy, but because it keeps me in his world
nvidia will lower prices as soon as they have competent competition. right now you can moan all you want but you will still buy nvidia so they don't care yet. eventually we'll all be cheering 'good guy nvidia!' just like we are now rooting for intel after they caused a decade of quad core stagnation
prusswan@reddit
Huawei is also big into AI; gonna see how they advertise this, and whether they are even using their own products...
LMFuture@reddit
Glad to see that, but I'd be happier if it came from other Chinese/US companies, like Cambricon (寒武纪) or Google/Groq. Because Huawei lied to us with HarmonyOS and the Pangu models, I just hate them.
Conscious_Cut_6144@reddit
From the specs this is probably the reason we don't have Deepseek R2 yet :D
CryptographerCrazy61@reddit
lol I’d buy one today if I could get my hands on it
vulcan4d@reddit
One day it will be competitive and this is why Donnie loves Tariffs, he is protecting his buddies and their profits.
PathIntelligent7082@reddit
ppl say they're garbage
Fulcrous@reddit
It’s $2000 because it’s not competitive at all.
prusswan@reddit
From the specs it looks like a GPU with a lot of VRAM and performance below a Mac Studio... so maybe the Apple crowd will sweat? I'm actually thinking of this as a RAM substitute lol
nonofanyonebizness@reddit
Is that a single-slot construction? Hmm, I want 7 of them.
Illustrious-Dot-6888@reddit
https://i.redd.it/qb8tsl7be7mf1.gif
Real_Back8802@reddit
Oh good I have Nvidia stocks. Nvidia DO SOMETHING!
Good_Performance_134@reddit
Years-late performance along with no CUDA...
Right...
mummifiedclown@reddit
As someone who’s had to force engineers to access their Huawei servers headless because there were NO Linux video drivers for them, I find this hilarious.
2Gins_1Tonic@reddit
You get what you pay for…
WaffleTacoFrappucino@reddit
i got a quote for an rtx 6000 for $7700
clbgrg@reddit
Definitely doesn't have any spyware inside. Jackie Chan says so
Familiar_Text_6913@reddit
As with everything else cheap and Chinese: the hardware will be golden... but support, software, quality assurance, etc. will suck. I love Chinese tech for the price, but this has been the case for at least 20 years.
MrMnassri02@reddit
Hopefully it's open architecture. That will change things completely.
Khipu28@reddit
If they expose their entire software stack like Tenstorrent does and you are a tinkerer with a low-level software engineering background, then go for it! Otherwise stay away!
serendipity777321@reddit
Deepseek is very good. If it's efficient this is excellent. Nvidia should stop being greedy
LostMitosis@reddit
"GPU's from China threaten our national security". A headline coming soon from your favourite media and politician.
floridianfisher@reddit
They will catch up and it will crush NVIDIA
oodelay@reddit
Should I sell my Nvidia stocks?
Holyragumuffin@reddit
They have heat-dissipation issues and are built on a less advanced process node, I hear.
Defiant_Diet9085@reddit
1 slot, no fan.
4 GPUs on board.
Performance ~1/10 of a 3090.
88 TOPS INT8 and 44 TFLOPS FP16.
shing3232@reddit
That's the slow one. This one is:
280 TOPS INT8
140 TFLOPS FP16
LPDDR4X, 96 GB or 48 GB, total bandwidth 408 GB/s
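For scale, a quick back-of-envelope on what 408 GB/s buys you (a sketch, assuming single-batch decode is purely memory-bandwidth-bound, which it roughly is in practice; the model sizes are illustrative):

def max_tokens_per_s(bandwidth_gb_s: float, model_gb: float) -> float:
    # Each generated token has to stream the full weights from memory,
    # so tokens/s is capped at bandwidth divided by bytes read per token.
    return bandwidth_gb_s / model_gb

for name, size_gb in [("32B at Q4, ~18 GB", 18), ("70B at Q4, ~40 GB", 40)]:
    print(f"{name}: at most {max_tokens_per_s(408, size_gb):.1f} tok/s")
# 32B at Q4, ~18 GB: at most 22.7 tok/s
# 70B at Q4, ~40 GB: at most 10.2 tok/s

That ceiling is why people in this thread keep comparing it to a Mac Studio: an M3 Ultra's ~800 GB/s would roughly double those numbers.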
AppearanceHeavy6724@reddit
That is not 1/10 of a 3090.
TitoGrande1980@reddit
Nvidia can't export to China.
It's going to be the US that can't import enough from China.
wowsers7@reddit
A-hole Trump will slap 400% tariffs on them to prevent competition.
Interstate82@reddit
Blah, call me when it can run Crysis in max quality
CarsonWentzGOAT1@reddit
This has similar performance to a 4090, with 96GB of VRAM.
Anidamo@reddit
It has less than half the memory bandwidth of a 4090.
mxmumtuna@reddit
I don’t think that’s true. It uses LPDDR4X (~400GB/s), and also has no meaningful compute power. It would perform similarly to a Mac for inference.
Interstate82@reddit
Can it actually run games though? From my quick googling it seems the Huawei cards are made for running AI, with no apparent support for games...
fallingdowndizzyvr@reddit
No. This is an NPU, not a GPU.
fallingdowndizzyvr@reddit
No it doesn't. Not at all. Why do you say that?
untanglled@reddit
LPDDR4 RAM, lol. Strix Halo would be a much better option.
Hytht@reddit
The actual bandwidth and bus width matter more for AI than whether it's LPDDR or GDDR.
untanglled@reddit
LPDDR4 has limits; I believe the fastest LPDDR4X gets is around 4266 MT/s. And we don't know the number of channels, but I have no reason to believe this has more than quad-channel. Basically a worse version of Strix Halo.
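Actually, the quoted 408 GB/s can't come from a quad-channel setup at LPDDR4X speeds. A quick sanity check (a sketch, taking 4266 MT/s as LPDDR4X's per-pin ceiling and the 408 GB/s figure from upthread as the only inputs):

data_rate_mt_s = 4266                 # LPDDR4X max per-pin data rate
target_gb_s = 408                     # bandwidth quoted upthread
bus_bits = target_gb_s * 8 * 1000 / data_rate_mt_s
print(f"required bus width: ~{bus_bits:.0f} bits")               # ~765 bits
print(f"768-bit bus check: {4266e6 * 768 / 8 / 1e9:.0f} GB/s")   # ~410 GB/s

So the figure implies something like a 768-bit bus (24 x 32-bit channels): the width, not the per-pin speed, is doing the work.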
AlxHQ@reddit
Is it supported by llama.cpp?
fallingdowndizzyvr@reddit
Yes.
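For anyone wanting to try it, llama.cpp's CANN backend doc boils down to roughly this (a sketch; it assumes the Ascend CANN toolkit is already installed, and the model path and layer count are illustrative):

# build llama.cpp with the CANN backend enabled
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=release
cmake --build build --config release
# run a GGUF model, offloading layers to the Ascend NPU
./build/bin/llama-cli -m /path/to/model.gguf -ngl 32 -p "hello"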
SadWolverine24@reddit
Anyone have inference benchmarks?
fallingdowndizzyvr@reddit
The 300I is not new, contrary to the title of this thread. Go Baidu it and you'll find plenty of reviews.
zazzersmel@reddit
"unchallenged monopoly abuse"
Minato-Mirai-21@reddit
Don’t you know the orange pi ai studio pro? The problem is they are using lpddr4x.
Impressive_Half_2819@reddit
Well long time coming.
Rukelele_Dixit21@reddit
What about CUDA support? Can this be used to train models, or is it just for inference?
dissian@reddit
Huawei lasts over 3 hours too and collects information on all your queries. Nbd.
tat_tvam_asshole@reddit
The funny part is Nvidia can't sue Huawei if they fund ZLUDA or some other drop-in CUDA alternative for their hardware.
PotatoTrader1@reddit
Hey, stfu, it's a big portion of my portfolio.
Resolve_Neat@reddit
Let's hope it continues this way; maybe in 3 to 5 years we could get today's high-end consumer GPUs for a decent price! Because having to pay €700 to €1200 for an RTX 3090 that's been overused for crypto and AI is crazy...
CochainComplexKernel@reddit
Does anyone have experience using them under Linux? They also have cheaper, smaller cards.
o5mfiHTNsH748KVq@reddit
inb4 US government says they're backdoored
GoodRazzmatazz4539@reddit
It’s not about the hardware, it’s the software that makes the Monopol.
seppe0815@reddit
First the hardware... in the coming weeks sweet drivers will come... long live China!!!!
Imunoglobulin@reddit
What kind of website is this?
GreatBigJerk@reddit
An online store. What is weird about it?
TexasPudge@reddit
Looks like JD Inc., NASDAQ ticker: JD.
AFruitShopOwner@reddit
No CUDA
slpreme@reddit
no party
Used_Algae_1077@reddit
Damn China is cooking hard at the moment. First AI and now hardware. I hope they crush the ridiculous Nvidia GPU prices
HoboSomeRye@reddit
lessssgoooooooo
ismellthebacon@reddit
So, are they stealing Nvidia chips and rebranding them? Will they have working drivers for all the major AI frameworks?
krste1point0@reddit
It's Huawei. They make their own chips.
Zeikos@reddit
Damn, this might make me reconsider the R9700.
The main concern would be software support, but I'd be surprised if they don't manage ROCm or Vulkan support; hell, they might even make them CUDA-compatible.
Ok_Cow_8213@reddit
I hope it lowers demand for Nvidia and AMD GPUs so their prices come down.