non-nvidia gpus
Posted by Ok-Secret5233@reddit | LocalLLaMA | 43 comments
Because I'm cheap, I'm seeing if non-nvidia gpus are worth the effort.
Here's the article that got me thinking: https://www.hardware-corner.net/huawei-atlas-300i-duo-96gb-llm-20250830/
Anybody want to add anything from experience?
floconildo@reddit
I'm also cheap, but the software maturity for these alternative GPUs pushed me back from buying one. If ROCm support is still somewhat wobbly for Strix Halo after a year, I can only imagine what it looks like for CANN. It has some potential though, and the Chinese (esp. Huawei) usually catch up fast to developments.
If you just want raw power on non-Nvidia consumer-level hardware, then Strix Halo, the B70, or just a plain old Mac might be your best bet. Memory bandwidth is already an issue on the first two, and the article you shared ain't exactly making a good case for the 300i if you ask me.
Confident_Ideal_5385@reddit
ROCm support for gfx1100 (7900xtx etc) seems pretty solid with ROCm 6.2. I'm not regretting the purchases at all.
Vulkan on Linux with TTM is another story entirely. A sad saga of betrayal by a memory manager that optimises memory allocation to avoid dropping frames in Wayland, at the expense of compute allocations. On Linux, stick to ROCm if you plan to use more than about 80% of your VRAM.
floconildo@reddit
ROCm support is not bad, but it can take a while for things to roll out to AMD devices (even more so if you're running an APU like me). But that's alright; Strix Halo is for my personal projects and I don't regret buying it at all, especially when I check my electricity bill haha.
Not to blame AMD engineers at all, ofc. I fully understand that it's a hard game of catch-up, and honestly they've been doing a great job so far, all things considered. Swimming against a sea of CUDA users must be tiring, and I really hope other departments at AMD are doing their part to increase adoption.
ZCEyPFOYr0MWyHDQJZO4@reddit
If your goal is only to run consumer-level models, then you probably shouldn't get a non-Nvidia GPU or Apple system. As an independent developer you can't replicate the engineering effort necessary to get the hardware working in the first place. There is no free lunch here.
Confident_Ideal_5385@reddit
As an independent developer, it's non-trivial but not at all impossible to get this stuff working, and even get your PRs merged.
Total yak shaving compared to, y'know, running LLMs tho.
Ok-Secret5233@reddit (OP)
Right, unfortunately I suspect that is going to be the bottom line for me.
ZCEyPFOYr0MWyHDQJZO4@reddit
Things used to be so much cheaper a year ago.
SSOMGDSJD@reddit
What is your use case?
That Huawei card is a dead end; it needs a Chinese CPU and mobo to function. See the relevant Gamers Nexus video: https://youtu.be/qGe_fq68x-Q?si=71WWyt6NcFXVyTG9
The cheapest GPU I would consider is a V100 16GB SXM2 ($100), plus an SXM2-to-PCIe adapter ($50-100) and an Arctic P8 Max to cool it ($10). The V100 SXM2 32GB fits much bigger models but is $500 these days. Cheaper than that, an MI50 32GB can be found on Alibaba for around $400 as of last month.
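Totting up that budget path (prices as quoted in the comment; I'm taking the adapter at the midpoint of its $50-100 range):

```python
# Rough cost of the budget V100 build, using the prices quoted above.
gpu = 100      # V100 16GB SXM2
adapter = 75   # SXM2-to-PCIe adapter, midpoint of the $50-100 range
fan = 10       # Arctic P8 Max

total = gpu + adapter + fan
print(f"~${total} for 16GB of HBM2")  # ~$185 for 16GB of HBM2
```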
Intel just isn't there price-to-performance wise; the B70 is a grand and is worse than a V100 32GB for bandwidth and driver support, for more money. Maybe it improves, idk; Intel has a bad track record of supporting their promising tech though. If you're spending a grand, get a 3090 24GB or a V100 32GB.
The A770 16GB ($250) sounds interesting, but it's more than a V100 16GB for worse bandwidth and more jank.
The MI50 16GB is like $150 on eBay, but you might as well cop the Nvidia support with the V100 16GB for a few more dollars.
Tldr: just get a V100.
Ok-Secret5233@reddit (OP)
Why are some V100 open and others closed? Looking at the results on ebay, the open ones look substantially cheaper. Is it just aesthetics or...?
SSOMGDSJD@reddit
Not sure what you mean? The skinny flat gray ones do not have a heatsink; you would need to get one to attach to the GPU, and the spring screws can be annoying. I would recommend getting one with the heatsink attached; then you just need the SXM2-to-PCIe adapter, and probably a riser cable and bracket unless you turn your PC on its side so that it stands straight up out of the PCIe slot. It's a big chunky boi.
Ok-Secret5233@reddit (OP)
I mean, what's the difference between these two?
https://www.ebay.co.uk/itm/167792617369
https://www.ebay.co.uk/itm/198270108538
SSOMGDSJD@reddit
The first one is SXM2. It uses a proprietary interface instead of PCIe, meant for datacenter servers.
The second is PCIe native; note the gold teeth at the bottom. It plugs straight into a PCIe slot. Easier, but generally more expensive. I believe you still need to supply cooling with a fan and a shroud.
For the first one, you'll need something like this: https://ebay.us/m/mcekFn which bridges the PCIe lanes from your motherboard to the SXM2 pins on the GPU. For that particular one you linked, you would also need an SXM2 V100 heatsink and thermal paste, as well as a fan like an Arctic P8 Max (high static pressure) to cool it.
Ok-Secret5233@reddit (OP)
Thank you!
Hedede@reddit
The issue with V100s is that they have very high idle power. I left 4x V100 idling for a day, and in 24 hours they consumed almost 10 kWh just from idling.
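For what it's worth, that figure is easy to sanity-check (the 10 kWh and 24 h numbers are from the comment; the electricity rate is my own rough assumption):

```python
# Back-of-envelope check of the reported idle draw: 4x V100, 10 kWh in 24 h.
energy_kwh = 10
hours = 24
num_gpus = 4
rate_usd_per_kwh = 0.16   # assumed average US residential rate

total_watts = energy_kwh * 1000 / hours        # average draw for the whole rig
watts_per_gpu = total_watts / num_gpus         # per-card idle draw
daily_cost = energy_kwh * rate_usd_per_kwh     # daily cost at the assumed rate

print(f"{total_watts:.0f} W total, {watts_per_gpu:.0f} W/GPU, ${daily_cost:.2f}/day")
# 417 W total, 104 W/GPU, $1.60/day
```

So roughly 100 W per card doing nothing, which is indeed very high idle power.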
FullstackSensei@reddit
Here's a crazy idea: shut the thing down when not in use. That will beat even the most frugal idle power consumption.
It takes 5 minutes at most to start up and load a model. You can use Wake-on-LAN, or IPMI if your board has it, to wake the system. Pair it with Tailscale or a VPN, and you can start it from anywhere on the planet.
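Wake-on-LAN doesn't even need special tooling; the "magic packet" is just 6 bytes of 0xFF followed by the target NIC's MAC address repeated 16 times, sent as a UDP broadcast. A minimal sketch (the MAC below is a placeholder; substitute your server's):

```python
import socket

def build_magic_packet(mac: str) -> bytes:
    """Wake-on-LAN magic packet: 6x 0xFF, then the MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local network."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(packet, (broadcast, port))

# wake("aa:bb:cc:dd:ee:ff")  # placeholder MAC; use your server's NIC address
```

Remember WoL usually has to be enabled in the BIOS and on the NIC first.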
SSOMGDSJD@reddit
Tailscale is so goated. Checking on Claude Code (wrong sub to be mentioning this, I know) on my local computer while I'm out and about is peak, except for mosh eating my terminal history lmao. Small grievances.
sekh60@reddit
Burn the heretic!
FullstackSensei@reddit
🤷🏻‍♂️
SSOMGDSJD@reddit
Fair point, but that's like a dollar per day at avg US residential rates for 4 GPUs.
xandep@reddit
2x MI50 16GB w/ integrated cooling for 200-something on Alibaba. Can run Qwen 3.5 35B, 27B, and the new Gemmas. Or just one if you are ultra cheap, running the 35B w/ ncmoe (some 27B and 26B quants if willing to quantize to Q3, IQ4 tops).
ccbadd@reddit
I think I would get a 32GB V620 for about $400 and add a $25 cooler just so I would only need 1 slot and have a card that is still officially supported by ROCm.
International-Try467@reddit
AMD GPUs work with ROCm and Vulkan
nakedspirax@reddit
Yeah they work.
For certain use cases like image/video generation, NVIDIA wins by a mile + some
Fit-Produce420@reddit
In speed.
I generate images and video to the max length supported by Wan, LTX, etc.
It just takes longer.
adeadfetus@reddit
That's the same as saying CPU and RAM are just as good as a GPU, except they take longer.
Fit-Produce420@reddit
No, it's not.
You can't run ROCm or Vulkan on your CPU+RAM.
You're completely ignoring how APUs work.
adeadfetus@reddit
You completely missed the point but ok.
RoomyRoots@reddit
ROCm got much better, like much, much better. Sure, that's partly because it was laughable some years ago, and there is still lots of room to grow, but if you get a compatible card it's not hard to set things up.
fallingdowndizzyvr@reddit
You won't get better price-to-performance than a V340: 16GB of VRAM for $49. And now with tensor parallelism (TP) in llama.cpp, you can TP across both GPUs on that card.
LankyGuitar6528@reddit
Where are you finding one for $49?
fallingdowndizzyvr@reddit
Ebay. There are a couple of sellers that sell them for $49. Don't pay more. While they claim they are "used", the seal on the static bag was intact on mine. And there wasn't a speck of dust on it or even any wear on the fingers. So mine seemed new.
https://www.ebay.com/itm/306835007605
LankyGuitar6528@reddit
Thanks!
Several-Tax31@reddit
Yeah, seems incredibly cheap to me. You cannot get regular RAM for those prices, no?
jpedlow@reddit
And don't forget the Intel B70 just got released.
semangeIof@reddit
I'm still amazed this card sold out its initial wave on Newegg so quickly. Even though it has 32GB of VRAM, it has low bandwidth and a fairly slow chip.
Intel cards also run inefficiently on Vulkan as of now, and SYCL is hardly mature. Some models run okay when Intel works directly with a vendor (e.g. Gemma 4 on vLLM), but you still get slower tok/s compared to even a legacy card like an RTX 3090 because of a) a super low-power chip, b) low memory bandwidth, and c) CUDA being so superior to Intel's ecosystem.
There is a reason you can trip over B60s wherever you look, and it is the same reason the B70s will not sell out again following their restock on April 24.
CelvestianNesy@reddit
Unfortunately, driver support is experimental, and we'll have to wait for Intel to add more support; finicky stuff. Good VRAM but, yeah.
no-adz@reddit
1300 euros!
leonbollerup@reddit
wait a min.. you are cheap.. and want to play with AI?!? HAHAHA...
Be like the rest of us.. be poor.. but with a shit-ton of cool hardware that we use to create pictures of... cats! ;)
overflow74@reddit
Okay, the Ascend hardware is really nice, but their software (the CANN toolkit) isn't really mature enough compared to CUDA. You'll find yourself struggling a lot with errors and wasting time fixing things that you wouldn't normally face with a normal Nvidia GPU. However, if you want, you could try it out first on Huawei's cloud with the MindSpore framework; they have a clone of everything for the Ascend hardware.
overflow74@reddit
In addition, you'll have limitations on what you can run, e.g. the vLLM Ascend supported models.
Also, quantization/training is usually supported only for a specific set of cards (don't remember the list exactly), so heads up haha.
Nexter92@reddit
You say "I am cheap"; if you are, then you are a slave? 😄
Maybe "I am poor", no?
666666thats6sixes@reddit
"I'm cheap" in this context means stingy, frugal, as in j'suis radin
Creepy-Bell-4527@reddit
No, he meant he is cheap. And often rich people are some of the cheapest bastards you'll ever know.