Will the release of Intel's B70 32gb Card bring down prices of other 32gb cards?
Posted by Thanks-Suitable@reddit | LocalLLaMA | 40 comments
I am in the process of building up an LLM server using a ZimaBoard 2 with an eGPU dock. Right now I'm torn between getting the AMD 9700 AI Pro card, or waiting for prices to drop after the Intel card releases.
Thoughts?
No-Refrigerator-1672@reddit
Amount of VRAM is the secondary characteristic; the primary one is software compatibility. Both Intel and AMD could make a GPU ten times as performant as Nvidia's, and nobody would buy it, because it would take months of work to get proprietary software stacks running on it. They can only compete once they match CUDA's compatibility. However, once that happens, it's actually more likely for Intel and AMD prices to go up, rather than Nvidia's coming down.
Kilo3407@reddit
Why are you getting downvoted? I'm new to the space and tried to justify using a 16GB AMD card, but both ChatGPT and Claude pointed me strongly to NVIDIA for CUDA and a relatively headache-free experience.
Someone please enlighten me?
florinandrei@reddit
If you actually do development, as in: write PyTorch code, build models from code, train them, etc., then NVIDIA with CUDA should be your first preference, yes.
If all you do is inference, as in: you download LLMs from the internet to run them in Ollama, llama.cpp, vLLM, etc., then it doesn't matter. Use whatever works for you.
No-Refrigerator-1672@reddit
Nope, it doesn't work that way. A lot of OSS solutions come prepackaged with CUDA acceleration; e.g. RAGFlow uses AI to detect document markup in scans, and that can't be done over an API. Your options are to use CUDA acceleration, use the CPU, or spend god knows how much time repackaging the solution with AMD or Intel libraries. You will find cases like these all over GitHub.
florinandrei@reddit
Maybe a lot of obscure solutions.
The major inference providers work well on all important platforms. Little stragglers - eh, they always struggle.
No-Refrigerator-1672@reddit
Lol, RAGFlow, which I cited as an example, has almost as many GitHub stars as llama.cpp. Your definition of obscure is very loose.
BringMeTheBoreWorms@reddit
I think that's a good example of where you need to step out of the AI bubble and actually see for yourself. AMD cards are pretty seamless to get up and running, and for a fraction of the price.
Sure, you'll get more speed from a 5090, but 6x the cost of what I got a 7900 XTX for is just not worth it. Two XTX cards and I have 48 GB, and I'm having no issues.
Kilo3407@reddit
How does your setup work for local Gen AI video?
BringMeTheBoreWorms@reddit
I don't use it for that actually, so I couldn't say. I'd like to try but don't have the time right now.
No-Refrigerator-1672@reddit
Because people don't get how commerce works. They point their fingers at "but look, llama.cpp, ComfyUI and (sometimes) even vLLM work!" and think that's enough. IRL, people who buy in bulk often have their own software, either based on OSS or completely custom, and they think about the cost of finding a specialist who can code for another architecture, and the potential difference in man-hours to get it set up and running. Unless you're building a datacenter, hardware is cheap; specialist time is the real expense. This is the real reason why the top options from both AMD and Intel are 1/4 of Nvidia's price while not being 4 times slower.
BringMeTheBoreWorms@reddit
I just wonder what the support is like, and the memory bandwidth of ~600GB/s isn’t exactly mind blowing.
The AMD XTX has 24GB with 960GB/s, so two of those get you 48GB with higher memory bandwidth.
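A back-of-the-envelope way to see what that bandwidth buys you: single-user decoding is usually memory-bandwidth-bound, since each generated token streams roughly the whole set of active weights through the memory bus. The model size below is an illustrative assumption, not a benchmark.

```python
def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Theoretical decode ceiling for a bandwidth-bound GPU.

    Real throughput is lower due to compute, KV-cache reads, and overhead.
    """
    return bandwidth_gb_s / model_size_gb

# Assume a ~32B model quantized to roughly 18 GB of weights (illustrative).
print(max_tokens_per_sec(960, 18))  # 7900 XTX class: ~53 tok/s ceiling
print(max_tokens_per_sec(600, 18))  # ~600 GB/s card: ~33 tok/s ceiling
```

Note that splitting a model across two cards with naive layer-split doesn't double this ceiling, since the cards run their layers sequentially per token.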
RemarkableGuidance44@reddit
They are actually very good; there are already some good reviews out on them.
BringMeTheBoreWorms@reddit
From what I’ve seen so far it’s nothing mind blowing and seems to be similar to other cards with that memory bandwidth. Unless there’s been new builds and support released recently?
RemarkableGuidance44@reddit
They are a lot cheaper; I was able to get 128GB of VRAM for $4000 USD... Drivers are getting better and more AI users are buying them.
BringMeTheBoreWorms@reddit
Definitely good to have more alternatives to green. Just wish they’d gone for higher memory bandwidth to squash the gap between the other cards.
RemarkableGuidance44@reddit
Yeah, the cards are already sold out now. I should have got more. :P
BringMeTheBoreWorms@reddit
Also, having a motherboard that can accept 4+ cards is nice, but it adds to the cost.
ImportancePitiful795@reddit
The only 32GB card in that price bracket is the R9700, and I doubt it.
The 5090 is 4 times more expensive and more likely to go up than down.
So at this point people should consider buying on a budget and giving the finger to overpriced products. Namely NVIDIA.....
pfn0@reddit
The sad reality is, the 5090 offers like 4x the compute and 3x the memory bandwidth... so if you scale compute and bandwidth per dollar, it's justified.
ImportancePitiful795@reddit
It's NOT justified, because you can have 4x B70s for the price of a single 5090.
So what's better for running LLMs: 128GB of VRAM or 32GB?
pfn0@reddit
It's the good ol' adage: good, fast, cheap. Pick two.
super1701@reddit
Cheap, good. Every day. Speed is a luxury imo for local LLMs.
CalligrapherFar7833@reddit
Lol nvidia shill
ImportancePitiful795@reddit
Aye. Seems they want to stick to 4B-30B heavily quantised models, instead of running big models at BF16/FP16 locally.
someone383726@reddit
If you have some facts to refute the statement then do it
ImportancePitiful795@reddit
I can refute your statement EASILY.
Try to run a 70B Q4 model on a 5090 and let us know its speed 😁 Or Gemma 4 31B at FP16/BF16.
The 5090 still has 32GB of VRAM, and it will ALWAYS be slower than 4x B70s when filling up 128GB of VRAM.
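The VRAM math behind this claim is easy to sketch: weight memory is roughly parameter count times bits per weight. The bits-per-weight figures below are approximations (Q4-class quants land around 4.5 bits including metadata), and the sketch ignores KV cache and runtime overhead, which only make things worse for a 32GB card.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB; excludes KV cache and activations."""
    return params_billion * bits_per_weight / 8  # 1B params * 8 bits = 1 GB

print(weight_gb(70, 4.5))  # 70B at ~Q4: ~39 GB, already over a 5090's 32 GB
print(weight_gb(70, 16))   # 70B at FP16: 140 GB, needs multiple large cards
```

So a 70B model at Q4 spills into system RAM (or a second GPU) on any single 32GB card; the question is only whose multi-card setup is cheaper.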
Ok-Measurement-1575@reddit
Even the 3090 smashes it.
Puzzleheaded_Base302@reddit
The 5090 runs more than 4x faster than the B70 at the moment for the single-user LLM use case.
ImportancePitiful795@reddit
Try loading a 70B FP8 model or bigger on a 5090.
Dry_Sheepherder5907@reddit
I really hope so because this is simply unacceptable... the prices are high as hell
rawednylme@reddit
Until software is sorted out, the B70 will do nothing to the price of other cards.
Puzzleheaded_Base302@reddit
If you count token generation rate, the B70 is 1/3 of the RTX PRO 4500 32GB. And it's priced at 1/3 of the cost of the RTX PRO 4500.
At the end of the day, price matches final performance, not VRAM size.
ZealousidealShoe7998@reddit
The B70 has 32GB but doesn't work as well as an Nvidia 5090 32GB due to software optimization.
For prices to come down, everyone would have to shift to Intel, and Intel would have to spend months on performance updates through firmware and software support to make Nvidia drop prices.
AMD is another competitor that doesn't really move the needle much. They've had a 32GB card for a while, but it doesn't perform the same as a 5090.
So there's no real competition here: B70 vs AMD, AMD still has the advantage of more mature software, so they can hold their price fine.
Comparing anything against Nvidia is kinda pointless because their cards are in a league of their own.
If you want 32GB, just choose a card that fits your needs and go for it now. It's not like we're magically going to see new TPUs released next week with great software support and a decent amount of RAM that would make any graphics card obsolete enough to lose value.
IMO I'm probably gonna get an Intel card in the near future, but I'm also gonna try to help the community by improving the software support.
Ok-Measurement-1575@reddit
My thoughts?
Are there any real benchmarks yet? Not ones using 4B models, or Qwen's worst model ever, the 30B coder?
sittingmongoose@reddit
From what I’ve read people have had a hard time getting good performance out of the 9700.
To answer your question directly though, no. This isn't a "price is high because of demand" situation. The issue is that there is no memory supply and the cost of buying memory is extremely high. I'm willing to bet Intel isn't making money on these cards.
Mediocre_Paramedic22@reddit
No, if anything the price of the B70 will go up.
Radiant-Video7257@reddit
It's not going to do much if anything.
Woof9000@reddit
Intel isn't likely to have an impact on Nvidia's market share, but they might eat some of AMD's share, so the R9700 could get a bit of a discount. But if it does, probably not a lot; those aren't massively popular anyway.
HopePupal@reddit
Doubt it. The Intel B70's already out, and besides, why would any manufacturer reduce prices in this market?
Thanks-Suitable@reddit (OP)
Wishful thinking ig 🥲