Titan RTX
Posted by Brooklyn5points@reddit | LocalLLaMA | View on Reddit | 20 comments
What's the deal with these cards? They have more tensor cores than a 4090 and the same amount of RAM, and they're going used for $700. They also have NVLink, which is what the H100s use.
The only shortcoming is the CUDA core count. But can we just use the tensor cores? Can someone in hardware explain?
DeltaSqueezer@reddit
Old Turing generation, with older, slower tensor cores.
Brooklyn5points@reddit (OP)
Has anyone tried using several of them? The NVLink seems like a great advantage. I have 3x 3080s right now, working hard. I can get a 70B model to run, but it's more email than conversation.
DeltaSqueezer@reddit
The 3090 is faster than the 3080 and has 24GB of VRAM. It also has faster NVLink than Turing.
No-Comfortable-2284@reddit
The 3090 is also not 2 slots unless you buy the very expensive Turbo models or the Chinese blower-style cards they sell for way too much. The Titan RTX is still cheaper than a 3090 and is 2 slots. If you care about space efficiency in a workstation or rack, it still has value.
DeltaSqueezer@reddit
You can convert any GPU to a narrower one if you want. But a better way would be to keep it 3 slots and put it on a dedicated PCIe switch.
No-Comfortable-2284@reddit
I wouldn't buy any generation older than Ada. No FP8 is a punch in the face for anyone doing more than single-user inference.
Herr_Drosselmeyer@reddit
There may be more, but they're not the same architecture (2nd-gen vs. 4th-gen tensor cores). It's a bit tricky to find reliable information; I've read anything from the 4090 being slightly better in tensor TFLOPS to twice as fast... it's annoying.
Suffice it to say that with the newer architecture, faster VRAM, more CUDA cores, etc., there's no doubt the 4090 easily outperforms the Titan. The same question came up vs. the 3090 back in the day, and that card outperformed it too.
Still, the Titan was an absolute beast in its day, and it's still usable seven years after launch. Of course, it did cost a ridiculous $2,500 at launch, which would be about $3,200 in today's money. Who would be crazy enough to pay that much for a graphics card?
Oh, wait...
Brooklyn5points@reddit (OP)
The VRAM on the Titan is 24 GB of GDDR6, the same capacity as the 4090. The CUDA core count is much lower; that seemed like the only drawback.
No-Comfortable-2284@reddit
Another drawback is the architecture. The 4090 supports later formats such as FP8 and BF16, while Turing is stuck on FP16. Turing cards also don't support FlashAttention-2, which is quite handy for long LLM context. If you're just using the cards for single-user inference via Ollama etc., it doesn't matter too much, since you can just load AWQ or GGUF 8-bit/4-bit quantized models. But if you're using these cards for multi-user deployment via vLLM etc., you'll want native hardware support for these newer features to save an immense amount of VRAM on model weights and KV cache.
I personally have 2 RTX Pro 4500 32GB Blackwell cards and a DGX Spark for vLLM deployment, and 2 Titan RTX cards in NVLink for personal single-user inference tasks.
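For anyone scripting around this, the architecture cutoffs above map onto CUDA compute capability: Turing is 7.5, Ampere (3090) is 8.0, Ada (4090) is 8.9. A minimal sketch of that mapping (the helper name and cutoff table are my own; on a real machine you'd feed it `torch.cuda.get_device_capability()`):

```python
# Rough tensor-core dtype support by CUDA compute capability:
#   7.5  (Turing, Titan RTX)  -> FP16 only
#   8.0+ (Ampere, RTX 3090)   -> adds BF16
#   8.9+ (Ada, RTX 4090)      -> adds FP8
def supported_dtypes(major: int, minor: int) -> list[str]:
    cc = major + minor / 10
    dtypes = ["fp16"]
    if cc >= 8.0:
        dtypes.append("bf16")
    if cc >= 8.9:
        dtypes.append("fp8")
    return dtypes
```

This is why a Titan RTX forces an FP16 (or pre-quantized) path: `supported_dtypes(7, 5)` gives only `["fp16"]`, while an Ada card also gets BF16 and FP8.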
Minimum_Inevitable58@reddit
I'm not sure how much it matters, but the memory bandwidth is over 300 GB/s lower too (about 672 GB/s vs. the 4090's 1008 GB/s). This benchmark graph of GPUs running Stable Diffusion might give an idea of the performance differences.
https://cdn.mos.cms.futurecdn.net/RtAnnCQxaVJNYgA4LbBhuJ.png
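Bandwidth matters directly for LLM generation speed: each generated token has to stream the full set of weights from VRAM, so bandwidth divided by model size gives a rough ceiling on single-stream tokens/sec. A back-of-envelope sketch (bandwidths are datasheet figures; the 20 GB model size is just an illustrative assumption):

```python
def decode_tps_ceiling(bandwidth_gb_s: float, model_size_gb: float) -> float:
    # Single-stream decode reads every weight once per token, so
    # bandwidth / model size approximates the best-case tokens/sec.
    return bandwidth_gb_s / model_size_gb

# Datasheet bandwidths: Titan RTX ~672 GB/s, RTX 4090 ~1008 GB/s.
# For a hypothetical 20 GB quantized model:
titan = decode_tps_ceiling(672, 20)   # ~33.6 tok/s ceiling
ada = decode_tps_ceiling(1008, 20)    # ~50.4 tok/s ceiling
```

Real throughput lands below these ceilings, but the ratio between cards tracks the bandwidth ratio fairly well for decode-bound workloads.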
I don't pay much attention to anything I can't afford, but the difference between the 4090 and the other models, even the 4080, is absurd.
https://technical.city/en/video/TITAN-RTX-vs-GeForce-RTX-4090
https://technical.city/en/video/GeForce-RTX-4080-vs-GeForce-RTX-4090
There are some very drastic differences in the specs. I'm not sure which specs actually matter for AI, but just look at the transistor-count difference. 'Floating-point processing power', ROPs, clock speed, TMUs; and you mentioned it already, but that is an insane difference in CUDA core counts.
https://old.reddit.com/r/LocalLLaMA/comments/1br6yol/myth_about_nvlink/
I just looked up NVLink, so you might know this already, but it looks like it's only really useful for training. If that's what you want it for, I'd try to find speed-increase numbers specifically for the Titans, or for whatever else uses that architecture; the Turing Quadros should also have NVLink and have probably had more training benchmarks run than the Titan. People report a 30-40% increase in training speed for NVLinked 3090s. Assuming it's similar for the Titan RTX, and if you only care about training, they're probably reasonably priced. Still, the 3090 would be better overall.
https://old.reddit.com/r/MachineLearning/comments/jhof1z/d_simple_benchmarks_of_rtx_3090_vs_titan_rtx_for/
https://github.com/eugeneware/benchmark-transformers
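The reason NVLink helps training but not single-user inference is the gradient all-reduce in data parallelism: every step, each GPU exchanges roughly the full gradient tensor over the link. A rough per-step cost model (ring all-reduce; the 50 GB/s and 16 GB/s link rates are assumed per-direction figures for NVLink and PCIe 3.0 x16, not measurements):

```python
def allreduce_seconds(grad_bytes: float, n_gpus: int, link_gb_s: float) -> float:
    # Ring all-reduce pushes about 2*(n-1)/n of the gradient payload
    # through each GPU's link on every training step.
    payload = 2 * (n_gpus - 1) / n_gpus * grad_bytes
    return payload / (link_gb_s * 1e9)

# Hypothetical: 7B params with fp16 gradients (~14 GB), 2 GPUs.
nvlink = allreduce_seconds(14e9, 2, 50)  # ~0.28 s/step over ~50 GB/s NVLink
pcie = allreduce_seconds(14e9, 2, 16)    # ~0.88 s/step over ~16 GB/s PCIe
```

That per-step gap compounds over thousands of steps, which is consistent with the 30-40% training speedups people report; inference has no gradient sync, so the link barely matters there.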
SwingNinja@reddit
I've been googling 3090, 4090, and Titan cards for the past few days. It depends on what you want and your budget, I guess. The Titan is less well known than the 3090, and I think that contributes to its lower price. 3D rendering and gaming are slower, but you could save up to 200 bucks, plus electricity costs, since it uses lower wattage. This is a good benchmark report.
https://technical.city/en/video/TITAN-RTX-vs-GeForce-RTX-3090
smcnally@reddit
Form factor and compatibility: the Titan's 2-slot width vs. the 3090's 3 slots makes a difference to me. I hope it runs smoothly and cool.
Araiebowhi@reddit
Hard to say they're worth the current price. I'm still running an older Titan Xp, but back then the difference from the normal cards was a lot bigger. I figured once the Titan RTX drops in price some more it might be worth it, but newer-gen cards are still around the same price.
Also, I haven't found a game yet for which I absolutely need a new card. Everything still runs at 1440p on high settings at an average of 40-60+ fps, depending on the game. (AI upscaling is a garbage excuse for devs not to optimize games, and the consumer side just eats it up.)
epycguy@reddit
The Titan Xp is going used for $100-$200 each, sir. Don't get ripped off.
Minimum_Inevitable58@reddit
The Titan Xp is 2 generations older than what OP is talking about. The Xp is basically a better GTX 1080; it's a GTX card, not RTX.
epycguy@reddit
Oh, I thought it was the newest one; I didn't realize there was also an RTX Titan, lol. They've used this name way too many times.
HiddenoO@reddit
It performs ~22% worse than a 3090 in gaming, so don't expect it to be any better than that for AI.
Brooklyn5points@reddit (OP)
But it has over 500 tensor cores vs like 70 in a 3090. This isn't gaming.
HiddenoO@reddit
The 3090 has 328 tensor cores, not "like 70", and the majority of operations are still done on the shading units, of which the Titan has roughly half.
Brooklyn5points@reddit (OP)
OK, very thorough. It's just an interesting card; they put so much into them. And it took until the 50-series to get back to that many tensor cores. Performance is another story. Thank you.