Worth it to buy a modded 22GB 2080ti for LLMs?
Posted by define_undefine@reddit | LocalLLaMA | 40 comments
I can get my hands on a 2080Ti with 22GB of VRAM instead of the original 11GB, for around $500/£400. Would you say this is worth it / the markup is fair given that unmodded variants can be had for slightly over half the price?
Right now I'm using A4000s, which have been okay, but the 16GB limit is sometimes difficult to work around when running larger models. If I do buy the modded 2080 Ti, I plan on getting two and connecting them via NVLink for better performance.
Given the emerging trend of better-performing models needing less and less VRAM (Phi, Mixtral, etc.), one concern is buying this now and not needing it in a few months' time.
Would appreciate any pros/cons that I might not have considered! Thanks
ntkwwwm@reddit
Did you get the two 2080ti 22gb? How is it working?
jeromeibanes@reddit
I ordered one from customgpu_official, I'll find out soon!
XDjBSetp@reddit
how is it?
jeromeibanes@reddit
It works as expected. In hindsight, I should probably have gotten a 3090 Ti; I think there are some around $850 on eBay. I put it in an eGPU case, so I use it from my Lenovo laptop via Thunderbolt 3. I had some difficulty finding a cable long enough (6ft), but I managed to find one that works. Happy to provide more details, but overall these cards work well. They are likely refurbished cards (they may have been used to mine crypto in the past, who knows, who cares); they're not new, but they work as advertised. I've only used it for ~10 hours so far, and the payload was mostly Stable Diffusion (from Linux). Let me know if you want me to try anything else. I haven't tried connecting a monitor to the card, although I see no reason why it wouldn't work as a normal GPU.
GasLongjumping9671@reddit
You still got it? How is it holding up in 2026?
jeromeibanes@reddit
I still have it, I haven't used it for AI in a few weeks but I have it in one of my desktops, and it works fine.
Status_Contest39@reddit
Try Qwen 34B Q4 inference; the quantized model should be below 20GB, so it fits on a single RTX 2080 Ti 22GB.
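For anyone who wants to try that, here is a minimal sketch with llama-cpp-python, assuming a local Q4 GGUF file (the path here is a placeholder for whatever quant you download):

```python
from llama_cpp import Llama

# Hypothetical local file; a 34B model at Q4 is roughly 19-20GB on disk.
llm = Llama(
    model_path="./qwen-34b-q4_0.gguf",
    n_gpu_layers=-1,  # offload every layer to the single 22GB card
    n_ctx=4096,       # keep context modest so the KV cache fits in the headroom
)

out = llm("Explain NVLink in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```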
rtx2080ti44gb@reddit
I am currently using an RTX 2080 Ti 22GB happily and am buying and shipping a second one. I plan to connect the two cards via NVLink. My setup: 2080 Ti 22GB (PCIe 5.0 x8) + 2080 Ti 22GB (PCIe 4.0 x4), with two monitors (HDMI, DP) driven by either the integrated graphics (i5-13500) or GPU 0 (the 2080 Ti). Two questions:
1. Will there be any problems with this setup?
2. NVLink isn't required, but does performance improve when it's connected? What is the speed difference with and without NVLink?
Please let me know your answers to questions 1 and 2. Brothers, please help me
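One way to answer question 2 empirically is to confirm the bridge is actually active and then benchmark with and without it. A small sketch using pynvml (`pip install nvidia-ml-py`); link indices a card doesn't expose raise an error and are skipped:

```python
import pynvml

pynvml.nvmlInit()
for gpu in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu)
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):  # older pynvml versions return bytes
        name = name.decode()
    for link in range(pynvml.NVML_NVLINK_MAX_LINKS):
        try:
            state = pynvml.nvmlDeviceGetNvLinkState(handle, link)
        except pynvml.NVMLError:
            continue  # this link doesn't exist on the card
        active = state == pynvml.NVML_FEATURE_ENABLED
        print(f"GPU {gpu} ({name}) link {link}: {'active' if active else 'inactive'}")
pynvml.nvmlShutdown()
```

`nvidia-smi nvlink --status` gives the same information from the command line.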
GasLongjumping9671@reddit
Any update?
Firm-Customer6564@reddit
You tried?
blackenswans@reddit
Why would anyone get this over a used 3090?
Larimus89@reddit
Old post but yeah, for sure, it's not that much cheaper than a used 3090. It's also probably not much faster than a good CPU with a shit ton of high-speed DDR RAM, which lets you load way bigger models for cheaper. For LLMs I think no GPU solution beats that on value, or just get a 3090 if you want very fast responses.
define_undefine@reddit (OP)
I can only get a used 3090 for around $890/£700. That's one GPU with 24GB of VRAM vs. two GPUs with 44GB of VRAM total for slightly above the price of a single 3090.
AnonymousCrayonEater@reddit
Something you are overlooking is compatibility and help from the community. Far more people have done what you are trying to do with 1 or 2 3090s.
mvvagner@reddit
To be fair, the 2080 Ti is a very capable card. I use mine for LLMs and Stable Diffusion all the time and it's snappy enough for me. OP's point is that a 3090 is much more expensive. Unless by "help from the community" you mean the community offering up money to buy a more expensive GPU, I don't see the point you're trying to make.
AnonymousCrayonEater@reddit
For dealing with edge cases and compatibility issues it’s usually best to go with the most commonly used configuration. That’s all I was referring to. You can get up and running with a 2080 no problem.
mvvagner@reddit
I see what you're saying
Status_Contest39@reddit
too expensive
AspectSpiritual9143@reddit
Can't say if that's worth it, but I bought a modded 2080 Ti in China recently for GPU virtualization, and the price was 2400 yuan, so less than $350. The price is low because there are a lot of modders for this card, from even before the recent 4090 ban for model training. That's a big enough price gap that you could consider the risk of importing one from China directly.
supergoob29@reddit
was the card good?
AspectSpiritual9143@reddit
I use it for hosting multiple gaming VMs, and for that it has been fine.
mr-prez@reddit
Do you host those VMs concurrently or just one at a time?
AspectSpiritual9143@reddit
I usually just use one, since I'm the only player, but I have tested it and I can run two gaming VMs concurrently. The whole point of the setup is to let a friend come over and play together.
Mel_Gibson_Real@reddit
I've been looking to do the exact same thing. How's your performance been? Any idea whether the extra VRAM helps for titles that can run on half a 2080?
mr-prez@reddit
To clarify: I mean to ask if you're using the same 2080 Ti for both VMs at the same time? Or are you using multiple graphics cards to achieve that?
Status_Contest39@reddit
If this modded RTX 2080 Ti is a turbo (blower) version, it is most likely a former mining card. It needs a health test, otherwise it can easily produce screen artifacts, blue screens, and black screens under heavy load. The modded video memory gives off a lot of heat, and if the cooling system is not robust enough the memory runs hot and becomes unstable. I have seen several people run into problems within a few weeks of buying one; of course, there are also cases where the card ran smoothly after half a year of use. So buying this modded 22GB graphics card is not only a lottery but also an exciting gamble on luck.
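If you do buy one, a rough burn-in is easy to script. A sketch assuming PyTorch with CUDA plus pynvml: it hammers the card with fp16 matmuls while logging the core temperature so you can spot throttling or instability. (NVML only reports the core temperature on these cards; the modded memory's temperature usually needs a tool like GPU-Z to read.)

```python
import time
import torch
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

# Two large fp16 matrices keep both the cores and the memory busy.
a = torch.randn(8192, 8192, device="cuda", dtype=torch.float16)
b = torch.randn(8192, 8192, device="cuda", dtype=torch.float16)

start = time.time()
while time.time() - start < 600:  # ten-minute soak; extend for a real burn-in
    c = a @ b
    torch.cuda.synchronize()
    temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
    print(f"core temp: {temp} C", end="\r")

pynvml.nvmlShutdown()
```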
noth606@reddit
Ehm, obviously buying a hacked, semi-DIY, modified, unsupported GPU is risky; that's built into the whole proposition. Anyone who doesn't understand that shouldn't be touching a screwdriver to begin with.
Expecting it to work flawlessly like a brand-new A-class brand-name product in every respect would be childish and dumb. It may need extreme cooling, it will defo need attention with drivers, and it may need to be babied in other ways - it's a hot rod GPU. Obviously buying it is a gamble: no support, no warranty, you're on your own. But the potential upside is huge.
Temporary_Payment593@reddit
Yes, definitely worth it! It's been reported that the 2080 Ti 22GB performs almost the same as a 4090 when running a 6B LLM. Moreover, the 2080 Ti supports NVLink! You can get 44GB of VRAM by connecting two 2080 Tis, for only about $1000.
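Pooling the two cards' memory for one model looks roughly like this with Hugging Face transformers + accelerate (the model ID is just a placeholder; substitute whatever checkpoint you intend to run):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-2-13b-hf"  # placeholder model ID
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.float16,
    device_map="auto",  # accelerate shards the layers across both GPUs
)

inputs = tok("Hello", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```

Note that `device_map="auto"` splits the model layer-by-layer, so it pools VRAM rather than doubling speed; NVLink matters more for tensor-parallel backends.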
Cyber-exe@reddit
An NVLink bridge costs $100-200 and is restrictive on how far apart you can space the cards. What exactly does this accomplish?
SnooSongs5410@reddit
You can buy two 2080 Tis w/ 22GB for the price of a single 3090. I'm not sure how it makes sense to buy the 3090; if you're buying a second-hand card anyway, the 2080 Ti seems like a lot more bang for the buck. With no NVLink on the 40 series, it's pretty pointless buying into them. And no PCIe 5.0 until at least the 50 series, and probably crippled even then.
My question is for anyone running the card(s): any issues with software compatibility? What kind of performance are you seeing, single and paired? How big a model can you load with the 44GB - can you load a 70B Llama, or is it more like 7B?
Thanks, snoo
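On the 44GB question, a back-of-the-envelope estimate (assuming ~4.5 effective bits per weight for a q4_0 quant) suggests a 70B does just fit:

```python
params = 70e9
bits_per_weight = 4.5  # approximate effective rate for q4_0
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"~{weights_gb:.0f} GB of weights")  # ~39 GB, leaving ~5GB for KV cache
```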
Status_Contest39@reddit
I used 2x P40s to run the Llama 3 70B q4_0 model at a steady 8~9 t/s. I think 2x RTX 2080 Ti should work as well, maybe faster. I'm wondering whether to switch to those, and I'm looking for anyone already running them with LLMs.
Cyber-exe@reddit
The 2080 Ti would run ~78% faster based on memory bandwidth. I'm considering this because the P40s need noisy, space-consuming cooling rigs, and the speeds look too slow to make 3x work unless you really don't mind waiting around for the sake of running really large models.
You said you get 8-9 t/s, so let's say 8.5 average. With 78% more speed you reach ~15 t/s, and possibly more depending on how overclockable the memory is; a 10% memory OC could push that to ~16.5 t/s.
The issue with 3090s is they all need 3x 8-pin connectors; you can't do a single 2x8 daisy-chain cable. You just might fit an extra 2080 Ti before the PSU runs out of cables or sheer wattage falls short.
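As a sanity check on the scaling math above, assuming spec-sheet bandwidths of ~347 GB/s (P40) and ~616 GB/s (2080 Ti) and that batch-1 inference is roughly memory-bandwidth-bound:

```python
p40_bw, ti_bw = 347, 616       # GB/s, spec-sheet numbers
p40_tps = 8.5                  # midpoint of the reported 8-9 t/s
est = p40_tps * ti_bw / p40_bw
print(f"estimated 2080 Ti 22GB: ~{est:.1f} t/s")       # ~15.1 t/s
print(f"with a 10% memory OC:  ~{est * 1.1:.1f} t/s")  # ~16.6 t/s
```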
RayIsLazy@reddit
If it were cheaper, maybe, but at $500 you might as well try to find a 3090.
Frosty_FoXxY@reddit
Please tell me where you are finding 3090s for under $600, because that's impossible here in the US, even used.
tensorwar9000@reddit
Yes, I would totally do it. I own a 3090 and a 2080 Ti, and I was looking to upgrade the 2080 Ti; to my shock, it was still faster than the 4060 Ti 16GB. The 2080 Ti will last you a decade in AI. I've had mine since 2019 and I use it every day. Too bad it only has 11GB of VRAM!
CoffeeSnakeAgent@reddit
What setup do you have? And what models do you use? I have a 2080 Ti on a Threadripper 2950. I'm just about to embark on this journey.
tensorwar9000@reddit
On my 2080 Ti I'm running vLLM with OpenChat 3.5 in AWQ 4-bit mode, getting 70-100 tokens per second.
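For reference, that setup looks roughly like this with vLLM's offline API; the model ID is an assumption (a community AWQ quant), so swap in whichever checkpoint you actually use:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/openchat_3.5-AWQ",  # assumed 4-bit AWQ checkpoint
    quantization="awq",
    dtype="half",  # Turing (2080 Ti) has no bf16 support, so use fp16
)

params = SamplingParams(temperature=0.7, max_tokens=128)
for out in llm.generate(["What makes AWQ inference fast?"], params):
    print(out.outputs[0].text)
```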
Status_Contest39@reddit
the speed is impressive
deleted_by_reddit@reddit
I wonder if anyone does an RTX 3090 48GB mod, and how much it would cost.
Status_Contest39@reddit
The BIOS mod didn't work; someone has tried it.