5060ti 16gb or 5070 12gb for local LLM
Posted by soteko@reddit | LocalLLaMA | View on Reddit | 11 comments
As the title says, which is better, taking into consideration that it will probably offload to the CPU anyway?
Models: Qwen 3.6 35B, and maybe Qwen 3.6 27B, though I'm not sure that one will be usable...
CPU: 5700X with 32GB DDR4
cleversmoke@reddit
I'd personally go with the 5060 Ti 16GB. It's a great start, and if the motherboard allows, you can add another 5060 Ti 16GB later. While the memory bandwidth isn't the same as an RTX 5090's, 2x 5060 Tis are an affordable path to 32GB of VRAM.
soteko@reddit (OP)
I have another slot but it is x4; the motherboard is an ASUS TUF GAMING B550M-PLUS.
KURD_1_STAN@reddit
If you're not spilling into RAM, that shouldn't have an impact beyond the model's initial load.
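The point above can be sketched with rough arithmetic: a narrower slot mainly slows the one-time transfer of model weights into VRAM, not inference itself. A minimal sketch, assuming PCIe 4.0 theoretical per-lane bandwidth and an assumed ~18GB quantized model (real transfers are somewhat slower than the theoretical maximum):

```python
# Rough sketch: extra model-load time over a PCIe 4.0 x4 slot vs x16.
# 18GB is an assumed size for a ~30B-class model at Q4 quantization.
def load_seconds(model_gb: float, lanes: int, per_lane_gbs: float = 1.97) -> float:
    """PCIe 4.0 is ~1.97 GB/s per lane (16 GT/s with 128b/130b encoding)."""
    return model_gb / (lanes * per_lane_gbs)

model_gb = 18.0
print(f"x16: {load_seconds(model_gb, 16):.1f}s")  # roughly half a second
print(f"x4:  {load_seconds(model_gb, 4):.1f}s")   # a few seconds
```

Either way the difference is seconds at startup, which is why the x4 slot is a non-issue once the weights are resident in VRAM.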
horeaper@reddit
Try 7900XT
soteko@reddit (OP)
Well, I can't find one new or even used here.
horeaper@reddit
That's sad. My suggestion is to wait for the 9070 (non-XT) to reach a more reasonable price. The 5060 Ti is just too slow, and the 5070's 12GB of VRAM is definitely not enough these days; you'd be better off spending all that money on the DeepSeek API.
Mashic@reddit
16GB.
jacek2023@reddit
Think about how to get two 16GB cards.
soteko@reddit (OP)
I have another slot, x4, on the ASUS TUF GAMING B550M-PLUS.
Is it OK to put another 5060 Ti in that slot?
jacek2023@reddit
it's worth trying
Sad-Duck2812@reddit
I have seen people get a decent number of tokens even with 12GB, something like 60 tok/s. I have also tested it on a 5070 12GB and managed 58-60 tok/s with CPU offload.
In my opinion, get the 5060 Ti 16GB. It's a very good budget GPU for AI models, and at 16GB you can even fit some models into it completely. Even if you have to offload, it's better to fit as much of the model in the GPU as you can.
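The "fit as much in the GPU as you can" advice can be made concrete with a back-of-envelope layer count, the kind of figure you would pass to a runner's GPU-layers option. All numbers here (48 layers, ~18GB model, 2GB overhead for KV cache and CUDA context) are assumptions for illustration, not measurements:

```python
# Back-of-envelope: how many transformer layers of an assumed Q4-quantized
# ~30B model fit in a given VRAM budget.
def layers_that_fit(vram_gb: float, n_layers: int = 48,
                    model_gb: float = 18.0, overhead_gb: float = 2.0) -> int:
    """overhead_gb covers KV cache, CUDA context, etc. (assumed)."""
    per_layer = model_gb / n_layers          # ~0.375 GB per layer here
    usable = max(vram_gb - overhead_gb, 0.0)
    return min(n_layers, int(usable / per_layer))

for vram in (12, 16):
    print(vram, "GB ->", layers_that_fit(vram), "of 48 layers on GPU")
```

Under these assumptions the extra 4GB buys roughly a dozen more layers on the GPU, which is where the 16GB card's tok/s advantage over the 12GB card comes from once a model no longer fits entirely in VRAM.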