Looking for Suggestions — Single 5090 & 64gb DDR5
Posted by icedgz@reddit | LocalLLaMA | 33 comments
Hi Reddit,
I'm planning to run Qwen 3.6 27B NVFP4 via vLLM on my 5090, but I'm wondering whether something like 35B-A3B at Q8 on Llama would give better results for agentic coding and make use of the system memory. My research says no, but if that’s the case, what would y'all do to put the system memory to use?
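For a rough sense of why the second option would lean on system memory, here is a back-of-the-envelope sketch of the weight sizes. It assumes nominal 4 and 8 bits per weight (real NVFP4 and Q8 files carry some extra per-block scale overhead), counts weights only (no KV cache or activations), and takes the 5090's 32 GB of VRAM as the budget. The helper name `weight_gib` is just made up for illustration.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


VRAM_GIB = 32  # RTX 5090; treated here as the whole budget (assumption)

# Nominal bit widths; actual quantized files are somewhat larger.
candidates = {
    "27B dense at ~4 bits (NVFP4)": weight_gib(27, 4),
    "35B-A3B at ~8 bits (Q8)": weight_gib(35, 8),
}

for name, size in candidates.items():
    verdict = "fits in VRAM" if size < VRAM_GIB else "spills into system RAM"
    print(f"{name}: ~{size:.1f} GiB of weights, {verdict}")
```

Under these assumptions the 4-bit 27B leaves a good chunk of VRAM free for context, while the Q8 35B-A3B is already past 32 GB on weights alone, so part of it would have to sit in DDR5.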