Best model for RTX 3060 Ti, 32 GB DDR5 RAM?
Posted by Jacket124@reddit | LocalLLaMA | View on Reddit | 6 comments
Thank you in advance
Sadman782@reddit
Gemma 4 26B A4B IQ4_XS, with partial offloading to system RAM.
raketenkater@reddit
Yes. With the active experts kept on the GPU, this tool handles that, and it also auto-tunes the flags: https://github.com/raketenkater/llm-server
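For context, the "experts on CPU, everything else on GPU" split can also be done by hand with llama.cpp's `llama-server`. A minimal sketch, assuming a GGUF download of the model mentioned above (the filename and context size are illustrative, and the tensor regex is the commonly used pattern for MoE expert FFN weights):

```shell
# Sketch, not a tuned configuration:
# -ngl 99 offloads all layers to the GPU, then --override-tensor (-ot)
# pins the MoE expert FFN tensors to system RAM. Because only the small
# set of active experts is computed per token, this is what lets a
# 26B-A4B model run on an 8 GB card like the 3060 Ti.
llama-server \
  -m ./gemma-4-26b-a4b-IQ4_XS.gguf \
  -ngl 99 \
  -ot ".ffn_.*_exps.=CPU" \
  -c 8192
```

The linked tool automates roughly this kind of tuning instead of you picking the flags yourself.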
Jacket124@reddit (OP)
Is ollama also ok?
raketenkater@reddit
Yeah, it's OK, but not the fastest.
Jacket124@reddit (OP)
So I should use this tool?
Jacket124@reddit (OP)
Will it be fast enough? I've seen someone using Gemma 4 26B A4B on a better system, and it took unusably long to generate anything. Or is this a special version?