Get you some GPUs, it's not worth the hacks around lack of RAM

Posted by MotokoAGI@reddit | LocalLLaMA | 81 comments

https://preview.redd.it/w356ddr8ak4h1.png?width=550&format=png&auto=webp&s=f04238bf0d44f6defe58698c75f08d6c2581d4c2

https://preview.redd.it/nalt9p8mak4h1.png?width=550&format=png&auto=webp&s=b8fb2f366f176eab0003a5cc53e4736664d25659

If you can, get you some GPUs. All the hacks around limited VRAM are not worth the pain and effort, even if it means getting P40s or MI50s. Get enough GPU to have everything in memory.

Qwen3.6-27B, the dense model: Q8, f16 K/V cache, 128k context on 2 used 3090s. 1399 pp (prompt processing), 104 tg (token generation).
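For anyone wanting to sanity-check whether a setup like this keeps everything in VRAM, here is a rough back-of-the-envelope sketch. It is not from the original post, and the numbers in it are assumptions: roughly 8.5 bits per weight for Q8 (as in llama.cpp's Q8_0 format), a 10% headroom for compute buffers, and an attention layout (`n_attn_layers`, `n_kv_heads`, `head_dim`) that is purely hypothetical and not the real Qwen3.6-27B config. Swap in the values from the model's actual config before trusting the result.

```python
GIB = 1024 ** 3


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_attn_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_elem: int = 2) -> int:
    """K/V cache size: K and V each hold n_kv_heads * head_dim elements
    per attention layer per token (bytes_per_elem=2 for f16)."""
    return 2 * n_attn_layers * n_kv_heads * head_dim * bytes_per_elem * context_tokens


def fits(total_bytes: float, vram_per_gpu_gib: float, n_gpus: int,
         headroom: float = 0.9) -> bool:
    """True if the total fits in the pooled VRAM, keeping some headroom
    (assumed 10%) for compute buffers and fragmentation."""
    return total_bytes <= vram_per_gpu_gib * n_gpus * GIB * headroom


if __name__ == "__main__":
    # 27B parameters at ~8.5 bits/weight (assumption for Q8).
    weights = weight_bytes(27e9, 8.5)
    # HYPOTHETICAL attention layout, for illustration only.
    kv = kv_cache_bytes(n_attn_layers=16, n_kv_heads=4, head_dim=256,
                        context_tokens=128 * 1024)
    total = weights + kv
    print(f"weights: {weights / GIB:.1f} GiB, KV cache: {kv / GIB:.1f} GiB")
    print("fits on 2 x 24 GiB:", fits(total, 24, 2))
    print("fits on 1 x 24 GiB:", fits(total, 24, 1))
```

The takeaway is that the K/V cache scales linearly with context length, so at 128k context it can be a large slice of the budget on top of the weights. This only estimates weights plus cache; real usage also depends on the runtime and how layers are split across cards.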