PermanentLiminality
-
Is speculative decoding available with the Qwen 3.5 series?
Posted by PermanentLiminality@reddit | LocalLLaMA | 8 comments
-
How can we run Qwen3-omni-30b-a3b?
Posted by PermanentLiminality@reddit | LocalLLaMA | 45 comments
-
CPU only performance king Qwen3:32b-q4_K_M. No GPU required for usable speed.
Posted by PermanentLiminality@reddit | LocalLLaMA | 24 comments
-
Poorman's VRAM or how to run Llama 3.1 8B Q8 at 35 tk/s for $40
Posted by PermanentLiminality@reddit | LocalLLaMA | 26 comments