luckyj
-
Can't get over 250 TPS on RTX 5090 with Qwen3.5-4B
Posted by luckyj@reddit | LocalLLaMA | 30 comments
-
Problem parsing thinking tokens on Open WebUI with Qwen3.6 on LM Studio
Posted by luckyj@reddit | LocalLLaMA | 6 comments