How to configure self-speculative decoding properly
Posted by milpster@reddit | LocalLLaMA | 6 comments
So now that we have self-speculative decoding for Qwen 3.6 in llama.cpp, I was wondering if anyone has advice on configuring it properly.
6 Comments
srigi@reddit
lum4chi@reddit
srigi@reddit
Jester14@reddit
qubridInc@reddit
Objective-Stranger99@reddit