Strip Qwen3.6 dense of its multimodal capabilities

Posted by redblood252@reddit | LocalLLaMA | 27 comments

This may be naive, but if we stripped a model of its image-processing and voice-processing capabilities, would that make it smaller or faster? Is that even possible? Does the answer differ between MoE and dense models?

If it is possible, why isn't it done for popular models?