Qwen3-VL vs Qwen 3.5/3.6 for vision — worth keeping the old weights?
Posted by nikhilprasanth@reddit | LocalLLaMA | 7 comments
Quick question for those who’ve used both extensively:
Has the Qwen3-VL series basically been fully superseded by the newer 3.5/3.6 models for vision tasks?
In other words, is there still any practical reason to keep the older Qwen3-VL weights around, or are the newer series better enough across the board that the old ones can be deleted without regret?
I’m mainly asking from a local-use perspective where storage matters, so I’m curious whether anyone still finds the old VL weights meaningfully useful for any niche cases.
lucasbennett_1@reddit
qwen3.5/3.6 supersedes qwen3 vl for most vision tasks. newer models handle ocr, spatial reasoning, and multi-image contexts better. only reason to keep old vl weights is if you've got fine-tuned versions or specific prompts that work better on the old architecture. otherwise it's safe to delete, 3.5/3.6 is the current standard for vision work.
nikhilprasanth@reddit (OP)
I will be archiving the older ones. With gemma and qwen being multimodal by default now, I guess there is no point in keeping the older ones.
Objective-Stranger99@reddit
Yes, 3.5 and 3.6 have surpassed 3-VL in most metrics.
FoxiPanda@reddit
At this point I have archived my Qwen3-VL series models out to slow NAS storage.
The Qwen3.5/3.6/Gemma-4 families seem to outperform Qwen3-VL for my use cases.
nikhilprasanth@reddit (OP)
Yeah, I'm thinking along the same lines.
Woof9000@reddit
I wouldn't, or at least I'm planning to remove those when I run out of space. I've not seen or heard of an instance, or a use case, where the old qwen3-vl would be better than the new ones.
nikhilprasanth@reddit (OP)
Yeah, same here. Unless there’s some specific edge case where old Qwen3-VL is better, I’ll probably delete it once I need the space.