Qwen3-VL vs Qwen 3.5/3.6 for vision — worth keeping the old weights?
Posted by nikhilprasanth@reddit | LocalLLaMA | 7 comments
Quick question for those who’ve used both extensively:
Has the Qwen3-VL series basically been fully superseded by the newer 3.5/3.6 models for vision tasks?
In other words, is there still any practical reason to keep the older Qwen3-VL weights around, or are the newer series better enough across the board that the old ones can be deleted without regret?
I’m mainly asking from a local-use perspective where storage matters, so I’m curious whether anyone still finds the old VL weights meaningfully useful for any niche cases.
lucasbennett_1@reddit
qwen3.5/3.6 supersedes qwen3 vl for most vision tasks. newer models handle ocr, spatial reasoning, and multi-image contexts better. only reason to keep old vl weights is if you've got fine-tuned versions or specific prompts that work better on the old architecture. otherwise it's safe to delete, 3.5/3.6 is the current standard for vision work.
nikhilprasanth@reddit (OP)
I will be archiving the older ones. With gemma and qwen being multimodal by default now, I guess there is no point in keeping the older ones.
Objective-Stranger99@reddit
Yes, 3.5 and 3.6 have surpassed 3-VL in most metrics.
FoxiPanda@reddit
At this point I have archived my Qwen3-VL series models out to slow NAS storage.
The Qwen3.5/3.6/Gemma-4 families seem to outperform Qwen3-VL for my use cases.
nikhilprasanth@reddit (OP)
Yeah, I'm thinking along the same lines.
Woof9000@reddit
I wouldn't, or at least I'm planning to remove those when I run out of space. I've not seen or heard of an instance, or a use case, where the old qwen3-vl would be better than the new ones.
nikhilprasanth@reddit (OP)
Yeah, same here. Unless there’s some specific edge case where old Qwen3-VL is better, I’ll probably delete it once I need the space.