Looking for Llama.cpp Alternative to Run Recent Vision Language Models on Apple Silicon

Posted by chibop1@reddit | LocalLLaMA | View on Reddit | 9 comments

I'm looking for a backend engine that can run recent VLMs (vision-language models) on Apple Silicon.

I'm a huge fan of llama.cpp, but they hardly pay attention to VLMs anymore since they dropped VLM support from their server on March 7, 2024.

Unfortunately, none of the recent VLMs such as Qwen2-VL, Phi-3.5-vision, Idefics3, InternVL2, Yi-VL, Chameleon, CogVLM2, GLM-4V, etc. are supported. MiniCPM-V 2.6 is the only recent model that was added.

Instead of just waiting and wishing, I think it's time to move on and look for an alternative. :(

Thank you for your help!