vLLM on V100 for Qwen - Newer models

Posted by SectionCrazy5107@reddit | LocalLLaMA | View on Reddit | 15 comments

I am struggling to run vLLM on my V100 GPU. I am trying to run newer models like Qwen 9B. I've tried the vLLM nightly build plus the latest transformers, but they still don't work together and I can't get it running. Any advice would be much appreciated.
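A common cause on V100 specifically: the card is compute capability 7.0, which has no native bfloat16 support, and many newer checkpoints (including recent Qwen releases) ship with bf16 as the default dtype. vLLM will then refuse to load or crash at startup. A sketch of a workaround is to force fp16 with `--dtype half` (the model name below is just an example stand-in for whichever Qwen model you're using, and the context length is an illustrative value to keep memory in check on 16/32 GB cards):

```shell
# V100 = compute capability 7.0: no native bfloat16.
# Newer Qwen checkpoints default to bf16, so force fp16 explicitly.
vllm serve Qwen/Qwen2.5-7B-Instruct \
    --dtype half \
    --max-model-len 8192
```

If it still fails, the exact startup error message (e.g. a dtype/compute-capability complaint vs. an unrecognized-architecture error) would narrow down whether it's a dtype issue or the vLLM build simply not knowing the model architecture yet.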