What's the most optimized engine to run on an H100?

Posted by Obamos75@reddit | LocalLLaMA | View on Reddit | 9 comments

Hey guys,

I was wondering what the best/fastest engine is to run LLMs on a single H100? I'm guessing vLLM is great but not the fastest. Thanks in advance.

I'm running a Llama 3.1 8B model.
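For context, a minimal vLLM launch for this setup might look something like the command below (the Hugging Face model name and flag values are assumptions; tune `--gpu-memory-utilization` and `--max-model-len` for your workload):

```shell
# Sketch: serving Llama 3.1 8B on a single H100 with vLLM's OpenAI-compatible server.
# Assumes vLLM is installed and you have access to the gated Llama weights on HF.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --dtype bfloat16 \              # H100 has native bf16 support
  --gpu-memory-utilization 0.90 \ # fraction of the 80 GB reserved for weights + KV cache
  --max-model-len 8192            # cap context length to leave room for more KV cache
```

An 8B model in bf16 takes roughly 16 GB of weights, so most of the H100's memory goes to KV cache for batching.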