What is the best inference model you have tried at 64 GB and 128 GB of VRAM?

Posted by seoulsrvr@reddit | LocalLLaMA | 5 comments

I'm using the model to ingest and understand large amounts of technical data, and I want it to make well-reasoned decisions quickly.
I've been testing with 32 GB of VRAM up to this point, but I'm migrating to new servers and want to upgrade the model.
Eager to hear impressions from the community.
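For context, here's a minimal sketch of the kind of setup I mean, using llama-cpp-python with full GPU offload (the model file, context size, and prompt are placeholders, not a specific recommendation):

```python
from llama_cpp import Llama

# Load a quantized GGUF model and offload all layers to the GPU(s).
# The path and settings below are hypothetical examples.
llm = Llama(
    model_path="models/some-70b-model-Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # -1 = offload every layer to VRAM
    n_ctx=16384,      # large context for ingesting technical documents
)

out = llm("Summarize the key constraints in this spec: ...", max_tokens=512)
print(out["choices"][0]["text"])
```

The real question is which model and quant make the best use of 64 GB or 128 GB while keeping reasoning quality and throughput high.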