Training on 8x V100 32GB with NVLink or 2x RTX Pro 6000?

Posted by ClimateBoss@reddit | LocalLLaMA | View on Reddit | 1 comments

Does anyone have experience fine-tuning models (QLoRA, LoRA, and full training) on 8x V100 32GB?

* Is **Volta** still a viable option? PyTorch support looks deprecated.
* What models fit?
* What is the training speed like?
* Thoughts on 8x V100 32GB compared to 2x RTX Pro 6000 96GB?

1 Comments

segmond@reddit

No contest: 2x RTX Pro 6000. The only reason to ever pick 8x V100 is that you got it for free, and even then, depending on your objective, it might still not be worth it. Blackwell supports native 4-bit (NVFP4) training. That's all I've got to say on this; read up on it if you don't already know.
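
On the "what models fit" question, here is a rough back-of-envelope sketch, not a benchmark. It assumes the usual rule-of-thumb bytes per parameter (about 16 for full mixed-precision training with Adam, about 2 for LoRA on a 16-bit base, about 0.5 for QLoRA on a 4-bit base), perfect sharding of that state across all GPUs, and a hypothetical `headroom` factor of 0.7 to leave room for activations and framework overhead. The function names and the headroom value are illustrative assumptions, not measurements from either rig.

```python
# Rule-of-thumb bytes per parameter. These are assumptions; activations,
# LoRA adapter weights, and quantization constants are NOT included.
BYTES_PER_PARAM = {
    "full_mixed_adam": 16.0,  # fp16 weights (2) + fp16 grads (2) + fp32 master weights, Adam m and v (12)
    "lora_fp16": 2.0,         # frozen 16-bit base weights only
    "qlora_4bit": 0.5,        # frozen 4-bit base weights only
}

# Total VRAM per rig in GB, taken from the thread title.
RIGS = {
    "8x V100 32GB": 8 * 32,
    "2x RTX Pro 6000 96GB": 2 * 96,
}


def state_gb(params_billion: float, mode: str) -> float:
    """Approximate GB for weights/grads/optimizer state (1e9 params * bytes / 1e9)."""
    return params_billion * BYTES_PER_PARAM[mode]


def fits(params_billion: float, mode: str, total_gb: float, headroom: float = 0.7) -> bool:
    """True if the state fits in `headroom` of total VRAM (hypothetical safety factor)."""
    return state_gb(params_billion, mode) <= total_gb * headroom


if __name__ == "__main__":
    for rig, total in RIGS.items():
        for size in (7, 13, 70):
            for mode in BYTES_PER_PARAM:
                verdict = "fits" if fits(size, mode, total) else "too big"
                print(f"{rig:22s} {size:>3}B {mode:16s} ~{state_gb(size, mode):6.1f} GB  {verdict}")
```

Under these assumptions the two rigs land in the same ballpark on raw capacity (256 GB vs 192 GB total), so the choice comes down more to speed and software support than to what fits. Real usage depends heavily on sequence length, batch size, and how well the sharding strategy works across 8 cards vs 2.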
