Fine-tuning LLMs on DGX Spark: numbers from the NVIDIA webpage
Posted by siegevjorn@reddit | LocalLLaMA | View on Reddit | 2 comments
https://blogs.nvidia.com/blog/rtx-ai-garage-fine-tuning-unsloth-dgx-spark/
Hi, I'd like to discuss the numbers pertaining to DGX Spark performance from "How to Fine-Tune an LLM on Nvidia GPUs With Unsloth".
### Llama 3.3 70B
- Method: QLoRA
- Backend: PyTorch
- Config:
  - Sequence length: 2,048
  - Batch size: 8
  - Epoch: 1
  - Steps: 125
  - FP4
- Peak tokens/sec: 5,079.04
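For scale, here is a rough sketch of how many tokens the listed config would actually cover. It assumes each step consumes exactly batch size × sequence length tokens (no gradient accumulation, no padding), which the article does not confirm, so treat the result as an upper-bound guess rather than a reported figure.

```python
# Numbers copied from the config list above.
SEQ_LEN = 2048
BATCH_SIZE = 8
STEPS = 125
PEAK_TOKENS_PER_SEC = 5079.04

# Assumption (not stated in the article): every step processes
# BATCH_SIZE full-length sequences, with no gradient accumulation.
tokens_per_step = SEQ_LEN * BATCH_SIZE
benchmark_tokens = tokens_per_step * STEPS

# Time the whole run would take if the peak rate were sustained throughout.
benchmark_seconds = benchmark_tokens / PEAK_TOKENS_PER_SEC

print(f"tokens per step:         {tokens_per_step:,}")
print(f"tokens in benchmark run: {benchmark_tokens:,}")
print(f"time at peak rate:       {benchmark_seconds / 60:.1f} min")
```

Under those assumptions the benchmark run is about 2M tokens, so a 100M-token job would be roughly 49 times longer than what was measured.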
If you assume training on 100M tokens, then 100M / 5,079 / 3,600 ≈ 5.47 hours.
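The same back-of-the-envelope estimate as a small script, in case anyone wants to plug in other numbers. The function name `training_hours` and the sustained-throughput fractions are hypothetical, chosen only to show how the estimate moves if the peak rate is not held for the whole run; they are not from the article.

```python
def training_hours(total_tokens: float, tokens_per_sec: float) -> float:
    """Wall-clock hours to process total_tokens at a constant throughput."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return total_tokens / tokens_per_sec / 3600


PEAK_RATE = 5079.04          # peak tokens/sec quoted in the article
TOTAL_TOKENS = 100_000_000   # the 100M-token assumption from this post

# Hypothetical fractions of peak throughput actually sustained.
for fraction in (1.0, 0.75, 0.5):
    hours = training_hours(TOTAL_TOKENS, PEAK_RATE * fraction)
    print(f"{fraction:>4.0%} of peak: {hours:.2f} h")
```

Since the quoted figure is a peak, the sustained rate over a long run could be lower, and the estimate scales inversely with it.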
It doesn't seem too bad, for what it's worth, to have a mini machine that could fine-tune Llama 3.3 70B with QLoRA. Is there a catch? Is this a realistic number?
2 Comments
Extension-Bass-2338@reddit
Tyme4Trouble@reddit