Fine-Tuning TranslateGemma-4B to improve bi-directional English & Welsh translations on an H200 GPU!
Posted by ufos1111@reddit | LocalLLaMA | 3 comments
Open source repo: https://github.com/grctest/finetuned-gemmatranslate-cy
Running 5% of the fine-tuning took 40 minutes and cost a couple of dollars, which was enough to prove the process works.
Looking forward to Flash Attention v4 leaving beta so I can test fine-tuning performance on a cloud B200; that's probably still a few months away, it seems.
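For anyone curious what the SFT loop roughly looks like, here's a minimal sketch using Hugging Face TRL. The model ID, dataset name, column names, and prompt format are my assumptions for illustration only; the actual configuration is in the repo linked above.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed model ID; check the linked repo for the exact checkpoint used.
model_id = "google/translategemma-4b-it"

# Assumed: a parallel English-Welsh corpus with "en" and "cy" text columns.
dataset = load_dataset("your/en-cy-parallel-corpus", split="train")

def to_text(example):
    # Hypothetical prompt format for illustration; in practice the
    # tokenizer's own chat (Jinja) template should be applied instead.
    return {"text": f"Translate from English (en) to Welsh (cy):\n"
                    f"{example['en']}\n---\n{example['cy']}"}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model_id,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="translategemma-4b-en-cy",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        bf16=True,
    ),
)
trainer.train()
```

For bi-directional training you'd add a mirrored Welsh-to-English example per pair; a 5% subset of the mapped dataset is enough to smoke-test the pipeline before committing to a full run.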
What languages would you train TranslateGemma to translate? I was originally thinking about Klingon, but the available datasets seemed a bit lacking...
Orihara-Izaya@reddit
Ecclesiastical Latin, Koine Greek, Classical Greek, Aramaic...
ufos1111@reddit (OP)
Assyrian Neo-Aramaic (aii) is supported by TranslateGemma via SFT; however, it could certainly be improved.
For the others, I'm sure it'd work if you've got sufficient datasets. You'd likely need to inject new language codes into the Jinja template, though, so that translation inference allows targeting those languages; a rough sketch of that template edit is below.
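Here's a hedged sketch of what that injection could look like. The template structure I'm assuming (a literal list of language codes embedded in tokenizer.chat_template) is hypothetical, as is the "grc" code for Classical Greek, so inspect the real template first and adapt the string edit to match it.

```python
from transformers import AutoTokenizer

# Assumed model ID; substitute whichever TranslateGemma checkpoint you use.
tokenizer = AutoTokenizer.from_pretrained("google/translategemma-4b-it")

template = tokenizer.chat_template
print(template)  # inspect where the template lists or validates language codes

# Hypothetical edit: if the Jinja template embeds a literal list of codes,
# a plain string replacement can extend it with a new one, e.g. "grc".
if "'aii'" in template:
    tokenizer.chat_template = template.replace("'aii'", "'aii', 'grc'")
    tokenizer.save_pretrained("translategemma-4b-extended")
```

The saved tokenizer directory can then be loaded alongside the fine-tuned weights so inference accepts the new target-language code.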
llm_practitioner@reddit
Urdu would be an interesting one to try. Finding high-quality datasets for regional languages is usually the biggest bottleneck.