What are your favorite LLMs for translation/document work?
Posted by AdventurousFly4909@reddit | LocalLLaMA | 5 comments
I am currently working on a system to translate books/web novels. I have a working prototype, but now I am looking into optimizing it. I actually quite liked working on it, because you are always trying to keep the model busy and never waiting for something to finish. It's a pretty fun programming challenge for learning async and concurrency.
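(Not from the thread, but the "keep it busy" idea above can be sketched with asyncio. `translate_chunk` is a hypothetical stand-in for the real model call, e.g. an HTTP request to a vLLM server; the semaphore caps how many requests are in flight at once.)

```python
import asyncio

async def translate_chunk(chunk: str) -> str:
    # Stand-in for a real model call; the sleep simulates inference latency.
    await asyncio.sleep(0.01)
    return chunk.upper()

async def translate_book(chunks, max_concurrent=8):
    # Cap in-flight requests so the server's queue stays full but bounded.
    sem = asyncio.Semaphore(max_concurrent)

    async def worker(chunk):
        async with sem:
            return await translate_chunk(chunk)

    # gather preserves input order even though requests finish out of order.
    return await asyncio.gather(*(worker(c) for c in chunks))

chapters = [f"chapter {i}" for i in range(20)]
results = asyncio.run(translate_book(chapters))
```

With a real backend you would swap the body of `translate_chunk` for an API call and tune `max_concurrent` to your VRAM.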
So I am wondering what your favorite models are for translation, summarization, etc. I am currently running gemma 26B 4bit on vllm and it's okay, though I haven't tried 3.6 27B or 3.6 35B, so I don't have much to compare against.
Are there any models fine tuned for this, maybe those role playing ones? I don't really know, so I want to hear your thoughts.
BitGreen1270@reddit
Not OP but is there an IDE that I can use to edit or create documents using an llm? Like highlight a paragraph and ask it to redraft? Similar to Google docs (paid tier).
ttkciar@reddit
You are already using the one I came here to recommend, Gemma-4-26B-A4B-it.
I had been using Phi-4 for quick translation, but have switched to Gemma4-26B, as it is both faster and more accurate.
IMO you would be best served by batching several chapters, one chapter per request, if you have the VRAM. Batched inference takes longer per request, but your overall tokens/second increases.
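(A toy illustration of that tradeoff, assuming the server overlaps concurrent requests the way vLLM's continuous batching does. `infer` is a hypothetical stand-in that uses `asyncio.sleep` for per-request latency.)

```python
import asyncio
import time

LATENCY = 0.05  # pretend each inference takes 50 ms

async def infer(chapter: str) -> str:
    await asyncio.sleep(LATENCY)  # stand-in for one model call
    return f"translated {chapter}"

async def sequential(chapters):
    # One request at a time: wall time grows linearly with chapter count.
    return [await infer(c) for c in chapters]

async def batched(chapters):
    # All chapters in flight at once: with overlapping requests,
    # wall time stays near a single request's latency.
    return await asyncio.gather(*(infer(c) for c in chapters))

chapters = [f"ch{i}" for i in range(10)]

t0 = time.perf_counter()
asyncio.run(sequential(chapters))
t_seq = time.perf_counter() - t0

t0 = time.perf_counter()
out = asyncio.run(batched(chapters))
t_par = time.perf_counter() - t0
```

Each individual request may finish later under load, but the aggregate tokens/second is much higher, which is what matters for translating a whole book.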
UnWiseSageVibe@reddit
I have been translating some Japanese stuff to English for personal use, and Gemma-4-26B-A4B-it is simply amazing; I have no issues with the translations.
AdventurousFly4909@reddit (OP)
I am doing that. Currently getting somewhere between 500 and 1000 tokens/s on my 3090, depending on how many concurrent requests there are. Sometimes there are 30 requests running in parallel in vllm, and other times 10 running and 8 waiting. I have no idea why that is; it might be not enough VRAM for the context, or something else, idk.
I thought maybe 3.6 27B would be better, since with a MoE there might be experts holding information that never gets activated. But I heard that Gemma models are better at translation and other document tasks.
markole@reddit
For low resource languages, Gemma 4 is great. Mistral ones were also good but the largest Gemma 4 is the way to go. Too bad they decided to not release the 122B variant.