Russian LLMs

Posted by RhubarbSimilar1683@reddit | LocalLLaMA

Here's one example: [https://huggingface.co/ai-sage/GigaChat-20B-A3B-instruct](https://huggingface.co/ai-sage/GigaChat-20B-A3B-instruct). It has a MoE architecture, and I'm guessing from the parameter count that it's based on the Qwen3 architecture. They released a paper, so I don't think it's a fine-tune: [https://huggingface.co/papers/2506.09440](https://huggingface.co/papers/2506.09440)
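For anyone wondering what the parameter-count guess is based on: the `20B-A3B` suffix is commonly read as roughly 20B total parameters with about 3B active per token. Here's a rough sketch that pulls those two numbers out of a model name. The helper `parse_moe_size` is hypothetical (not part of any library), and `Qwen/Qwen3-30B-A3B` is only included as a comparison point. Matching names only hint at a similar MoE sizing; they don't prove a shared architecture, so the model's `config.json` is still the thing to check.

```python
import re


def parse_moe_size(model_name: str):
    """Hypothetical helper: return (total_B, active_B) from a name like '20B-A3B', else None."""
    match = re.search(
        r"(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B", model_name, flags=re.IGNORECASE
    )
    if match is None:
        return None
    # group(1) is the total size in billions, group(2) the active size in billions
    return float(match.group(1)), float(match.group(2))


if __name__ == "__main__":
    for name in ("ai-sage/GigaChat-20B-A3B-instruct", "Qwen/Qwen3-30B-A3B"):
        print(name, parse_moe_size(name))
```

Both names report about 3B active parameters, but the totals differ (20B vs 30B), so the naming alone doesn't settle whether the architecture is the same.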