netikas

New open weights models: GigaChat-3.1-Ultra-702B and GigaChat-3.1-Lightning-10B-A1.8B
Posted by netikas@reddit | LocalLLaMA | 176 comments

Actual comparison between locally ran Qwen-3.6-27B and proprietary models
Posted by netikas@reddit | LocalLLaMA | 72 comments

Entropy-Adaptive Finetuning
Posted by netikas@reddit | LocalLLaMA | 4 comments

Which opensource models are best for (kinda) RP in French and German?
Posted by netikas@reddit | LocalLLaMA | 6 comments

What’s up with vllm?
Posted by netikas@reddit | LocalLLaMA | 8 comments

Why bother with RWKV/Mamba instead of decoder transformers?
Posted by netikas@reddit | LocalLLaMA | 16 comments

Qwen-2-72B-Instruct in HF Chat is kinda broken?
Posted by netikas@reddit | LocalLLaMA | 6 comments

Aya-23-8b for multilingual RAG
Posted by netikas@reddit | LocalLLaMA | 11 comments