netikas
-
New open weights models: GigaChat-3.1-Ultra-702B and GigaChat-3.1-Lightning-10B-A1.8B
Posted by netikas@reddit | LocalLLaMA | View on Reddit | 176 comments
-
Actual comparison between locally run Qwen-3.6-27B and proprietary models
Posted by netikas@reddit | LocalLLaMA | View on Reddit | 72 comments
-
Entropy-Adaptive Finetuning
Posted by netikas@reddit | LocalLLaMA | View on Reddit | 4 comments
-
Which open-source models are best for (kinda) RP in French and German?
Posted by netikas@reddit | LocalLLaMA | View on Reddit | 6 comments
-
What’s up with vLLM?
Posted by netikas@reddit | LocalLLaMA | View on Reddit | 8 comments
-
Why bother with RWKV/Mamba instead of decoder transformers?
Posted by netikas@reddit | LocalLLaMA | View on Reddit | 16 comments
-
Qwen-2-72B-Instruct in HF Chat is kinda broken?
Posted by netikas@reddit | LocalLLaMA | View on Reddit | 6 comments
-
Aya-23-8b for multilingual RAG
Posted by netikas@reddit | LocalLLaMA | View on Reddit | 11 comments