Upcoming vllm Mistral Large 3 support
Posted by brown2green@reddit | LocalLLaMA | 9 comments
ilintar@reddit
Interesting, so new Mistral is DeepSeek architecture.
Iory1998@reddit
Is it MOE? Man, I still got Mixtral.
Long_comment_san@reddit
Mistral are amazing models. I hope they can train this one as well as they did previous models.
random-tomato@reddit
Incredible, a new Mistral Large AND it's MoE!?!?!?
-p-e-w-@reddit
Every large model is going to be MoE from now on. That competition has been settled pretty thoroughly.
brown2green@reddit (OP)
Add Mistral Large 3 #29757. It looks like it's based on the DeepSeek V2 architecture.
MitsotakiShogun@reddit
`EagleMistralLarge3Model(DeepseekV2Model)` and `config_dict["model_type"] = "deepseek_v3"`?

jacek2023@reddit
So there's a new 8B and a new Large, but I want something between 32B and 120B. Let's hope that's next.
TheLocalDrummer@reddit
There’s also a 3B, I think.
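The lines MitsotakiShogun quotes point at a common pattern: a "new" model reuses an existing architecture implementation by subclassing it, while a config hook rewrites `model_type` so the rest of the stack dispatches to the known code path. A minimal sketch of that pattern (all class and function names below are illustrative, not vLLM's actual API):

```python
# Sketch of architecture aliasing, assuming the pattern implied by the
# quoted PR lines. None of these classes are vLLM's real implementations.

class DeepseekV2Model:
    """Stand-in for an existing, already-supported architecture."""

    def __init__(self, config: dict):
        self.config = config

    def describe(self) -> str:
        return f"DeepseekV2-style model ({self.config['model_type']})"


class MistralLarge3Model(DeepseekV2Model):
    """The new model reuses the DeepSeek V2 implementation unchanged,
    mirroring EagleMistralLarge3Model(DeepseekV2Model)."""
    pass


def load_config(config_dict: dict) -> dict:
    # Remap the declared model_type so downstream code treats the model
    # as a known architecture, mirroring the quoted
    # config_dict["model_type"] = "deepseek_v3" line.
    config_dict = dict(config_dict)  # avoid mutating the caller's dict
    config_dict["model_type"] = "deepseek_v3"
    return config_dict


cfg = load_config({"model_type": "mistral_large_3", "hidden_size": 4096})
model = MistralLarge3Model(cfg)
print(model.describe())  # runs the DeepSeek V2 code path
```

The point of the remap is that nothing downstream needs to know about the new name: dispatch tables, weight loaders, and kernels keyed on `"deepseek_v3"` all keep working.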