Re: Whatever happened to Cohere’s Command-A series of models?
Posted by nick_frosst@reddit | LocalLLaMA | 27 comments
Hey everyone, Nick Frosst here from Cohere. A few months ago Aidan (my cofounder) left a comment in here about our Command series and how we were working on some more powerful, open-weights models behind the scenes. We just launched Command A+ and we wanted to share it with you guys.
TLDR is we built a really efficient model. It’s our first MoE model, which is exciting. There’s obvs work to do on top-line performance but it’s easily looking like one of the fastest and most responsive models in our category. We also pulled off some incredible quantization work so it runs really well on even 1 or 2 GPUs.
Like with R7B, we really prioritized making the model practical, so smaller teams and devs could realistically use it to build the kind of agents we ship for our platform customers. That’s also why it’s under Apache 2.0. Just total, near unfettered access to a pretty awesome model.
We’re enterprise-first but honestly, we get so much out of our open-source community that makes us more innovative and creative. The feedback you give will almost certainly influence how we think about models and product going forward... as it already has here from getting called out the last time haha.
So, don’t hold back. Share your thoughts, your projects, whatever. Full details are here: https://cohere.com/blog/command-a-plus. We appreciate you :)
-Ellary-@reddit
Original Command R+ was pretty legendary for the time.
Especially for creative work and resource planning, for enterprise.
Southern_Sun_2106@reddit
I remember that model. Fond memories.
de4dee@reddit
welcome back. you had one of the best models back then. https://huggingface.co/CohereLabs/c4ai-command-r-plus
hope you continue awesome work!
rpkarma@reddit
> all while running on as little as two H100 GPUs
I know this is objectively true, but it makes me giggle lol
Though it does mean it could fit on two Sparks?
1ncehost@reddit
It's a 200B model, so it can fit on one Spark at Q3 or Q4.
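Rough arithmetic behind that, as a sketch only. The 200B figure is the one from this comment. The bits-per-weight values for each quant level and the 128 GB of unified memory per Spark are assumptions for illustration, and KV cache, activations, and OS overhead are ignored unless you pass them in as `overhead_gb`.

```python
# Back-of-the-envelope estimate of weight memory for a quantized model.
# All constants below are illustrative assumptions, not official specs.

PARAMS_BILLION = 200      # parameter count quoted in the comment above
SPARK_MEMORY_GB = 128     # assumed unified memory of one Spark

# Assumed effective bits per weight (quant formats carry some scale overhead).
QUANT_BITS = {"Q3": 3.5, "Q4": 4.5, "Q8": 8.5, "FP16": 16.0}


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         memory_gb: float, overhead_gb: float = 0.0) -> bool:
    """True if the weights plus a caller-supplied overhead fit in memory_gb."""
    return weight_gb(params_billion, bits_per_weight) + overhead_gb <= memory_gb


if __name__ == "__main__":
    for name, bits in QUANT_BITS.items():
        size = weight_gb(PARAMS_BILLION, bits)
        ok = fits(PARAMS_BILLION, bits, SPARK_MEMORY_GB)
        print(f"{name}: ~{size:.1f} GB of weights, fits on one Spark: {ok}")
```

Under these assumptions Q3 and Q4 squeeze in on weights alone, but Q4 leaves little headroom for context, so treat it as a tight fit rather than a comfortable one.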
DunderSunder@reddit
300 tokens/s how?!
1ncehost@reddit
Cool of you to stop by Nick. I like this type of outreach and congrats on the new model release.
The lack of standard benchmarks and any comparison to the current SOTA in this size class (imo minimax m2.7 and mimo v2.5) makes it seem like your new model isn't competitive in quality. I doubt you'll get much popularity for that reason. Anything you can say about that?
nick_frosst@reddit (OP)
You can see all the benchmarks on artificial analysis :) it’s got a 37 intelligence score which I think is a little lower than my experience using it would have had me guess
grumd@reddit
I have a feeling that some models are "benchmaxxing" as they say, training for higher benchmark numbers, but then they fall apart in tasks out of the box.
MerePotato@reddit
Qwen's already been proven to be doing this by accident, makes me wonder what other labs are falling prey to it
1ncehost@reddit
I posted the pic above with the results he mentioned, but I just wanted to say that I agree that what you're saying seems true with those two models in my testing of them. Could certainly be the case that Cohere's model is more competitive than it appears.
LagOps91@reddit
yeah that score isn't the most reliable. in addition minimax did a lot of post training on their model, so there is potential to improve the score of a new model.
1ncehost@reddit
Awesome, I'll check it out 🙏
LoveMind_AI@reddit
Nick, longtime fan of Command A and R7B for creative tasks, and nearly gave up waiting for a major new model or one with permissive license so this is a pleasant surprise. Benchmarks aren’t everything, so I’ll give this one a strong shot. Really nice of you to post here, and glad to hear Cohere is aware of non-enterprise users.
synn89@reddit
Really glad to see Cohere releasing models again. We need companies like Cohere and Mistral out there plugging away.
noctrex@reddit
Congratulations on the release. It's always nice to see new models from players other than the big labs.
Are you planning to also release smaller models this year that can be run on consumer cards? Like you did with the older command-r7b?
nick_frosst@reddit (OP)
Stay tuned 👀
noctrex@reddit
👀 👀
pineapplekiwipen@reddit
saw your interview on prof g markets! i wonder if you're considering an ipo at all at this stage.
this model makes me wish i opted for the 256gb mac studio
killerstreak976@reddit
I'm really happy that MoE models are getting more and more attention as of late. This looks really cool (though kinda un-runnable for me personally). Models like 26ba4b, 30ba3b, etc are so cool because they can be run on an older laptop with no dedicated GPU, which I think is a big deal since ideally expensive hardware shouldn't be a barrier to knowledge and privacy. I'm pretty sure that scales up, and even if I can't personally run it, I'm excited to check it out through other people's observations on here!
Swolnerman@reddit
This is awesome! Can I ask what the specs were of the machine you ran the above demo on?
nick_frosst@reddit (OP)
The demo above was from our API!
Swolnerman@reddit
Looking forward to trying it out!
Kiansjet@reddit
Forgive me for being skeptical about the technical ability of an individual who seems to be using a screen recording of a Google meet video call as their facecam
Besides that good on you for doing an open weight model
OnkelBB@reddit
Thanks folks! Happy to see more open weight models.
Leflakk@reddit
Pretty cool this surprise, happy to see cohere back in the game, diversity is important
LienniTa@reddit
gguf wen >_<