DeepSeek-r1-0528 in top 5 on new SciArena benchmark, the ONLY open-source model

Posted by entsnack@reddit | LocalLLaMA | 42 comments

Post: https://allenai.org/blog/sciarena

Allen AI puts out good work and contributes heavily to open source. I am a big fan of Nathan Lambert.

They just released this scientific literature research benchmark, and DeepSeek-r1-0528 is the only open-source model in the top 5, sharing the pie with the likes of OpenAI's o3, Claude 4 Opus, and Gemini 2.5 Pro.

I like to trash DeepSeek here, but not anymore. This level of performance is just insane.