My old boss made a terrible decision based on AI benchmarks and turns out I was doing the same thing

Posted by Wise_Slice6303@reddit | ExperiencedDevs

My old boss fired his entire frontend team last month because he saw some demos and thought one backend dev could cover everything. Well, 3 weeks later I'm cleaning up the mess: site broken on mobile, zero accessibility, and nobody left who knows how anything works.

Watching him make that call based on numbers he didn't understand stuck with me. Turns out I was doing the same thing when I picked my own coding model. I've been on GLM since 4.7; I switched because it was cheaper and worked fine. When GLM-5.1 came out it felt like a real upgrade, so I stuck with it.

GPT-5.5 came out the other day, so I checked SWE-Bench Pro and it's 58.6 vs 58.4 for GLM-5.1, basically the same score. Both numbers are published by the companies themselves, and the pricing gap between them keeps shrinking too.
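
To put a rough number on "basically the same", here's a quick back-of-the-envelope sketch. It treats each benchmark task as an independent pass/fail trial and asks how big a 0.2 point gap is compared to plain sampling noise. Big caveats: `N_TASKS = 1000` is a placeholder I made up, not the real size of the benchmark, `score_gap_z` is just a helper name for this post, and the independence assumption is a simplification. Treat it as a sanity check, not a proper analysis.

```python
import math

def score_gap_z(score_a, score_b, n_tasks):
    """Rough z-score for the gap between two pass rates given in percent."""
    p_a, p_b = score_a / 100, score_b / 100
    # Standard error of each pass rate, assuming independent pass/fail tasks
    se_a = math.sqrt(p_a * (1 - p_a) / n_tasks)
    se_b = math.sqrt(p_b * (1 - p_b) / n_tasks)
    # Standard error of the difference between the two rates
    se_gap = math.sqrt(se_a ** 2 + se_b ** 2)
    return (p_a - p_b) / se_gap

N_TASKS = 1000  # assumption: placeholder task count, not the real benchmark size
z = score_gap_z(58.6, 58.4, N_TASKS)
print(f"z = {z:.2f}")
```

Under those assumptions the gap comes out at a small fraction of one standard error, so the two scores can't really be told apart on this number alone.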

At this point idk if I'm on GLM-5.1 because it's better or just because it's what I know. Same trap my old boss fell into, just from the other side: he switched based on numbers he didn't understand, and I'm staying put without checking any. Running my own tests this week, because company benchmarks mean about as much as self-reported experience on a resume.