What happened to 1.58bit LLMs?

Posted by Sloppyjoeman@reddit | LocalLLaMA

Last year I remember them being super hyped and largely theoretical. Since then, I understand there’s a growing body of evidence that larger sparse models outperform smaller denser models, which 1.58bit quantisation seems poised to drastically improve. I haven’t seen people going “oh, the 1.58bit quantisation was overhyped” - did I just miss it?