Why no talk about Medium (size) Language Models? 70-200B

Posted by pmttyji@reddit | LocalLLaMA | 11 comments

People here bring up the SLM topic from time to time (e.g., "Is SLM the future?"), but I've never seen anyone bring up Medium (size) Language Models.

The definitions of both SLM (Small Language Model) and MLM (Medium Language Model) shift over time. Right now some are already calling 20-35B models SLMs. By that definition, I guess 70-150B (max 200B) falls under Medium Language Models, 201-500B is Big, and 501B-1T+ is Large.

List of Medium (size) Language Models (popular & recent ones from HF):

Only Llama-3.2-90B sits in the 80-100B range.

Only Mixtral-8x22B sits in the 126-150B range.

Only Step-3.5-Flash sits in the 150-200B range. 150B is a good size: Q4 comes to about 75GB, which works for 64/72/80GB VRAM setups.
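The "Q4 comes in 75GB" figure above is just bits-per-weight arithmetic. A quick sketch (real GGUF files run a bit larger because of quantization scales and higher-precision embeddings, which this ignores):

```python
def quantized_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough on-disk size of a model quantized to the given bit width.

    params_b: parameter count in billions (e.g. 150 for a 150B model).
    bits_per_weight: effective bits per parameter (~4 for Q4, ~8 for Q8).
    Ignores metadata and mixed-precision tensors, so treat it as a floor.
    """
    return params_b * bits_per_weight / 8  # GB, since 1e9 params / 1e9 bytes-per-GB cancel


# 150B model at Q4 (~4 bits/weight) -> 75.0 GB
print(quantized_size_gb(150, 4))
# Same model at Q8 would need twice that -> 150.0 GB
print(quantized_size_gb(150, 8))
```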

Model creators could consider the above ranges for their upcoming medium-size models.

I think many would prefer to see more new Medium (size) Language Models (70-200B) than Large 1T models. For example, people with 96GB VRAM (4x 3090s or 3x 4090s) could run 200B models @ Q4 with offloading to system RAM, -ncmoe, etc.
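To see how that 96GB-VRAM scenario pencils out, here is a back-of-the-envelope sketch of how much of a 200B @ Q4 model (~100GB of weights) would spill to system RAM. The `overhead_gb` reserve for KV cache and activations is my rough guess, not a measured number:

```python
def cpu_offload_gb(model_size_gb: float, vram_gb: float, overhead_gb: float = 8.0) -> float:
    """Weights that must spill to system RAM after reserving some VRAM.

    overhead_gb is a placeholder for KV cache + activations + framework
    overhead; the real number depends on context length and runtime.
    """
    usable_vram = vram_gb - overhead_gb
    return max(0.0, model_size_gb - usable_vram)


# 200B @ Q4 is ~100 GB of weights; on a 96 GB rig with ~8 GB reserved,
# roughly 12 GB of weights (e.g. MoE expert layers via -ncmoe) go to system RAM.
print(cpu_offload_gb(100, 96))
# A 70B @ Q4 (~35 GB) fits entirely in VRAM -> 0.0 GB offloaded.
print(cpu_offload_gb(35, 96))
```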

(BTW I didn't forget models like MiniMax-M2.5, Qwen3-235B-A22B & Qwen3.5-397B .... Those fall under the Big category; maybe a separate thread is better for that. Or do MiniMax-M2.5 & Qwen3-235B-A22B belong in the above list, since they sit near the 200B range?)

(Previously I wished for more tiny/small models since my current laptop has only 8GB VRAM. But soon I'm getting a new rig with 72-96GB VRAM, so now I'm expecting more medium-size models.)

So what are your expectations from model creators for upcoming models?