robotphilanthropist

Best Local LLMs - 2025

Posted by rm-rf-rm@reddit | LocalLLaMA | View on Reddit | 219 comments

2025 Open Models Year in Review

Posted by robotphilanthropist@reddit | LocalLLaMA | View on Reddit | 27 comments

robotphilanthropist@reddit (OP)

It's definitely an underrated model. We tend to weigh top-end performance and how many releases a lab does, too. But yeah, OpenAI was top of our specialists, almost in the noteworthy category, and it's maybe just personal bias that kept it out of the honorable mentions. It's a super popular model.

Olmo 3.1 32B Think & Instruct: New Additions to the Olmo Model Family

Posted by Dear-Success-1441@reddit | LocalLLaMA | View on Reddit | 30 comments

robotphilanthropist@reddit

I personally spent hours on regexes to do this. They remove most of the samples, but across billions of tokens in pretraining and post-training it's very hard to catch everything. The real fix is to generate data about the model's own identity rather than patching the long tail with more regexes.
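
For anyone curious what this kind of filtering looks like, here is a minimal sketch. It is not the actual Olmo pipeline: the patterns and the names `OtherAssistant` and `OtherLab` are hypothetical placeholders, and a real filter would need far more patterns. The third sample shows the long-tail problem: a rephrased identity claim that none of the regexes catch.

```python
import re

# Hypothetical patterns for illustration only; "OtherAssistant" and "OtherLab"
# are placeholder names, not the strings used in any real data pipeline.
IDENTITY_PATTERNS = [
    re.compile(r"\bI(?: am|'m) OtherAssistant\b", re.IGNORECASE),
    re.compile(r"\b(?:developed|trained|created) by OtherLab\b", re.IGNORECASE),
]


def mentions_other_identity(text):
    """Return True if any identity pattern matches the sample."""
    return any(pattern.search(text) for pattern in IDENTITY_PATTERNS)


def filter_samples(samples):
    """Drop samples that match a pattern; return (kept, number_dropped)."""
    kept = [s for s in samples if not mentions_other_identity(s)]
    return kept, len(samples) - len(kept)


samples = [
    "I am OtherAssistant, a model trained by OtherLab.",  # caught
    "The capital of France is Paris.",                    # clean
    "I was built by the folks at OtherLab.",              # slips through
]
kept, dropped = filter_samples(samples)
print(f"kept {len(kept)} of {len(samples)}, dropped {dropped}")
```

Every new phrasing needs another pattern, which is why training on explicit identity data scales better than extending the regex list.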

Olmo 3.1 32B Think & Instruct: New Additions to the Olmo Model Family

Posted by Dear-Success-1441@reddit | LocalLLaMA | View on Reddit | 30 comments

robotphilanthropist@reddit

Working on it for the new version. We changed how we handled system prompts in training and didn't have an in-loop eval for this. It's high on my list to fix in the new year :)

Ai2 just announced Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use

Posted by Nunki08@reddit | LocalLLaMA | View on Reddit | 190 comments

Don’t sleep on The Allen Institute for AI (AI2)

Posted by dontbanana@reddit | LocalLLaMA | View on Reddit | 46 comments

robotphilanthropist@reddit

For one, RL finetuning like this has been known in industry for years, just not really talked about. We were ahead of the curve on bringing it back into conversation, but I wouldn't say DeepSeek "copied" RLVR.

Don’t sleep on The Allen Institute for AI (AI2)

Posted by dontbanana@reddit | LocalLLaMA | View on Reddit | 46 comments

robotphilanthropist@reddit

We obviously know that our Tülu 3 recipe is not a reasoning model, but early experiments worked very well with the same formulation that reasoning models use. We're going to release full reasoning models in the future; good things take time. Both Instruct models and reasoning models use this type of RL.

Open models wishlist

Posted by hackerllama@reddit | LocalLLaMA | View on Reddit | 238 comments

OLMo 2 Models Released!

Posted by Many_SuchCases@reddit | LocalLLaMA | View on Reddit | 113 comments

Tülu 3 -- a set of state-of-the-art instruct models with fully open data, eval code, and training algorithms

Posted by Xhehab_@reddit | LocalLLaMA | View on Reddit | 42 comments

robotphilanthropist@reddit

Yeah lemme work on this, will add it to the paper. DPO is pretty quick because it uses fewer tokens: 12 hours or less on 2 nodes at 8B, ~24 hours on 4 nodes at 70B. RL can take a long time, depending on how long you want to run it.
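
To put those DPO numbers in comparable units, here is a rough back-of-the-envelope sketch. The node counts and wall-clock hours come from the comment above; `GPUS_PER_NODE = 8` is an assumption, since the comment does not say how many GPUs each node has.

```python
def node_hours(nodes, wall_clock_hours):
    """Total compute in node-hours for a run."""
    return nodes * wall_clock_hours


# Figures quoted in the comment; "12 hours or less" is treated as an upper bound.
dpo_runs = {
    "8B": {"nodes": 2, "hours": 12},
    "70B": {"nodes": 4, "hours": 24},
}

GPUS_PER_NODE = 8  # assumption, not stated in the comment

for size, run in dpo_runs.items():
    nh = node_hours(run["nodes"], run["hours"])
    print(f"{size}: about {nh} node-hours "
          f"({nh * GPUS_PER_NODE} GPU-hours if each node has {GPUS_PER_NODE} GPUs)")
```

Under that assumption, the 70B run costs about four times the node-hours of the 8B run.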

What happened to Llama 3.2 90b-vision?

Posted by TitoxDboss@reddit | LocalLLaMA | View on Reddit | 42 comments

OLMoE - a fully open source sparse MoE with only 1 billion active parameters

Posted by Aaaaaaaaaeeeee@reddit | LocalLLaMA | View on Reddit | 36 comments

robotphilanthropist@reddit

Some general comments on what you can expect from post-training behavior:

1. Most of the data is single turn instruction following. We want to make a v2 that is better at multi-turn.
2. A moderate focus on code/reasoning but we can still do more.
3. Not that much on system prompts / roleplay, so curious what people find.
4. Working on verifiable instruction following (IFEval). Isn't as good as Llama 3.1 type models, but much better than previous OLMos.
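
As an illustration of what "verifiable" means in point 4, here is a small sketch of programmatic instruction checks in the spirit of IFEval. It is not the IFEval code itself: the function names and the scoring helper are hypothetical, and the point is only that each instruction can be checked by a deterministic function instead of a judge model.

```python
from functools import partial


def check_min_words(response, n):
    """Instruction: 'answer with at least n words'."""
    return len(response.split()) >= n


def check_no_commas(response):
    """Instruction: 'do not use any commas'."""
    return "," not in response


def check_all_lowercase(response):
    """Instruction: 'your entire response must be lowercase'."""
    return response == response.lower()


def fraction_followed(response, checks):
    """Fraction of instructions the response satisfies."""
    results = [check(response) for check in checks]
    return sum(results) / len(results)


checks = [partial(check_min_words, n=5), check_no_commas, check_all_lowercase]
good = "open models are fun to train and evaluate"
bad = "Hello, World"
print(fraction_followed(good, checks), fraction_followed(bad, checks))
```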

OLMoE - a fully open source sparse MoE with only 1 billion active parameters

Posted by Aaaaaaaaaeeeee@reddit | LocalLLaMA | View on Reddit | 36 comments

robotphilanthropist@reddit

We found that OLMoE was only about 20-40% faster to fine-tune than OLMo 7B (a dense model). I suspect some of that came from rough initial fine-tuning implementations in the HF ecosystem. I didn't look closely at utilization / batch size.

Llama3.1 models are "fake distillations" - this should be publicly addressed

Posted by kindacognizant@reddit | LocalLLaMA | View on Reddit | 88 comments

robotphilanthropist@reddit

The problem is that colloquially distillation covers two things:

1. the technical teacher-student distillation
2. any learning from a more powerful model via synthetic data

Both are popular today. The first is the strict technical definition; the second is what Zuck meant.
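
A toy sketch of the difference between the two meanings, in plain Python with made-up logits. All names here are hypothetical and this is not how any particular lab implements either one. In teacher-student distillation the student is trained to match the teacher's full output distribution; in the synthetic data sense the student only ever sees text the teacher generated.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Meaning 1: KL(teacher || student) over one token's vocabulary."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def build_synthetic_sft_data(prompts, teacher_generate):
    """Meaning 2: collect teacher completions as ordinary fine-tuning data.

    The student never sees teacher logits, only the sampled text.
    """
    return [{"prompt": p, "completion": teacher_generate(p)} for p in prompts]


teacher = [2.0, 1.0, 0.1]
student = [0.5, 0.5, 0.5]
print(distillation_loss(teacher, student))  # positive: distributions differ
print(distillation_loss(teacher, teacher))  # zero: student matches teacher

data = build_synthetic_sft_data(["Say hi"], lambda prompt: prompt.upper())
print(data)
```

Meaning 1 needs access to the teacher's logits and a shared vocabulary; meaning 2 only needs the teacher's text output, which is why it also works with API-only teachers.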