Dolphin 3.0 ! | TheaterFire

[-]

shockwaverc13@reddit

"I'm eval'ing and stuff" \-WizardLM

Reply

[-]

RemarkableRow6860@reddit

https://preview.redd.it/ob6x9a3x3ucg1.png?width=1024&format=png&auto=webp&s=af38b2146b2b98f025875a2cc7f073397de76945 me

Reply

[-]

isr_431@reddit

Dolphin Mixtral is still a beast

Reply

[-]

clduab11@reddit

Is it weird I'm already feeling nostalgia, seeing as Dolphin was the very first local model I played with? And this was ***just a few months ago?!?!*** Genuinely hype for this hahaha

Reply

[-]

denyicz@reddit

My first model was Llama. I feel like a Vietnam veteran compared to you.

Reply

[-]

KallistiTMP@reddit

Remember GPT-2? Pepperidge farm remembers.

Reply

[-]

MorallyDeplorable@reddit

I remember feeding years of IRC logs to a markov chain and making an IRC bot to pretend to be a user. Am I literally dirt at this point?

Reply

[-]

Thick-Protection-458@reddit

And I started experimenting with LLMs back in the early GPT-3 days (not GPT-3.5 aka ChatGPT). I guess I am a dinosaur then.

Reply

[-]

No-Trifle315@reddit

I remember when I tried to finetune GPT-2. Seems like a whole other lifetime since then.

Reply

[-]

denyicz@reddit

Remember when OpenAI claimed GPT-2 was too dangerous lol

Reply

[-]

Frankly I was much less interested in language models back then, so I did not pay much attention. I was interested in NLP, but from an engineering perspective encoder and encoder-decoder models were more useful, so I paid attention to them instead.

Reply

[-]

denyicz@reddit

Damn. Can you tell me about the Big Bang, old man?

Reply

[-]

MorallyDeplorable@reddit

I believe it was the result of DALNet collecting all the densest people in one place and creating a cosmic event.

Reply

[-]

royalsail321@reddit

And I was chatting with Cleverbot. Checkmate.

Reply

[-]

luquoo@reddit

I think I'm getting nostalgia from reading your post!

Reply

[-]

MorallyDeplorable@reddit

Remember when he italicized and bolded that part? So timeless.

Reply

[-]

sorehamstring@reddit

that's my type of nostalgia

Reply

[-]

Optimistic_Futures@reddit

So excited to play Super Monkey Ball on it

Reply

[-]

Rude-Proposal-9600@reddit

I thought it was referring to an emulator too

Reply

[-]

_stevencasteel_@reddit

Not "an" emulator. "THE" emulator. Totes in the top three or five.

Reply

[-]

TacticalBacon00@reddit

I just sideloaded the Android version of Dolphin on my Quest 3S this week and I can confirm that Dolphin still slaps. Super Mario Sunshine in VR with an Xbox One controller is definitely not the intended way to play, but it was able to maintain 30FPS or above on everything outside of a loading screen.

Reply

[-]

drooolingidiot@reddit

Is Dolphin a finetune of llama or something?

Reply

[-]

CtrlAltDelve@reddit

It's intended to be an extremely uncensored model, but I don't really know how relevant that is anymore. It used to be hugely popular.

Reply

[-]

cobbleplox@reddit

Uncensored while not explicitly made for erp and generally fixing weird prompt formats to be ChatML is very, very welcome in my book. It's actually super sad how many big releases just need their shitty prompt format fixed.

Reply

[-]

skrshawk@reddit

I'd written off Qwen2.5 based on their original instruct tune. I didn't get how powerful that model really was until someone came back and tuned it off of the base.

Reply

[-]

MorallyDeplorable@reddit

What tunes are you having luck with?

Reply

[-]

skrshawk@reddit

My use case is primarily creative writing, so tunes like EVA and Mistoria (Euryale dataset for Qwen) are my daily drivers.

Reply

[-]

ThatsALovelyShirt@reddit

How would you compare those to 22B MistralSmall models?

Reply

[-]

skrshawk@reddit

I wouldn't, because every time you make a jump in parameter size you go up or down a level of general intelligence and reasoning. That said, there's a clear difference in smarts and in both the quality and style of writing between Qwen 72B and Largestral 123B. Both have finetunes from Drummer (one of the spicier datasets out there) and you can see the underlying model quality at that level, even at tiny quants of the big one. Behemoth 1.2 at IQ2_M is a stronger model than Anubis Q4, and Behemoth at Q4 is stronger still (I usually run that on Runpod with TabbyAPI and anywhere between 4-5bpw). I don't consider any model under 70B for what I do, because smaller models just don't handle multiple characters and tracking their thoughts, dialogue, and actions separately very well. You can have a good time in a 1:1 eRP scenario with the smaller models though, and no doubt a lot of people do.

Reply

[-]

Hoodfu@reddit

Going one further: when the new Llama 3 models came out, they were such a leap from the old Mistral models that what he was training Llama on to unalign it actually made it dumber. That's when people started going the abliterated route.

Reply

[-]

a_beautiful_rhind@reddit

QvQ 72b dolphin.

Reply

[-]

sassydodo@reddit

honestly I'd love to see an 8B reasoning model

Reply

[-]

Competitive_Ad_5515@reddit

!remindme 2 days

Reply

[-]

RemindMeBot@reddit

I will be messaging you in 2 days on [**2025-01-02 03:37:31 UTC**](http://www.wolframalpha.com/input/?i=2025-01-02%2003:37:31%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/LocalLLaMA/comments/1hpqcgg/dolphin_30/m4mzfjd/?context=3)

Reply

[-]

TroyDoesAI@reddit

Eric Hartford's Dolphin is the main reason I went so heavy into LLMs and generative AI. I am excited to see any updates.

Reply

[-]

Nandakishor_ml@reddit

where are you dolphin

Reply

[-]

Conscious_Nobody9571@reddit

We're not ready... 😭

Reply

[-]

MustBeSomethingThere@reddit

Are fine-tunings by a single person still a thing?

Reply

[-]

skrshawk@reddit

In the creative writing and RP/ERP world, I'd say there are more people who do their own thing, even FFTs, than there are groups collaborating more than loosely on what goes in their datasets.

Reply

[-]

Many_SuchCases@reddit

I guess so. I used to be excited for the original Dolphins and other popular finetunes at the time, but I kind of stopped caring after a while. I think part of that is because, if we look at these general finetunes where the goal is to achieve higher scores or better output in general, I feel like it's a bit of a lost cause.

It's also really hard for me to take it seriously when a single person claims to achieve a 5-10 point increase in benchmarks that somehow a multi-billion-dollar company couldn't. In the beginning there was *some* hope that maybe by training it on some better OpenAI data we could improve it. And it might have worked to some extent, but at this point these companies have invested so much money that if a single person who rents a few A100s could make that much of a difference, they would have done that themselves. Think about how big Meta's datacenter is, for example. They could literally run the same experiment in 2 minutes over and over and over. Not just that, but they have a lot more advanced techniques than "just show it some different data" at this point.

As far as uncensored models go, we now also have uncensored models in pretty much every niche, and then in every niche of those niches. Not to mention generally uncensored models in most areas (finetuned on external data), and then also abliterated models for those who prefer that.

So personally, I'm just not interested in these anymore, but to each their own.

Reply

[-]

NoLifeGamer2@reddit

She eval on my benchmark until I release

Reply

[-]

smooshie@reddit

[EXTREMELY LOUD INCORRECT BUZZER]

Reply

[-]

TheLogiqueViper@reddit

Test-time inference? Since the launch of test-time inference, I want open source models to implement it so badly.

Reply

[-]

s-kostyaev@reddit

Try QwQ

Reply

[-]

Sky_Linx@reddit

I’m not familiar with Dolphin as I only recently started to run local models. Is it good and how does it compare to Qwen 2.5 for example?

Reply

[-]

martinerous@reddit

Dolphin models are finetunes of the original models. As with every finetune, it can be hit or miss. I usually check the base models first and go for finetunes only if I need something specific.

Reply

[-]

Zemanyak@reddit

Brings up good memories. I hope Dolphin 3.0 will be as good as 2.0 when it was released.

Reply

54 Comments