TheaterFire

DeepSeek V3 is the gift that keeps on giving!

Posted by indicava@reddit | LocalLLaMA | View on Reddit | 182 comments

DeepSeek V3 is the gift that keeps on giving!

Reply to Post

182 Comments

freecodeio@reddit

How much would this cost in gpt4o
View on Reddit #45477573

indicava@reddit (OP)

I had ChatGPT do the math for me lol... It estimates around $1,400 USD.
View on Reddit #45477736

lessis_amess@reddit

get something else to do the math, this is wrong lol
View on Reddit #45478960

indicava@reddit (OP)

So for about 180M input tokens and 90M output tokens, what did your calculation come to?
View on Reddit #45479705

lessis_amess@reddit

obviously you are doing a ton of cache hits to pay 30usd for this amount of tokens. why are you assuming you would not hit that with oai? The simple heuristic is that at its most expensive, deepseek is 40x cheaper for output (10x cheaper for input)
View on Reddit #45480702

indicava@reddit (OP)

the DeepSeek console doesn't provide a simple way to test this. But looking at one day, I'm about at 50% cache hits. https://preview.redd.it/niycmr7makce1.png?width=558&format=png&auto=webp&s=a2792b6302312b8dc653e4e1a6643f49e1573705
View on Reddit #45480921

Quiet_Debate_651@reddit

How do you do to have so much cache hit?
View on Reddit #61116052

SynthSire@reddit

The export to .csv contains it as a breakdown, and allows you to use formulas to see the exact cost breakdown. After seeing this post I have given it a go for dataset generation and am very happy with its output at a cost of $8.41 for what gtp4o for similar output would cost $293.75
View on Reddit #45548918

dubesor86@reddit

Seems about right. This aligns with my cost effectiveness calculations https://dubesor.de/benchtable#cost-effectiveness It depends how long your context carry over is, but either way 4o would be vastly more expensive. Even in best case scenario for 4o, it would be at least 40x more expensive.
View on Reddit #45480826

RageshAntony@reddit

What's that "Minimum Performance:" slider?
View on Reddit #45697553

dp3471@reddit

awesome site by the way!
View on Reddit #45483828

indicava@reddit (OP)

Very cool data and layout! Thanks for sharing.
View on Reddit #45481006

freecodeio@reddit

Is this all input tokens or how are they split? Cause with real math it's somewhere between $682 - $2730
View on Reddit #45478943

indicava@reddit (OP)

the DeepSeek console doesn't provide an easy breakdown for this. But I'm estimating about a 2/3 to 1/3 split of Input vs Output tokens.
View on Reddit #45479029

Mickenfox@reddit

Yeah but now compare it to gemini-2.0-flash-exp (just don't look at the rate limits)
View on Reddit #45491484

indicava@reddit (OP)

The latest crop of Gemini models are seriously impressive (exp-1206, 2.0 flash, 2.0 flash thinking). But like your comment alluded to, the rate limits are a joke. For my use case they weren’t even an option. Hopefully when they become “GA” google will ease up on the limits because I really think they have a ton of potential.
View on Reddit #45498330

Alexs1200AD@reddit

2000 Is this a small number of requests per minute?
View on Reddit #45871465

AppearanceHeavy6724@reddit

Not for prose. they suck at fiction, esp 1206. Mistral is far better.
View on Reddit #45508721

cgcmake@reddit

What does GA mean?
View on Reddit #45499343

indicava@reddit (OP)

lol I’m a software guy, GA usually means “Generally Available”. I have no idea if that’s the best term for what I meant, which is: when they leave their “experimental” stage.
View on Reddit #45499548

raiffuvar@reddit

what limits?
View on Reddit #45506885

Mickenfox@reddit

The limit through the API is 10 requests per minute.
View on Reddit #45510723

RegisteredJustToSay@reddit

You mean if you use the free one? Gemini model APIs advertise 1000-4000 requests per minute for pay-as-you-go depending on the model and I've never hit limits, but I'm not sure if there's some hidden limit you're alluding to which I've somehow narrowly avoided. I'm just not sure we should be comparing paid api limits with free ones.
View on Reddit #45562846

raiffuvar@reddit

oh.. probably indians can handle just that much.
View on Reddit #45525065

A_Dragon@reddit

How does v3 compare to o1?
View on Reddit #45480170

torama@reddit

IMHO it compares on equal footing to sonnet or o1 for coding BUT it lacks in context window severly. So if your task is short it is wonderful. But if I give it a few thousand lines of context code it looses its edge
View on Reddit #45488300

freecodeio@reddit

what model doesn't lose its edge with long 65k+ token prompts
View on Reddit #45491833

Zeitgeist75@reddit

Sonnet 3.5 has been quite good at answering complex questions with entire books as context for me so far.
View on Reddit #46767682

Few_Painter_5588@reddit

Google Gemini
View on Reddit #45503963

BoJackHorseMan53@reddit

Deepseek has 128k context, same as gpt-4o
View on Reddit #45490310

OrangeESP32x99@reddit

It’s currently limited to half that unless you’re running local.
View on Reddit #45490704

BoJackHorseMan53@reddit

Or using fireworks or together API :)
View on Reddit #45493927

OrangeESP32x99@reddit

Yeah I just meant official app and api has the limit. I assume it’ll be gone when they raise the prices.
View on Reddit #45495192

torama@reddit

I am using a web interface for testing it and I think that interface has limited context but not sure
View on Reddit #45490657

A_Dragon@reddit

I meant with coding.
View on Reddit #45494821

CleanThroughMyJorts@reddit

I've been running a few agent experiments with Cline, giving simple dev tasks to o1, sonnet 3.5, Deepseek, and gemini. If I were to rank them based on how well they did: (best) Claude -> o1 -> Deepseek -> Gemini (worst) Here's a cost breakdown of 1 of the tasks that they did: Basically they had to setup a dev environmnent, read the docs on a few tools (they are new or obscure so outside training data; by default asking LLMs to use those tools they either use the old API or hallucinate things) and create a basic workflow connecting the three tools and write tests to ensure they work. 1. **Claude 3.5 Sonnet** * First to complete * Tokens: 206.4k * Cost: $0.1814 * Most efficient successful run * Notable for handling missing .env autonomously 2. **OpenAI O1-Preview** * Second to complete * Tokens: 531.3k * Cost: $11.3322 * Highest cost but clean execution 3. **DeepSeek v3** * Third to complete * Tokens: 1.3M * Cost: $0.7967 * Higher token usage but cost remained reasonable due to lower pricing 4. **Gemini-exp-1206** * DNF * Tokens: 2.2M * Multiple hints needed * Status: Terminated without completing setup Of the 3 that succeeded, deepseek had the most trouble; it needed several tries, kept making mistakes and not understanding what its mistakes were. o1 and Claude were better at self-correcting when they got things wrong. Note: cost numbers are from usage via openrouter, not their respective official apis
View on Reddit #45567367

Nervous-Positive-431@reddit

May I ask, how many requests per day does that translates to? I am kind of a newbie here! Also, will the previous conversation/context be added into the total used tokens? Or it is generally used with a single fully detailed request without forwarding the past conversation?
View on Reddit #45478388

Aware_Sympathy_1652@reddit

Asking it to summarize quantum mechanics cost 250 tokens
View on Reddit #46575615

Utoko@reddit

many many many. The only way you get to these numbers is with Agents. Most likely big code projects. Request is not a great measurement. Normal short questions are 500 Token. A request in your codebase can take 100K Tokens.
View on Reddit #45479341

pol_phil@reddit

Only way is with Agents? 😛 With such low prices I was thinking of building synthetic data based on whole corpora!
View on Reddit #45494608

WeWantTheFunk73@reddit

What formula do you use to estimate number of words based on tokens?
View on Reddit #45638418

pol_phil@reddit

Well, there is not a single golden formula. OpenAI tells you that "1 word = 1.25 tokens" which is more or less true for common English texts. But, depending on the model's tokenizer, how specialized a domain is, or for other languages, 1 word can amount to anything between 1.5-7 tokens.
View on Reddit #46273071

frivolousfidget@reddit

How do you go to generatw synthetic data? Any prompts or software for that?
View on Reddit #45509912

BattleRepulsiveO@reddit

you can automate over real data and ask the AI to summarize or format it in a better way. For example, there are tv scripts online which you can ask the AI to turn the script into a summary.
View on Reddit #45551971

-Django@reddit

It's highly task dependent, but you generally give an LLM your labels/label distribution and task it with creating the input data. e.g. if you're making an NLP hospital readmission model, you'd find the prevalence of the event from literature, let's say its 10%, then you'd task the model to generate 900 notes for patients that WONT be readmitted and 100 notes where the patient WILL be readmitted.
View on Reddit #45540542

59808@reddit

Out of interest - which agents can handle that kind of amounts of tokens?
View on Reddit #45494965

l33t-Mt@reddit

It might not just be one.
View on Reddit #45511910

Nervous-Positive-431@reddit

Wow...that is dirt cheap. Appreciated mate!
View on Reddit #45481023

1ncehost@reddit

When I use dir-assistant, it sends an entire context worth of a code repo to the LLM for every request. If I use Deepseek v3 (128k context size) and make a query every 5 minutes, that's over 10 million tokens per day.
View on Reddit #45607555

gooeydumpling@reddit

If you’re coding heavily then you could easily clear that number, even without agents. Cline for example, if you make it do stuff in vscode, can spend 1M tokens in literally minutes
View on Reddit #45573428

Stellar3227@reddit

I tried to give a better estimate than the first reply but they're right: it's so many and really to answer, lol. I estimated 100k tokens MAX per day when I'm using an AI all day. To each 274 million, that'd be 2,740 *days!* I.e. 7.5 years of daily heavy use. However, that number would be reached much faster with long context, like uploading and discussing books. So it really depends.
View on Reddit #45541568

Pvt_Twinkietoes@reddit

It is about 10mil tokens per day. 128k maximum window size. That means minimum 78 requests per day. Not sure what OP uses it for, but it is ALOT.
View on Reddit #45535540

Substantial-Thing303@reddit

Do you guys still see a difference between Deepseek v3 from OpenRouter and directly through their API? I only use OpenRouter, and V3 is always making garbage code. Super messy, no good understanding of subclasses, unmaintainable code, etc. Past 10k tokens it ignores way too much code and only works ok if I give it less than 4k tokens, but still inferior to Sonnet. Sonnet 3.5 feels 10x better while working with my codebase.
View on Reddit #45582342

AriyaSavaka@reddit

Probably because they're using a low quant on their cluster. DeepSeek on official API works great for me.
View on Reddit #46145488

mycall@reddit

Does DeepSeek analyze and harvest the tokens the chat completions contexts? They might get some juicy data for next-gen use cases (or future training).
View on Reddit #45487176

BoJackHorseMan53@reddit

OpenAI does for sure.
View on Reddit #45490236

BGFlyingToaster@reddit

Not if you use it inside of Azure OpenAI Services
View on Reddit #45509820

amdcoc@reddit

Then azure owner gets it.
View on Reddit #45661543

BGFlyingToaster@reddit

That would be you
View on Reddit #45664083

BoJackHorseMan53@reddit

Same with Deepseek, if you run it locally or host on Azure ;)
View on Reddit #45544741

mrjackspade@reddit

Because if OpenAI does it, that makes it okay.
View on Reddit #45511374

BoJackHorseMan53@reddit

I don't see you complaining about data harvesting when OpenAI releases a new model.
View on Reddit #45544619

indicava@reddit (OP)

afaik their ToS state they use customer data for training future models.
View on Reddit #45487347

RageshAntony@reddit

What's the limit for DeepSeek V3 free chat ?
View on Reddit #45660406

dairypharmer@reddit

Correct. Their hosted chat bot is even worse, they claim ownership over all outputs.
View on Reddit #45490315

raiffuvar@reddit

Every model claims ownership of output. And restrict from training other models with this output.
View on Reddit #45493605

lolzinventor@reddit

Am i doing it right? https://preview.redd.it/tsqv8ydizjce1.png?width=1224&format=png&auto=webp&s=4121a4d9c4db55f53a3a5d70952bfbd400a09bce
View on Reddit #45477521

Many_SuchCases@reddit

🔎🧐 it appears you have the day off from work/school every Wednesday.
View on Reddit #45480522

lolzinventor@reddit

Not sure,  it could be those days i leave the syngen processes undisturbed, allowing them to get on with processing tokens.  ive lowered the thread count recently.
View on Reddit #45481154

Enough-Meringue4745@reddit

What is this syngen
View on Reddit #45490094

MatlowAI@reddit

Synthetic dataset creation?
View on Reddit #45490503

lolzinventor@reddit

yeah.
View on Reddit #45494664

-Django@reddit

What kind of task are you making the dataset for? just curious and interested in learning about synthetic data :-)
View on Reddit #45540254

lolzinventor@reddit

Attempting to make the  LLM reason.
View on Reddit #45554063

MatlowAI@reddit

Speaking of synthetic data creation... Something I'd love to see is if we can steer reasoning into scientific logical leaps... creating training data sets for things like I shorted out a battery and it sparked and glowed red, gas lamps glow too, they are crummy because x, I wonder if this can replace gas lamps and then scenarios on observation and hypothesis and experimental design all the way down the tech tree for power requirments, failure modes, oxidation fix, thermal runaway fix, etc until we get to tungsten filament in a vacuum chamber... for various different inventions. Any thoughts on tips for how to generate quality synthetic data here given enough good examples manually created? They tend to not be able to think of these connections from my cursory look at it and I'd hate to have to manually do this.
View on Reddit #45587567

lolzinventor@reddit

They are really bad at this kind of thing, as well as fault finding and deduction.
View on Reddit #45594798

poetic_fartist@reddit

What do you do sir for a living and can I start learning and experimenting with llms on 3070 laptop ?
View on Reddit #45513440

Many_SuchCases@reddit

I see. My usage spikes on Friday, apparently. I wonder if there are days where inference is faster due to different amounts of concurrent users.
View on Reddit #45481735

superfsm@reddit

I noticed this, yes.
View on Reddit #45489819

Yes_but_I_think@reddit

Don't do this. Please. Let the needy use this. Go for O1. I think you can.
View on Reddit #45563546

Down_The_Rabbithole@reddit

Very curious about the datasets you're creating.
View on Reddit #45515824

lolzinventor@reddit

just learning, probably mostly wasted effort and tokens.
View on Reddit #45519200

Mediocre_Tree_5690@reddit

What kind of synthetic data sets are you creating and what do you use them for?
View on Reddit #45510478

FriskyFennecFox@reddit

That's a huge amount of requests. Coding?
View on Reddit #45492277

lolzinventor@reddit

dataset generation.
View on Reddit #45500861

indicava@reddit (OP)

Hell yea, yo go brother!
View on Reddit #45477776

hotpotato87@reddit

The api response delay is so annoying
View on Reddit #45579030

x3derr8orig@reddit

Where is the best place (security and $$ wise) to host it or use it from?
View on Reddit #45573552

dairypharmer@reddit

I’ve been seeing issues in the last few days of requests taking a long time to process. Seems like there’s no published rate limits, but when they get overloaded they’ll just hold your request in a queue for an arbitrary amount of time (I’ve seen order of 10mins). Have not investigated too closely so I’m only 80% sure this is what’s happening. Anyone else?
View on Reddit #45490534

indicava@reddit (OP)

I'm definitely seeing fluctuations in response time for the same amount of input/output tokens. But it's usually around the 50%-100% increase, so a request that takes on average 7-8 seconds sometimes takes 14-15 seconds. But I haven't seen anything more extreme than that.
View on Reddit #45490801

raphaelmansuy@reddit

I face the same issue
View on Reddit #45569142

pacmanpill@reddit

same here with 3 minutes wait for reponse
View on Reddit #45511277

raphaelmansuy@reddit

DeepSeekV3 works incredibly well my ReAct Agentic Framework [https://github.com/quantalogic/quantalogic](https://github.com/quantalogic/quantalogic) https://i.redd.it/zyvu6do2sqce1.gif
View on Reddit #45569101

bannert1337@reddit

Sadly the promotional period will end on February 8, 2025 at 16:00 UTChttps://api-docs.deepseek.com/news/news1226 https://preview.redd.it/bbwk3cdwlqce1.jpeg?width=916&format=pjpg&auto=webp&s=62dc21c2dc0005d44740f94dac18d22d31cea89f
View on Reddit #45567575

indicava@reddit (OP)

True, but it still comes out as x20 cheaper than OpenAI
View on Reddit #45567925

FPham@reddit

This is really great. I mean for my use this would be like $5 for month.
View on Reddit #45557403

AssistBorn4589@reddit

I'm just wondering what part of this is local and why is it upvoted so much.
View on Reddit #45501333

MINIMAN10001@reddit

I assume it's the same reason I get news of new video, audio, and not yet released local models. Because it's interesting enough to share with the community that is primarily based on running their own llama models. It's interesting in this case to see both the sheer number of tokens generated as well as how cheap it was to do so. May also play a part, I had fun with local models because it was free for me as I don't pay for the electricity, thus it was the cheap option so tangentially I find cheap models interesting.
View on Reddit #45557015

ILoveYou_Anyway@reddit

https://preview.redd.it/xmrsekpxpmce1.jpeg?width=474&format=pjpg&auto=webp&s=2d75d5a0532b48844bb5977cf8d6247ea93ce9e0
View on Reddit #45520924

douglasg14b@reddit

This isn't local, why is it here?
View on Reddit #45526073

throwaway1512514@reddit

Can't you run it yourself if you have the compute?
View on Reddit #45548221

douglasg14b@reddit

Yes, but this post isn't about self hosting, it's literally about a cloud service.
View on Reddit #45556142

Captain_Pumpkinhead@reddit

Where do you use DeepSeek V3 at? And what agents are you using?
View on Reddit #45551781

Charuru@reddit

You don’t want to see my o1 bill…
View on Reddit #45477954

thibautrey@reddit

That’s why I went local personally
View on Reddit #45480955

Charuru@reddit

Waiting for r1 to release.
View on Reddit #45482332

TenshiS@reddit

What's r1
View on Reddit #45507803

kellencs@reddit

deepseek thinking model
View on Reddit #45512378

TenshiS@reddit

Interesting. When's it coming? Is there a website?
View on Reddit #45519821

kellencs@reddit

yes, button "deep think" on the deepseek chat
View on Reddit #45550965

ScoreUnique@reddit

Tried the smolthinker? We were told it matches the o1 at math?
View on Reddit #45518054

Charuru@reddit

Dunno maybe if someone shows me some other benchmarks I doubt it’s going to be good
View on Reddit #45547822

mailaai@reddit

You also sell your data
View on Reddit #45477665

Professional_Helper_@reddit

Lol you made me thought I can sell my data to chatgpt and get paid.
View on Reddit #45478434

BoJackHorseMan53@reddit

They already train on all your chatgpt data, even the $200 tier and OpenAI api data and don't pay you anything back.
View on Reddit #45490959

frivolousfidget@reddit

Nonsense You can even be hipaa compliant by request. And default of business accts is gdpr compliant…
View on Reddit #45510532

BoJackHorseMan53@reddit

The $200 Pro tier is not a business account.
View on Reddit #45544677

Professional_Helper_@reddit

Just letting you know that I knew.
View on Reddit #45491014

ticktockbent@reddit

As if the other companies aren't? Anything you type into any model online is being saved and used or sold. If this bothers you, learn to run a local model
View on Reddit #45478973

mailaai@reddit

According to the terms of use and privacy policy, OpenAI and Anthropic don't use the user's API calls to train models. But according to the privacy policy of and terms of use of the Deepseek, they do use the user's API calls to train models. I don't work for any one of these companies. Just wanted to let others know as many developers working with sensitive data. Yes privacy this is what we all agree and are here.
View on Reddit #45522739

ticktockbent@reddit

What about the web interface? This is the way most people interact with these models now
View on Reddit #45523509

mailaai@reddit

ChatGPT: NO, Claude: No, Google: Yes; Deepseek :Yes
View on Reddit #45526200

freecodeio@reddit

If neither are gonna pay me for my data then I couldn't care less whether USA or China or Africa has it.
View on Reddit #45479292

mailaai@reddit

Many organizations need compliance with data protection laws, GDPR, SOC2, HIPAA, and more, knowing that there is training on API calls is important. For instance, in the hospital where my wife works, they have to comply with HIPAA, and they need to know how to make sure that the patients data are safe as this is required by law.
View on Reddit #45525192

freecodeio@reddit

I run a customer service SaaS with ai. Hospitals from the EU configure their own endpoints running gpus from local data centers due to HIPAA, they don't trust openai even though they claim they're compliant.
View on Reddit #45525973

ThaisaGuilford@reddit

Just like OpenAI then.
View on Reddit #45478527

mailaai@reddit

OpenAI does not use your data on API calls.
View on Reddit #45499847

ThaisaGuilford@reddit

Wow that is a huge relief. I trust them 100%.
View on Reddit #45500254

BoJackHorseMan53@reddit

You also sell your data if you use OpenAI API.
View on Reddit #45490817

mailaai@reddit

https://preview.redd.it/2qyw8bizglce1.png?width=1002&format=png&auto=webp&s=9816b6e157b1f2e9c8e59cc14096c97ca6ce213e Not true
View on Reddit #45499593

mailaai@reddit

I am not advocating for OpenAI, neither OpenAI nor Anthropic uses your API call data to train their models. This is not something you'll find in their terms-of-use pages or privacy policies. As LLM devs, you know full well how easily these models can generate training data, and some even say that LLMs only memorizes instead of generalization. Some of this data is deeply personal, like patient diagnoses, financial records, sensitive information that deserve privacy.
View on Reddit #45499064

Apprehensive_Dog1267@reddit

really as openai never use this data or if usa will not try get data of users
View on Reddit #45478136

indicava@reddit (OP)

I'm using DeepSeek V3 for synthetic dataset generation for fine tuning a model on a proprietary programming language. They can use all the data they want, if anything it might hurt their next pretraining lol...
View on Reddit #45477856

franckeinstein24@reddit

This is incredible.
View on Reddit #45523793

Zestyclose_Yak_3174@reddit

Do you use the API directly or through a third party?
View on Reddit #45518628

indicava@reddit (OP)

Directly, it’s OpenAI compatible so I’m actually using the official openai client
View on Reddit #45520505

Zestyclose_Yak_3174@reddit

Thanks for letting me know
View on Reddit #45523657

ihaag@reddit

It’s still not as good a Claude unfortunately… I’ve given it a couple of tests like powershell scripts and asked questions, it still struggles to complete the request as well as Claude does.
View on Reddit #45520441

ESTD3@reddit

How is the API policy regarding privacy? Are your api requests also used for AI training/their own good or is it only when using their free chat option? If anyone knows for certain please let me know. Thanks!
View on Reddit #45517361

indicava@reddit (OP)

It’s been discussed itt quite a lot. Tldr: they are mining me for every token I’m worth.
View on Reddit #45520311

ESTD3@reddit

So double-edged sword then.. depends what you use it for. I see. Thank you!
View on Reddit #45520398

PomegranateSuper8786@reddit

I don’t get it? Why pay?
View on Reddit #45479091

indicava@reddit (OP)

Because for my use case (synthetic dataset generation), I've tested several models and other than gpt-4o or Claude nothing gave me results anywhere close to it's quality (tried Qwen2.5, Llama 3.3, etc.). I do not own the hardware required to run this model locally, and renting out an instance that could run this model on vast.ai/runpod would cost much more (with much worse performance).
View on Reddit #45479437

Many_SuchCases@reddit

>synthetic dataset generation What kind of script are you running for this (if any)?
View on Reddit #45480799

indicava@reddit (OP)

A completely custom python script which is quite elaborate. It grabs data from technical documentation, pairs that with code examples and then sends that entire payload to the API. I have 5 scripts running concurrently with 12 threads per script. It's not even about cost, as far as I can tell, DeepSeek have absolutely no rate limits. I'm hammering their API like there's no tomorrow and not a single request is failing.
View on Reddit #45481201

Miscend@reddit

Have thought of being mindful and not hammering their servers with tons of requests?
View on Reddit #45518582

indicava@reddit (OP)

I promise I’ll be done in a few hours.
View on Reddit #45520361

shing3232@reddit

damn, that why ds start slow down on my friend's game translation.
View on Reddit #45506812

indicava@reddit (OP)

Ha! My bad, tell him the scripts are estimated to finish in about 12 hours lol
View on Reddit #45510409

remedy-tungson@reddit

It's kinda weird, i am currently having issue with DeepSeek. Most of my request failed via Cline and i have to switch between models to do my work :(
View on Reddit #45482980

lizheng2041@reddit

The cline consumes tokens so fast that it easily reaches its 64k context limit
View on Reddit #45492254

indicava@reddit (OP)

I don’t use cline but isn’t there any error code/reason for the request failing. I have to say that for me, stability of this API has been absolutely stellar. Maybe 0.001% failure rate so far.
View on Reddit #45483112

businesskitteh@reddit

You do realize pricing is going way up on Feb 8 right?
View on Reddit #45482905

indicava@reddit (OP)

Yea, of course. AFAIK it’s doubling. Still will be about 20x times cheaper than gpt-4o
View on Reddit #45483017

Many_SuchCases@reddit

That sounds very interesting. I was working on creating a script like that (never finished) and I noticed how quickly the amount of code increased.
View on Reddit #45481892

the320x200@reddit

There's a hidden cost here in that your data is no longer private.
View on Reddit #45498539

indicava@reddit (OP)

I am well aware. I’m not sending it anything that’s I would like to keep private. https://www.reddit.com/r/LocalLLaMA/s/Rf5hX9Mts0
View on Reddit #45498886

frivolousfidget@reddit

That is the main cost here, they are basically buying the data for the price difference. The fact that you are using it for synthetic data gen and nothing private is brilliant.
View on Reddit #45510419

foodwithmyketchup@reddit

I think in a year, perhaps a few, we're going to look back and think "wow that was expensive". Intelligence will be so cheap
View on Reddit #45509306

indicava@reddit (OP)

We’re nearly there, couple (well 3 or 4 actually) of Nvidia Digits and we can run this baby at home!
View on Reddit #45510682

fallingdowndizzyvr@reddit

Slowly though.
View on Reddit #45520015

maddogawl@reddit

I can’t believe how inexpensive it is, although I will say I’ve hit a few api issues, feels like DeepSeek is getting overwhelmed at times.
View on Reddit #45516294

Unusual_Pride_6480@reddit

What do you use it for to use so many tokens?
View on Reddit #45501750

indicava@reddit (OP)

Synthetic dataset generation
View on Reddit #45503559

Unusual_Pride_6480@reddit

Building your own llm or something?
View on Reddit #45511860

indicava@reddit (OP)

Fine tuning an LLM on a proprietary programming language.
View on Reddit #45512186

Unusual_Pride_6480@reddit

Pretty damn cool that is
View on Reddit #45515437

zero_proof_fork@reddit

Might be worth checking out https://github.com/StacklokLabs/promptwright
View on Reddit #45506854

rorowhat@reddit

Is there a Q4 of this model? I've only seen Q2 on LMatudio
View on Reddit #45513815

MarceloTT@reddit

Amazingly, Deepseek will have tons of synthetic data to train their next model. With all this synthetic data, in addition to the treatment that they will probably apply, they will be able to make an even better adjusted version with v3.5 and later create an absurdly better v4 model in 2025.
View on Reddit #45510903

indicava@reddit (OP)

As long as they keep them open and publish papers, I have absolutely no problem with that.
View on Reddit #45512115

CascadeTrident@reddit

Don't you find the small context window frustationing though?
View on Reddit #45506729

indicava@reddit (OP)

I’m currently using it for synthetic dataset generation with no multi-step conversations so it’s not really an issue, each request normally never goes over 4000-5000 tokens.
View on Reddit #45510325

330d@reddit

Not local, did you post this in the wrong sub?
View on Reddit #45508439

Dundell@reddit

I've been using it every chance I can with Cline for 2 major projects and I still can't get past $13 this month.
View on Reddit #45496223

indicava@reddit (OP)

How are you liking its outputs? Especially compared with the frontier models.
View on Reddit #45496933

Dundell@reddit

I seem to have answered out of reply one sec: "For webapps, it's ok. Back end and api building and postgres and basic sqlite can do it itself. Connecting to the frontend has issues and I've called Claude $6 to solve what it can't. Price wise this is amazing for what it can do" Additionally, my issue with Claude is both the price, and the barrier to entry for API. I've only ever spent $10 +$5 free, and the 40k context limit per minute is 1 question.
View on Reddit #45499992

ab2377@reddit

oh dear only only $30 for 270 million tokens!
View on Reddit #45497974

Dundell@reddit

For webapps, it's ok. Back end and api building and postgres and basic sqlite can do it itself. Connecting to the frontend has issues and I've called Claude $6 to solve what it can't. Price wise this is amazing for what it can do
View on Reddit #45497262

NeedsMoreMinerals@reddit

Is this you hosting it somewhere?
View on Reddit #45486781

indicava@reddit (OP)

Hell no, would have to add a couple zeros to the price if that was the case. This is me using their official API (platform.deepseek.com)
View on Reddit #45486885

CloudDevOps007@reddit

Would give it a try!
View on Reddit #45482684