DeepSeek V3 is the gift that keeps on giving!

[-]

freecodeio@reddit

How much would this cost in gpt4o

Reply

[-]

indicava@reddit (OP)

I had ChatGPT do the math for me lol... It estimates around $1,400 USD.

Reply

[-]

lessis_amess@reddit

get something else to do the math, this is wrong lol

Reply

[-]

indicava@reddit (OP)

So for about 180M input tokens and 90M output tokens, what did your calculation come to?

Reply

[-]

obviously you are doing a ton of cache hits to pay 30usd for this amount of tokens. why are you assuming you would not hit that with oai? The simple heuristic is that at its most expensive, deepseek is 40x cheaper for output (10x cheaper for input)

Reply

[-]

indicava@reddit (OP)

the DeepSeek console doesn't provide a simple way to test this. But looking at one day, I'm about at 50% cache hits. https://preview.redd.it/niycmr7makce1.png?width=558&format=png&auto=webp&s=a2792b6302312b8dc653e4e1a6643f49e1573705

Reply

[-]

Quiet_Debate_651@reddit

How do you do to have so much cache hit?

Reply

[-]

SynthSire@reddit

The export to .csv contains it as a breakdown, and allows you to use formulas to see the exact cost breakdown. After seeing this post I have given it a go for dataset generation and am very happy with its output at a cost of $8.41 for what gtp4o for similar output would cost $293.75

Reply

[-]

dubesor86@reddit

Seems about right. This aligns with my cost effectiveness calculations https://dubesor.de/benchtable#cost-effectiveness It depends how long your context carry over is, but either way 4o would be vastly more expensive. Even in best case scenario for 4o, it would be at least 40x more expensive.

Reply

[-]

RageshAntony@reddit

What's that "Minimum Performance:" slider?

Reply

[-]

dp3471@reddit

awesome site by the way!

Reply

[-]

indicava@reddit (OP)

Very cool data and layout! Thanks for sharing.

Reply

[-]

freecodeio@reddit

Is this all input tokens or how are they split? Cause with real math it's somewhere between $682 - $2730

Reply

[-]

indicava@reddit (OP)

the DeepSeek console doesn't provide an easy breakdown for this. But I'm estimating about a 2/3 to 1/3 split of Input vs Output tokens.

Reply

[-]

Mickenfox@reddit

Yeah but now compare it to gemini-2.0-flash-exp (just don't look at the rate limits)

Reply

[-]

indicava@reddit (OP)

The latest crop of Gemini models are seriously impressive (exp-1206, 2.0 flash, 2.0 flash thinking). But like your comment alluded to, the rate limits are a joke. For my use case they weren’t even an option. Hopefully when they become “GA” google will ease up on the limits because I really think they have a ton of potential.

Reply

[-]

Alexs1200AD@reddit

2000 Is this a small number of requests per minute?

Reply

[-]

AppearanceHeavy6724@reddit

Not for prose. they suck at fiction, esp 1206. Mistral is far better.

Reply

[-]

cgcmake@reddit

What does GA mean?

Reply

[-]

indicava@reddit (OP)

lol I’m a software guy, GA usually means “Generally Available”. I have no idea if that’s the best term for what I meant, which is: when they leave their “experimental” stage.

Reply

[-]

raiffuvar@reddit

what limits?

Reply

[-]

Mickenfox@reddit

The limit through the API is 10 requests per minute.

Reply

[-]

RegisteredJustToSay@reddit

You mean if you use the free one? Gemini model APIs advertise 1000-4000 requests per minute for pay-as-you-go depending on the model and I've never hit limits, but I'm not sure if there's some hidden limit you're alluding to which I've somehow narrowly avoided. I'm just not sure we should be comparing paid api limits with free ones.

Reply

[-]

raiffuvar@reddit

oh.. probably indians can handle just that much.

Reply

[-]

A_Dragon@reddit

How does v3 compare to o1?

Reply

[-]

torama@reddit

IMHO it compares on equal footing to sonnet or o1 for coding BUT it lacks in context window severly. So if your task is short it is wonderful. But if I give it a few thousand lines of context code it looses its edge

Reply

[-]

freecodeio@reddit

what model doesn't lose its edge with long 65k+ token prompts

Reply

[-]

Zeitgeist75@reddit

Sonnet 3.5 has been quite good at answering complex questions with entire books as context for me so far.

Reply

[-]

Few_Painter_5588@reddit

Google Gemini

Reply

[-]

BoJackHorseMan53@reddit

Deepseek has 128k context, same as gpt-4o

Reply

[-]

OrangeESP32x99@reddit

It’s currently limited to half that unless you’re running local.

Reply

[-]

BoJackHorseMan53@reddit

Or using fireworks or together API :)

Reply

[-]

OrangeESP32x99@reddit

Yeah I just meant official app and api has the limit. I assume it’ll be gone when they raise the prices.

Reply

[-]

torama@reddit

I am using a web interface for testing it and I think that interface has limited context but not sure

Reply

[-]

A_Dragon@reddit

I meant with coding.

Reply

[-]

CleanThroughMyJorts@reddit

I've been running a few agent experiments with Cline, giving simple dev tasks to o1, sonnet 3.5, Deepseek, and gemini. If I were to rank them based on how well they did: (best) Claude -> o1 -> Deepseek -> Gemini (worst) Here's a cost breakdown of 1 of the tasks that they did: Basically they had to setup a dev environmnent, read the docs on a few tools (they are new or obscure so outside training data; by default asking LLMs to use those tools they either use the old API or hallucinate things) and create a basic workflow connecting the three tools and write tests to ensure they work. 1. **Claude 3.5 Sonnet** * First to complete * Tokens: 206.4k * Cost: $0.1814 * Most efficient successful run * Notable for handling missing .env autonomously 2. **OpenAI O1-Preview** * Second to complete * Tokens: 531.3k * Cost: $11.3322 * Highest cost but clean execution 3. **DeepSeek v3** * Third to complete * Tokens: 1.3M * Cost: $0.7967 * Higher token usage but cost remained reasonable due to lower pricing 4. **Gemini-exp-1206** * DNF * Tokens: 2.2M * Multiple hints needed * Status: Terminated without completing setup Of the 3 that succeeded, deepseek had the most trouble; it needed several tries, kept making mistakes and not understanding what its mistakes were. o1 and Claude were better at self-correcting when they got things wrong. Note: cost numbers are from usage via openrouter, not their respective official apis

Reply

[-]

Nervous-Positive-431@reddit

May I ask, how many requests per day does that translates to? I am kind of a newbie here! Also, will the previous conversation/context be added into the total used tokens? Or it is generally used with a single fully detailed request without forwarding the past conversation?

Reply

[-]

Aware_Sympathy_1652@reddit

Asking it to summarize quantum mechanics cost 250 tokens

Reply

[-]

Utoko@reddit

many many many. The only way you get to these numbers is with Agents. Most likely big code projects. Request is not a great measurement. Normal short questions are 500 Token. A request in your codebase can take 100K Tokens.

Reply

[-]

pol_phil@reddit

Only way is with Agents? 😛 With such low prices I was thinking of building synthetic data based on whole corpora!

Reply

[-]

WeWantTheFunk73@reddit

What formula do you use to estimate number of words based on tokens?

Reply

[-]

pol_phil@reddit

Well, there is not a single golden formula. OpenAI tells you that "1 word = 1.25 tokens" which is more or less true for common English texts. But, depending on the model's tokenizer, how specialized a domain is, or for other languages, 1 word can amount to anything between 1.5-7 tokens.

Reply

[-]

frivolousfidget@reddit

How do you go to generatw synthetic data? Any prompts or software for that?

Reply

[-]

BattleRepulsiveO@reddit

you can automate over real data and ask the AI to summarize or format it in a better way. For example, there are tv scripts online which you can ask the AI to turn the script into a summary.

Reply

[-]

-Django@reddit

It's highly task dependent, but you generally give an LLM your labels/label distribution and task it with creating the input data. e.g. if you're making an NLP hospital readmission model, you'd find the prevalence of the event from literature, let's say its 10%, then you'd task the model to generate 900 notes for patients that WONT be readmitted and 100 notes where the patient WILL be readmitted.

Reply

[-]

59808@reddit

Out of interest - which agents can handle that kind of amounts of tokens?

Reply

[-]

l33t-Mt@reddit

It might not just be one.

Reply

[-]

Nervous-Positive-431@reddit

Wow...that is dirt cheap. Appreciated mate!

Reply

[-]

1ncehost@reddit

When I use dir-assistant, it sends an entire context worth of a code repo to the LLM for every request. If I use Deepseek v3 (128k context size) and make a query every 5 minutes, that's over 10 million tokens per day.

Reply

[-]

gooeydumpling@reddit

If you’re coding heavily then you could easily clear that number, even without agents. Cline for example, if you make it do stuff in vscode, can spend 1M tokens in literally minutes

Reply

[-]

Stellar3227@reddit

I tried to give a better estimate than the first reply but they're right: it's so many and really to answer, lol. I estimated 100k tokens MAX per day when I'm using an AI all day. To each 274 million, that'd be 2,740 *days!* I.e. 7.5 years of daily heavy use. However, that number would be reached much faster with long context, like uploading and discussing books. So it really depends.

Reply

[-]

Pvt_Twinkietoes@reddit

It is about 10mil tokens per day. 128k maximum window size. That means minimum 78 requests per day. Not sure what OP uses it for, but it is ALOT.

Reply

[-]

Substantial-Thing303@reddit

Do you guys still see a difference between Deepseek v3 from OpenRouter and directly through their API? I only use OpenRouter, and V3 is always making garbage code. Super messy, no good understanding of subclasses, unmaintainable code, etc. Past 10k tokens it ignores way too much code and only works ok if I give it less than 4k tokens, but still inferior to Sonnet. Sonnet 3.5 feels 10x better while working with my codebase.

Reply

[-]

AriyaSavaka@reddit

Probably because they're using a low quant on their cluster. DeepSeek on official API works great for me.

Reply

[-]

mycall@reddit

Does DeepSeek analyze and harvest the tokens the chat completions contexts? They might get some juicy data for next-gen use cases (or future training).

Reply

[-]

BoJackHorseMan53@reddit

OpenAI does for sure.

Reply

[-]

BGFlyingToaster@reddit

Not if you use it inside of Azure OpenAI Services

Reply

[-]

amdcoc@reddit

Then azure owner gets it.

Reply

[-]

BGFlyingToaster@reddit

That would be you

Reply

[-]

BoJackHorseMan53@reddit

Same with Deepseek, if you run it locally or host on Azure ;)

Reply

[-]

mrjackspade@reddit

Because if OpenAI does it, that makes it okay.

Reply

[-]

BoJackHorseMan53@reddit

I don't see you complaining about data harvesting when OpenAI releases a new model.

Reply

[-]

indicava@reddit (OP)

afaik their ToS state they use customer data for training future models.

Reply

[-]

RageshAntony@reddit

What's the limit for DeepSeek V3 free chat ?

Reply

[-]

dairypharmer@reddit

Correct. Their hosted chat bot is even worse, they claim ownership over all outputs.

Reply

[-]

raiffuvar@reddit

Every model claims ownership of output. And restrict from training other models with this output.

Reply

[-]

lolzinventor@reddit

Am i doing it right? https://preview.redd.it/tsqv8ydizjce1.png?width=1224&format=png&auto=webp&s=4121a4d9c4db55f53a3a5d70952bfbd400a09bce

Reply

[-]

Many_SuchCases@reddit

🔎🧐 it appears you have the day off from work/school every Wednesday.

Reply

[-]

lolzinventor@reddit

Not sure, it could be those days i leave the syngen processes undisturbed, allowing them to get on with processing tokens. ive lowered the thread count recently.

Reply

[-]

Enough-Meringue4745@reddit

What is this syngen

Reply

[-]

MatlowAI@reddit

Synthetic dataset creation?

Reply

[-]

lolzinventor@reddit

yeah.

Reply

[-]

-Django@reddit

What kind of task are you making the dataset for? just curious and interested in learning about synthetic data :-)

Reply

[-]

lolzinventor@reddit

Attempting to make the LLM reason.

Reply

[-]

MatlowAI@reddit

Speaking of synthetic data creation... Something I'd love to see is if we can steer reasoning into scientific logical leaps... creating training data sets for things like I shorted out a battery and it sparked and glowed red, gas lamps glow too, they are crummy because x, I wonder if this can replace gas lamps and then scenarios on observation and hypothesis and experimental design all the way down the tech tree for power requirments, failure modes, oxidation fix, thermal runaway fix, etc until we get to tungsten filament in a vacuum chamber... for various different inventions. Any thoughts on tips for how to generate quality synthetic data here given enough good examples manually created? They tend to not be able to think of these connections from my cursory look at it and I'd hate to have to manually do this.

Reply

[-]

lolzinventor@reddit

They are really bad at this kind of thing, as well as fault finding and deduction.

Reply

[-]

poetic_fartist@reddit

What do you do sir for a living and can I start learning and experimenting with llms on 3070 laptop ?

Reply

[-]

Many_SuchCases@reddit

I see. My usage spikes on Friday, apparently. I wonder if there are days where inference is faster due to different amounts of concurrent users.

Reply

[-]

superfsm@reddit

I noticed this, yes.

Reply

[-]

Yes_but_I_think@reddit

Don't do this. Please. Let the needy use this. Go for O1. I think you can.

Reply

[-]

Down_The_Rabbithole@reddit

Very curious about the datasets you're creating.

Reply

[-]

lolzinventor@reddit

just learning, probably mostly wasted effort and tokens.

Reply

[-]

Mediocre_Tree_5690@reddit

What kind of synthetic data sets are you creating and what do you use them for?

Reply

[-]

FriskyFennecFox@reddit

That's a huge amount of requests. Coding?

Reply

[-]

lolzinventor@reddit

dataset generation.

Reply

[-]

indicava@reddit (OP)

Hell yea, yo go brother!

Reply

[-]

hotpotato87@reddit

The api response delay is so annoying

Reply

[-]

x3derr8orig@reddit

Where is the best place (security and $$ wise) to host it or use it from?

Reply

[-]

dairypharmer@reddit

I’ve been seeing issues in the last few days of requests taking a long time to process. Seems like there’s no published rate limits, but when they get overloaded they’ll just hold your request in a queue for an arbitrary amount of time (I’ve seen order of 10mins). Have not investigated too closely so I’m only 80% sure this is what’s happening. Anyone else?

Reply

[-]

indicava@reddit (OP)

I'm definitely seeing fluctuations in response time for the same amount of input/output tokens. But it's usually around the 50%-100% increase, so a request that takes on average 7-8 seconds sometimes takes 14-15 seconds. But I haven't seen anything more extreme than that.

Reply

[-]

raphaelmansuy@reddit

I face the same issue

Reply

[-]

pacmanpill@reddit

same here with 3 minutes wait for reponse

Reply

[-]

raphaelmansuy@reddit

DeepSeekV3 works incredibly well my ReAct Agentic Framework [https://github.com/quantalogic/quantalogic](https://github.com/quantalogic/quantalogic) https://i.redd.it/zyvu6do2sqce1.gif

Reply

[-]

bannert1337@reddit

Sadly the promotional period will end on February 8, 2025 at 16:00 UTChttps://api-docs.deepseek.com/news/news1226 https://preview.redd.it/bbwk3cdwlqce1.jpeg?width=916&format=pjpg&auto=webp&s=62dc21c2dc0005d44740f94dac18d22d31cea89f

Reply

[-]

indicava@reddit (OP)

True, but it still comes out as x20 cheaper than OpenAI

Reply

[-]

FPham@reddit

This is really great. I mean for my use this would be like $5 for month.

Reply

[-]

AssistBorn4589@reddit

I'm just wondering what part of this is local and why is it upvoted so much.

Reply

[-]

MINIMAN10001@reddit

I assume it's the same reason I get news of new video, audio, and not yet released local models. Because it's interesting enough to share with the community that is primarily based on running their own llama models. It's interesting in this case to see both the sheer number of tokens generated as well as how cheap it was to do so. May also play a part, I had fun with local models because it was free for me as I don't pay for the electricity, thus it was the cheap option so tangentially I find cheap models interesting.

Reply

[-]

ILoveYou_Anyway@reddit

https://preview.redd.it/xmrsekpxpmce1.jpeg?width=474&format=pjpg&auto=webp&s=2d75d5a0532b48844bb5977cf8d6247ea93ce9e0

Reply

[-]

douglasg14b@reddit

This isn't local, why is it here?

Reply

[-]

throwaway1512514@reddit

Can't you run it yourself if you have the compute?

Reply

[-]

douglasg14b@reddit

Yes, but this post isn't about self hosting, it's literally about a cloud service.

Reply

[-]

Captain_Pumpkinhead@reddit

Where do you use DeepSeek V3 at? And what agents are you using?

Reply

[-]

Charuru@reddit

You don’t want to see my o1 bill…

Reply

[-]

thibautrey@reddit

That’s why I went local personally

Reply

[-]

Charuru@reddit

Waiting for r1 to release.

Reply

[-]

TenshiS@reddit

What's r1

Reply

[-]

kellencs@reddit

deepseek thinking model

Reply

[-]

TenshiS@reddit

Interesting. When's it coming? Is there a website?

Reply

[-]

kellencs@reddit

yes, button "deep think" on the deepseek chat

Reply

[-]

ScoreUnique@reddit

Tried the smolthinker? We were told it matches the o1 at math?

Reply

[-]

Charuru@reddit

Dunno maybe if someone shows me some other benchmarks I doubt it’s going to be good

Reply

[-]

mailaai@reddit

You also sell your data

Reply

[-]

Professional_Helper_@reddit

Lol you made me thought I can sell my data to chatgpt and get paid.

Reply

[-]

BoJackHorseMan53@reddit

They already train on all your chatgpt data, even the $200 tier and OpenAI api data and don't pay you anything back.

Reply

[-]

frivolousfidget@reddit

Nonsense You can even be hipaa compliant by request. And default of business accts is gdpr compliant…

Reply

[-]

BoJackHorseMan53@reddit

The $200 Pro tier is not a business account.

Reply

[-]

Professional_Helper_@reddit

Just letting you know that I knew.

Reply

[-]

ticktockbent@reddit

As if the other companies aren't? Anything you type into any model online is being saved and used or sold. If this bothers you, learn to run a local model

Reply

[-]

mailaai@reddit

According to the terms of use and privacy policy, OpenAI and Anthropic don't use the user's API calls to train models. But according to the privacy policy of and terms of use of the Deepseek, they do use the user's API calls to train models. I don't work for any one of these companies. Just wanted to let others know as many developers working with sensitive data. Yes privacy this is what we all agree and are here.

Reply

[-]

ticktockbent@reddit

What about the web interface? This is the way most people interact with these models now

Reply

[-]

mailaai@reddit

ChatGPT: NO, Claude: No, Google: Yes; Deepseek :Yes

Reply

[-]

freecodeio@reddit

If neither are gonna pay me for my data then I couldn't care less whether USA or China or Africa has it.

Reply

[-]

mailaai@reddit

Many organizations need compliance with data protection laws, GDPR, SOC2, HIPAA, and more, knowing that there is training on API calls is important. For instance, in the hospital where my wife works, they have to comply with HIPAA, and they need to know how to make sure that the patients data are safe as this is required by law.

Reply

[-]

freecodeio@reddit

I run a customer service SaaS with ai. Hospitals from the EU configure their own endpoints running gpus from local data centers due to HIPAA, they don't trust openai even though they claim they're compliant.

Reply

[-]

ThaisaGuilford@reddit

Just like OpenAI then.

Reply

[-]

mailaai@reddit

OpenAI does not use your data on API calls.

Reply

[-]

ThaisaGuilford@reddit

Wow that is a huge relief. I trust them 100%.

Reply

[-]

BoJackHorseMan53@reddit

You also sell your data if you use OpenAI API.

Reply

[-]

mailaai@reddit

https://preview.redd.it/2qyw8bizglce1.png?width=1002&format=png&auto=webp&s=9816b6e157b1f2e9c8e59cc14096c97ca6ce213e Not true

Reply

[-]

mailaai@reddit

I am not advocating for OpenAI, neither OpenAI nor Anthropic uses your API call data to train their models. This is not something you'll find in their terms-of-use pages or privacy policies. As LLM devs, you know full well how easily these models can generate training data, and some even say that LLMs only memorizes instead of generalization. Some of this data is deeply personal, like patient diagnoses, financial records, sensitive information that deserve privacy.

Reply

[-]

Apprehensive_Dog1267@reddit

really as openai never use this data or if usa will not try get data of users

Reply

[-]

indicava@reddit (OP)

I'm using DeepSeek V3 for synthetic dataset generation for fine tuning a model on a proprietary programming language. They can use all the data they want, if anything it might hurt their next pretraining lol...

Reply

[-]

franckeinstein24@reddit

This is incredible.

Reply

[-]

Zestyclose_Yak_3174@reddit

Do you use the API directly or through a third party?

Reply

[-]

indicava@reddit (OP)

Directly, it’s OpenAI compatible so I’m actually using the official openai client

Reply

[-]

Zestyclose_Yak_3174@reddit

Thanks for letting me know

Reply

[-]

ihaag@reddit

It’s still not as good a Claude unfortunately… I’ve given it a couple of tests like powershell scripts and asked questions, it still struggles to complete the request as well as Claude does.

Reply

[-]

ESTD3@reddit

How is the API policy regarding privacy? Are your api requests also used for AI training/their own good or is it only when using their free chat option? If anyone knows for certain please let me know. Thanks!

Reply

[-]

indicava@reddit (OP)

It’s been discussed itt quite a lot. Tldr: they are mining me for every token I’m worth.

Reply

[-]

ESTD3@reddit

So double-edged sword then.. depends what you use it for. I see. Thank you!

Reply

[-]

PomegranateSuper8786@reddit

I don’t get it? Why pay?

Reply

[-]

indicava@reddit (OP)

Because for my use case (synthetic dataset generation), I've tested several models and other than gpt-4o or Claude nothing gave me results anywhere close to it's quality (tried Qwen2.5, Llama 3.3, etc.). I do not own the hardware required to run this model locally, and renting out an instance that could run this model on vast.ai/runpod would cost much more (with much worse performance).

Reply

[-]

Many_SuchCases@reddit

>synthetic dataset generation What kind of script are you running for this (if any)?

Reply

[-]

indicava@reddit (OP)

A completely custom python script which is quite elaborate. It grabs data from technical documentation, pairs that with code examples and then sends that entire payload to the API. I have 5 scripts running concurrently with 12 threads per script. It's not even about cost, as far as I can tell, DeepSeek have absolutely no rate limits. I'm hammering their API like there's no tomorrow and not a single request is failing.

Reply

[-]

Miscend@reddit

Have thought of being mindful and not hammering their servers with tons of requests?

Reply

[-]

indicava@reddit (OP)

I promise I’ll be done in a few hours.

Reply

[-]

shing3232@reddit

damn, that why ds start slow down on my friend's game translation.

Reply

[-]

indicava@reddit (OP)

Ha! My bad, tell him the scripts are estimated to finish in about 12 hours lol

Reply

[-]

remedy-tungson@reddit

It's kinda weird, i am currently having issue with DeepSeek. Most of my request failed via Cline and i have to switch between models to do my work :(

Reply

[-]

lizheng2041@reddit

The cline consumes tokens so fast that it easily reaches its 64k context limit

Reply

[-]

indicava@reddit (OP)

I don’t use cline but isn’t there any error code/reason for the request failing. I have to say that for me, stability of this API has been absolutely stellar. Maybe 0.001% failure rate so far.

Reply

[-]

businesskitteh@reddit

You do realize pricing is going way up on Feb 8 right?

Reply

[-]

indicava@reddit (OP)

Yea, of course. AFAIK it’s doubling. Still will be about 20x times cheaper than gpt-4o

Reply

[-]

Many_SuchCases@reddit

That sounds very interesting. I was working on creating a script like that (never finished) and I noticed how quickly the amount of code increased.

Reply

[-]

the320x200@reddit

There's a hidden cost here in that your data is no longer private.

Reply

[-]

indicava@reddit (OP)

I am well aware. I’m not sending it anything that’s I would like to keep private. https://www.reddit.com/r/LocalLLaMA/s/Rf5hX9Mts0

Reply

[-]

frivolousfidget@reddit

That is the main cost here, they are basically buying the data for the price difference. The fact that you are using it for synthetic data gen and nothing private is brilliant.

Reply

[-]

foodwithmyketchup@reddit

I think in a year, perhaps a few, we're going to look back and think "wow that was expensive". Intelligence will be so cheap

Reply

[-]

indicava@reddit (OP)

We’re nearly there, couple (well 3 or 4 actually) of Nvidia Digits and we can run this baby at home!

Reply

[-]

fallingdowndizzyvr@reddit

Slowly though.

Reply

[-]

maddogawl@reddit

I can’t believe how inexpensive it is, although I will say I’ve hit a few api issues, feels like DeepSeek is getting overwhelmed at times.

Reply

[-]

Unusual_Pride_6480@reddit

What do you use it for to use so many tokens?

Reply

[-]

indicava@reddit (OP)

Synthetic dataset generation

Reply

[-]

Unusual_Pride_6480@reddit

Building your own llm or something?

Reply

[-]

indicava@reddit (OP)

Fine tuning an LLM on a proprietary programming language.

Reply

[-]

Unusual_Pride_6480@reddit

Pretty damn cool that is

Reply

[-]

zero_proof_fork@reddit

Might be worth checking out https://github.com/StacklokLabs/promptwright

Reply

[-]

rorowhat@reddit

Is there a Q4 of this model? I've only seen Q2 on LMatudio

Reply

[-]

MarceloTT@reddit

Amazingly, Deepseek will have tons of synthetic data to train their next model. With all this synthetic data, in addition to the treatment that they will probably apply, they will be able to make an even better adjusted version with v3.5 and later create an absurdly better v4 model in 2025.

Reply

[-]

indicava@reddit (OP)

As long as they keep them open and publish papers, I have absolutely no problem with that.

Reply

[-]

CascadeTrident@reddit

Don't you find the small context window frustationing though?

Reply

[-]

indicava@reddit (OP)

I’m currently using it for synthetic dataset generation with no multi-step conversations so it’s not really an issue, each request normally never goes over 4000-5000 tokens.

Reply

[-]

330d@reddit

Not local, did you post this in the wrong sub?

Reply

[-]

Dundell@reddit

I've been using it every chance I can with Cline for 2 major projects and I still can't get past $13 this month.

Reply

[-]

indicava@reddit (OP)

How are you liking its outputs? Especially compared with the frontier models.

Reply

[-]

Dundell@reddit

I seem to have answered out of reply one sec: "For webapps, it's ok. Back end and api building and postgres and basic sqlite can do it itself. Connecting to the frontend has issues and I've called Claude $6 to solve what it can't. Price wise this is amazing for what it can do" Additionally, my issue with Claude is both the price, and the barrier to entry for API. I've only ever spent $10 +$5 free, and the 40k context limit per minute is 1 question.

Reply

[-]

ab2377@reddit

oh dear only only $30 for 270 million tokens!

Reply

[-]

Dundell@reddit

For webapps, it's ok. Back end and api building and postgres and basic sqlite can do it itself. Connecting to the frontend has issues and I've called Claude $6 to solve what it can't. Price wise this is amazing for what it can do

Reply

[-]

NeedsMoreMinerals@reddit

Is this you hosting it somewhere?

Reply

[-]

indicava@reddit (OP)

Hell no, would have to add a couple zeros to the price if that was the case. This is me using their official API (platform.deepseek.com)

Reply to Post

182 Comments