DeepSeek V4: ~1T-A35B MoE announced; Apache 2 license promised
Posted by Live-Crab3086@reddit | LocalLLaMA | View on Reddit | 21 comments
i'm going to need more ram
LocalLLaMA-ModTeam@reddit
This post has been marked as spam.
Impossible_Ground_15@reddit
It says "Runnable locally on dual RTX 4090s or single RTX 5090". If this is true it would be amazing.
mindwip@reddit
Was this written by AI?
1T model on 48gb? Lol
Even the rumored lite version at 200b is pushing it for 48gb
Lost_Lie1902@reddit
They might surprise us with a new architecture. They said the model is supposed to run efficiently on the RTX 5090, but don't expect the full version in FP8 format; more likely a lower-precision quant. Still, they mentioned it would work on the RTX 4090, so don't worry.
droptableadventures@reddit
Previous DeepSeek was runnable (at ~5T/sec) with the important bits in GPU and the rest in RAM. But you needed ~256GB of RAM.
LagOps91@reddit
yeah, and it will be about the same if you have 512gb of ram this time around (assuming it's actually 1t parameters)... pretty steep requirements
droptableadventures@reddit
It'll be similar to what it takes to run Kimi K2.5 now.
LagOps91@reddit
200b isn't running on 48gb. 48gb targets dense 70b models at q4 for the most part...
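The sizes in this thread are easy to sanity-check with back-of-envelope math. A sketch (the ~4.5 effective bits/weight for Q4-style quants is an approximation that includes scale overhead, not an exact spec):

```python
# Rough weight-size math for the claims above.
# Assumes ~4.5 effective bits/weight for a Q4-style quant (approximation).

def q4_weight_gib(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate quantized weight size in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

print(round(q4_weight_gib(70), 1))    # dense 70B at Q4: ~36.7 GiB -> fits in 48 GB
print(round(q4_weight_gib(200), 1))   # 200B at Q4: ~104.8 GiB -> nowhere near 48 GB
print(round(q4_weight_gib(1000), 1))  # ~1T at Q4: ~523.9 GiB
```

So even before KV cache and activations, 48 GB tops out around a dense 70B at Q4, and a 200B model is more than double the budget.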
nuclearbananana@reddit
I mean yes. It's a fake clickbait site
chibop1@reddit
"Runnable locally on dual RTX 4090s or single RTX 5090"
Just curious, how does a 1T model fit on a single 5090?
Do the entire weights get loaded into RAM, with just the active experts run on VRAM?
Would that slow things down significantly because you have to keep swapping experts between VRAM and RAM?
LagOps91@reddit
in general you put the attention, kv cache, and shared experts on vram; a 5090 would be fine for that. the routed experts are kept in ram. there is no passing weights around: the calculation for routed experts is done on the cpu. so you would need something like 512gb of ram for Q4. if you have a server board with 8 or 12 channel ram, you should get decent speed with it, but needless to say such a setup is quite pricey (easily 10k+ even before the ram price hikes) and the speed you get in return isn't all that impressive.
that aside the website isn't legit, so don't expect any actual information to be sensible from the site.
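The VRAM/RAM split described above can be sketched as a memory-budget calculation. The shared-parameter count below is a hypothetical illustration, not a DeepSeek spec:

```python
# Hypothetical split for a ~1T-total MoE at Q4 (~4.5 effective bits/weight).
# shared_params (attention + shared experts) is an assumed figure for illustration.

BYTES_PER_WEIGHT = 4.5 / 8  # ~Q4 quantization, incl. scale overhead

def gib(params: float) -> float:
    """Quantized size in GiB for a given parameter count."""
    return params * BYTES_PER_WEIGHT / 2**30

total_params  = 1.0e12                         # ~1T total
shared_params = 20e9                           # assumption: lives in VRAM
routed_params = total_params - shared_params   # routed experts: stay in system RAM

vram_gib = gib(shared_params)   # plus KV cache on top of this
ram_gib  = gib(routed_params)

print(f"VRAM (attention + shared experts): ~{vram_gib:.0f} GiB")
print(f"RAM  (routed experts):             ~{ram_gib:.0f} GiB")
```

Under these assumptions the GPU side stays well within a 32 GB 5090 (before KV cache), while the routed experts land around the ~512 GB of system RAM mentioned above.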
Adventurous_Doubt_70@reddit
It just offloads the activated expert weights to the 5090 on demand, not the entire weights.
LagOps91@reddit
that's not how this works
r4in311@reddit
"Deepseek.ai is an independent website and is not affiliated with, sponsored by, or endorsed by Hangzhou DeepSeek Artificial Intelligence Co., Ltd."
SandboChang@reddit
Yeah, I was surprised by the number of invasive ads at first. Then I saw the benchmarks, and all they gave were approximations.
This is complete bullshit.
DragonfruitIll660@reddit
Had me excited for 30 seconds lol
Different_Fix_2217@reddit
Fake website.
EastZealousideal7352@reddit
“DeepSeek.ai is not affiliated with, endorsed by, or connected to DeepSeek.com in any way.”
I don’t see anything from official DeepSeek sources
baseketball@reddit
This is not the actual DeepSeek site. It's a fake AI news site.
Ok-Mess-3317@reddit
“Runnable locally on dual RTX 4090s or single RTX 5090” you mean, with uhh, a TB of RAM?
Ok-Mess-3317@reddit
also the article generally looks like slop