DeepSeek V4: ~1T-A35B MoE announced; Apache 2 license promised
Posted by Live-Crab3086@reddit | LocalLLaMA | View on Reddit | 21 comments
i'm going to need more ram
LocalLLaMA-ModTeam@reddit
This post has been marked as spam.
Impossible_Ground_15@reddit
It says "Runnable locally on dual RTX 4090s or single RTX 5090". If this is true it would be amazing.
mindwip@reddit
Was this written by AI?
1T model on 48gb? Lol
Even the rumored lite version at 200b is pushing it for 48gb
Lost_Lie1902@reddit
They might surprise us with a new architecture. They said the model is supposed to run efficiently on the RTX 5090, but don't expect the full version in FP8 format; more likely a lower-precision quant. Still, they mentioned it would work on the RTX 4090, so don't worry.
droptableadventures@reddit
Previous DeepSeek was runnable (at ~5T/sec) with the important bits in GPU and the rest in RAM. But you needed ~256GB of RAM.
LagOps91@reddit
yeah, and it will be about the same if you have 512gb of ram this time around (assuming it's actually 1t parameters)... pretty steep requirements
droptableadventures@reddit
It'll be similar to what it takes to run Kimi K2.5 now.
LagOps91@reddit
200b isn't running on 48gb. 48gb targets dense 70b models at q4 for the most part...
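The sizes in this thread are easy to sanity-check with back-of-envelope math. A sketch (the ~4.5 effective bits/weight for Q4-style quants is an approximation that includes scale overhead, not an exact spec):

```python
# Rough weight-size math for the claims above.
# Assumes ~4.5 effective bits/weight for a Q4-style quant (approximation).

def q4_weight_gib(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate quantized weight size in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

print(round(q4_weight_gib(70), 1))    # dense 70B at Q4: ~36.7 GiB -> fits in 48 GB
print(round(q4_weight_gib(200), 1))   # 200B at Q4: ~104.8 GiB -> nowhere near 48 GB
print(round(q4_weight_gib(1000), 1))  # ~1T at Q4: ~523.9 GiB
```

So even before KV cache and activations, 48 GB tops out around a dense 70B at Q4, and a 200B model is more than double the budget.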
nuclearbananana@reddit
I mean yes. It's a fake clickbait site
chibop1@reddit
"Runnable locally on dual RTX 4090s or single RTX 5090"
Just curious, how does a 1T model fit on a single 5090?
Do the entire weights get loaded into RAM, with just the active experts run on VRAM?
Would that slow things down significantly because you have to keep swapping experts between VRAM and RAM?
LagOps91@reddit
in general you put the attention, kv cache, and shared experts on vram; a 5090 would be fine for that. the routed experts are kept in ram. there is no passing weights around: the calculation for routed experts is done on the cpu. so you would need something like 512gb of ram for Q4. if you have a server board with 8 or 12 channel ram, you should get decent speed with it, but needless to say such a setup is quite pricey (easily 10k+ even before the ram price hikes) and the speed you get in return isn't all that impressive.
that aside the website isn't legit, so don't expect any actual information to be sensible from the site.
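The VRAM/RAM split described above can be sketched as a memory-budget calculation. The shared-parameter count below is a hypothetical illustration, not a DeepSeek spec:

```python
# Hypothetical split for a ~1T-total MoE at Q4 (~4.5 effective bits/weight).
# shared_params (attention + shared experts) is an assumed figure for illustration.

BYTES_PER_WEIGHT = 4.5 / 8  # ~Q4 quantization, incl. scale overhead

def gib(params: float) -> float:
    """Quantized size in GiB for a given parameter count."""
    return params * BYTES_PER_WEIGHT / 2**30

total_params  = 1.0e12                         # ~1T total
shared_params = 20e9                           # assumption: lives in VRAM
routed_params = total_params - shared_params   # routed experts: stay in system RAM

vram_gib = gib(shared_params)   # plus KV cache on top of this
ram_gib  = gib(routed_params)

print(f"VRAM (attention + shared experts): ~{vram_gib:.0f} GiB")
print(f"RAM  (routed experts):             ~{ram_gib:.0f} GiB")
```

Under these assumptions the GPU side stays well within a 32 GB 5090 (before KV cache), while the routed experts land around the ~512 GB of system RAM mentioned above.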
Adventurous_Doubt_70@reddit
It just offloads the activated expert weights to the 5090 on demand, not the entire weights.
LagOps91@reddit
that's not how this works
r4in311@reddit
"Deepseek.ai is an independent website and is not affiliated with, sponsored by, or endorsed by Hangzhou DeepSeek Artificial Intelligence Co., Ltd."
SandboChang@reddit
Yeah, I was surprised by the number of invasive ads at first. Then I saw the benchmarks, and all they gave were approximations.
This is complete bullshit.
DragonfruitIll660@reddit
Had me excited for 30 seconds lol
Different_Fix_2217@reddit
Fake website.
EastZealousideal7352@reddit
“DeepSeek.ai is not affiliated with, endorsed by, or connected to DeepSeek.com in any way.”
I don’t see anything from official DeepSeek sources
baseketball@reddit
This is not the actual DeepSeek site. It's a fake AI news site.
Ok-Mess-3317@reddit
“Runnable locally on dual RTX 4090s or single RTX 5090” you mean, with uhh, a TB of RAM?
Ok-Mess-3317@reddit
also the article generally looks like slop