AI Breakthrough Tracker

DeepSeek V4 Open-Source MoE

Key Questions

What are the main specifications of DeepSeek V4?

DeepSeek V4 is an open-source Mixture-of-Experts model with 1.6 trillion total parameters, of which 49 billion are active per token, and it reportedly outperforms Claude across benchmarks. It features a 90% reduction in KV cache usage and supports a 1-million-token context. The model follows DeepSeek's R1, the release widely dubbed an AI 'Sputnik moment', and continues the company's low-cost scaling approach.
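
A quick back-of-envelope check makes the MoE economics concrete. The sketch below uses only the two figures reported above (1.6T total, 49B active); the expert count and routing layout are not specified, so nothing beyond the ratio is assumed.

```python
# Sketch: MoE parameter arithmetic from the reported DeepSeek V4 figures.
# Only the totals come from the summary; the expert split is unspecified.

total_params = 1.6e12   # reported total parameters
active_params = 49e9    # reported parameters activated per token

# Fraction of the network that fires on any given token: the key reason
# MoE inference is far cheaper than a dense model of the same total size.
active_fraction = active_params / total_params
print(f"Active per token: {active_fraction:.2%}")  # ~3.06%
```

In other words, each token touches roughly 3% of the weights, which is how a 1.6T-parameter model can serve at the cost profile of a much smaller dense one.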

Where can DeepSeek V4 be accessed?

DeepSeek V4 is now live on Hugging Face, with inference support from Novita. The release includes a preview of Huawei Ascend tuning, with the full launch still pending. This broadens open-source AI accessibility amid ongoing hardware restrictions.
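
For readers who want to try the weights, here is a minimal loading sketch using the Hugging Face `transformers` library. The repo id `deepseek-ai/DeepSeek-V4` is a hypothetical placeholder, since the exact model card name is not given here; check Hugging Face for the real identifier and hardware requirements.

```python
# Minimal sketch of loading the model from Hugging Face with `transformers`.
# The repo id below is a hypothetical placeholder, NOT a confirmed name.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "deepseek-ai/DeepSeek-V4"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",      # let transformers pick the checkpoint dtype
    device_map="auto",       # shard across available GPUs
    trust_remote_code=True,  # DeepSeek releases often ship custom model code
)

prompt = "Explain mixture-of-experts routing in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```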

What efficiency improvements does DeepSeek V4 offer?

DeepSeek V4 introduces training-efficiency optimizations and an efficient million-token context, cutting KV cache usage by 90%. It runs on both Nvidia and Huawei hardware despite export restrictions. These advances let it challenge proprietary models on both performance and cost.
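
To see why a 90% KV-cache cut matters at a million-token context, here is a rough memory estimate. The layer, head, and dimension counts below are illustrative assumptions, not published V4 specs; only the 1M-token context and the 90% figure come from the text above.

```python
# Sketch: KV-cache memory at a 1M-token context, before and after a 90% cut.
# Layer/head/dim values are illustrative assumptions, NOT DeepSeek V4 specs.

def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, dtype_bytes=2):
    # Per token, each layer stores one key and one value vector per KV head.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

baseline = kv_cache_bytes(
    seq_len=1_000_000,  # the reported 1M-token context
    n_layers=60,        # assumed
    n_kv_heads=8,       # assumed (grouped-query style)
    head_dim=128,       # assumed
)
reduced = baseline * (1 - 0.90)  # the reported 90% reduction

gib = 1024 ** 3
print(f"baseline KV cache: {baseline / gib:.1f} GiB")  # ~228.9 GiB
print(f"after 90% cut:     {reduced / gib:.1f} GiB")   # ~22.9 GiB
```

Under these assumptions, the cache for a single million-token sequence drops from hundreds of GiB to a size that fits on one accelerator, which is what makes long-context serving economical.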

Summary

DeepSeek open-sourced the V4 series: 1.6T parameters (49B active), Claude-beating benchmarks, a 90% KV cache reduction, and training-efficiency improvements; it debuts a Huawei Ascend tuning preview, with the full launch pending. Following R1's 'Sputnik moment', V4 continues low-cost scaling on Nvidia/Huawei hardware amid export bans, strengthening the open-source challenge to proprietary models.

Updated Apr 27, 2026