DeepSeek V3 is Live!

New DeepSeek V3 Update Shocks The World!

The TLDR
DeepSeek releases V3-0324, a 685-billion parameter open-source model running efficiently at 20 tokens per second on Mac hardware. The MIT-licensed technology approaches Claude 3.7 Sonnet's performance while making cutting-edge AI more accessible.

DeepSeek has once again surprised the AI world! The Chinese startup has quietly released its latest model, DeepSeek-V3-0324, on Hugging Face – and the technical data is impressive.

With 685 billion parameters, the model is positioned as the second-best non-reasoning model in the world, just behind Claude 3.7 Sonnet. Particularly notable: DeepSeek-V3-0324 runs at an impressive 20 tokens per second on a Mac Studio with an M3 Ultra chip – a breakthrough for local AI applications without server infrastructure.

The mixture-of-experts architecture activates only about 37 billion of the 685 billion parameters per token, which explains the remarkable efficiency. Observers suspect that this model forms the basis for the upcoming DeepSeek-R2, a reasoning model expected in April.

With its MIT license, DeepSeek-V3-0324 is freely available for commercial use – in direct contrast to the closed strategy of many Western AI companies. This development could further narrow the gap between Chinese and American AI technology and is a striking example of how the open-source philosophy is reshaping the AI landscape. DeepSeek's approach democratizes access to cutting-edge technology and could fundamentally change how we think about AI development.
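To see why activating only a fraction of the parameters saves so much compute, here is a toy sketch of mixture-of-experts routing. This is a simplified illustration, not DeepSeek's actual implementation: a gate scores each expert for the current input, and only the top-k experts run, with their outputs blended by softmax weights. All names (`moe_forward`, `gate_w`, `experts`) are illustrative.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route input x to the top-k experts; only those experts execute."""
    scores = x @ gate_w                       # one gating score per expert
    top = np.argsort(scores)[-k:]             # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                  # softmax over the selected experts
    # Only the chosen experts run; the remaining experts stay idle,
    # which is where the compute savings come from.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, n_experts))
# Each "expert" is just a small linear map in this sketch.
experts = [lambda v, W=rng.standard_normal((d, d)): W @ v
           for _ in range(n_experts)]

y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # (8,)
```

With k=2 of 4 experts active, roughly half the expert parameters are touched per input; scaled up, this is how a 685B-parameter model can activate only ~37B parameters per token.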

Question of the Day

Is Yann LeCun right that scaling alone is not enough to achieve AGI?


Chart of the Day

In The News

Reve Image Beats Midjourney, Google

Reve Image dominates the Artificial Analysis Image Arena, outperforming Midjourney, Imagen 3, and FLUX. The model excels in text rendering and prompt adherence.

Alibaba's New 3D Avatar Technology

Alibaba releases TaoAvatar on Hugging Face, creating lifelike full-body talking avatars for augmented reality. The system uses 3D Gaussian Splatting for real-time animation.

Qwen Releases Balanced 32B Vision-Language Model

Qwen releases a 32B vision-language model optimized with reinforcement learning for improved reasoning. The model provides a balance between the lightweight 7B and resource-intensive 72B options.

Quote of the Day

Hi All,

As you probably noticed, we’ve rebranded to Superintelligence! We have brought in a new Editor-in-Chief to bring you even more in-depth analysis on all things AI & the future. We are also adding a Chart of the Day, Quote of the Day, and Question of the Day to make your reading experience more fun & interactive. Please feel free to email us with any feedback you have!

Cheers,
Dan