"The total parameter count is around 1 trillion with 37 billion active per token in a mixture of experts architecture. The context window is reportedly 1 million tokens and it is natively multimodal handling text, image, and video generation in a single model."
Maya
Co-host, Asia Tech Macro
DeepSeek V4