DeepSeek V4 Flash
12 benchmarks
Benchmark                 Rank          Score
AIME                      #19 of 108    94.8%
LiveCodeBench             #4 of 51      91.6%
GPQA Diamond              #21 of 101    88.1%
MMLU Pro                  #11 of 53     86.2%
SWE-bench Verified        #13 of 79     79.0%
MRCR v2                   #5 of 13      78.7%
SWE-bench Multilingual    #8 of 15      73.3%
BrowseComp                #20 of 33     73.2%
Terminal-Bench 2.0        #21 of 28     56.9%
SWE-bench Pro             #17 of 27     52.6%
Humanity's Last Exam      #29 of 50     34.8%
SimpleQA Verified         #3 of 3       34.1%