Head to head: Claude Opus 4.6 vs DeepSeek V4 Pro

11 shared benchmarks: Claude Opus 4.6 wins 7, ties 0, DeepSeek V4 Pro wins 4
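
| Benchmark | Claude Opus 4.6 | DeepSeek V4 Pro |
|---|---|---|
| AIME | 100.0% | 95.2% |
| BrowseComp | 84.0% | 83.4% |
| GPQA Diamond | 91.3% | 90.1% |
| Humanity's Last Exam | 40.0% | 37.7% |
| LiveCodeBench | 76.0% | 93.5% |
| MMLU Pro | 82.0% | 87.5% |
| MRCR v2 | 93.0% | 83.5% |
| SWE-bench Multilingual | 77.8% | 76.2% |
| SWE-bench Pro | 53.4% | 55.4% |
| SWE-bench Verified | 81.42% | 80.6% |
| Terminal-Bench 2.0 | 65.4% | 67.9% |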
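
The win/tie tally for the 11 shared benchmarks can be reproduced from the per-benchmark scores. The following is a minimal sketch, assuming that the first score listed for each benchmark belongs to Claude Opus 4.6 and the second to DeepSeek V4 Pro, and that a higher score is better on every benchmark. The names `scores` and `tally` are illustrative and not part of the page.

```python
# Scores in percent, transcribed from the comparison as
# (Claude Opus 4.6, DeepSeek V4 Pro). The column attribution is an
# assumption based on the order in which the scores are listed.
scores = {
    "AIME": (100.0, 95.2),
    "BrowseComp": (84.0, 83.4),
    "GPQA Diamond": (91.3, 90.1),
    "Humanity's Last Exam": (40.0, 37.7),
    "LiveCodeBench": (76.0, 93.5),
    "MMLU Pro": (82.0, 87.5),
    "MRCR v2": (93.0, 83.5),
    "SWE-bench Multilingual": (77.8, 76.2),
    "SWE-bench Pro": (53.4, 55.4),
    "SWE-bench Verified": (81.42, 80.6),
    "Terminal-Bench 2.0": (65.4, 67.9),
}


def tally(scores):
    """Return (wins for first model, ties, wins for second model)."""
    wins_first = sum(1 for a, b in scores.values() if a > b)
    wins_second = sum(1 for a, b in scores.values() if b > a)
    ties = len(scores) - wins_first - wins_second
    return wins_first, ties, wins_second


print(tally(scores))  # (7, 0, 4)
```

A plain win count treats a 0.6-point gap (BrowseComp) the same as a 17.5-point gap (LiveCodeBench), so it is best read alongside the individual scores rather than in place of them.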