benchmark.space
Head to head: Claude Opus 4.6 vs Gemma 4 31B

6 shared benchmarks: Claude Opus 4.6 4 wins, 0 ties, Gemma 4 31B 2 wins

Benchmark               Claude Opus 4.6   Gemma 4 31B
AIME                    100.0%            89.2%
GPQA Diamond            91.3%             84.3%
Humanity's Last Exam    53.1%             19.5%
LiveCodeBench           76.0%             80.0%
MMLU Pro                82.0%             85.2%
MRCR v2                 93.0%             66.4%
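
The win/tie tally can be checked directly from the listed scores. The sketch below is a minimal illustration, not the site's own method: it assumes the first score on each benchmark belongs to Claude Opus 4.6 and the second to Gemma 4 31B, and that a higher percentage is better on every benchmark. The names SCORES, tally, and margins are hypothetical helpers introduced here.

```python
# Scores (percent) copied from the comparison: (Claude Opus 4.6, Gemma 4 31B).
# Assumption: higher is better on every benchmark listed.
SCORES = {
    "AIME": (100.0, 89.2),
    "GPQA Diamond": (91.3, 84.3),
    "Humanity's Last Exam": (53.1, 19.5),
    "LiveCodeBench": (76.0, 80.0),
    "MMLU Pro": (82.0, 85.2),
    "MRCR v2": (93.0, 66.4),
}


def tally(scores):
    """Return (left_wins, ties, right_wins) across all shared benchmarks."""
    left = sum(1 for a, b in scores.values() if a > b)
    ties = sum(1 for a, b in scores.values() if a == b)
    right = sum(1 for a, b in scores.values() if a < b)
    return left, ties, right


def margins(scores):
    """Return the left-minus-right gap in percentage points per benchmark."""
    return {name: round(a - b, 1) for name, (a, b) in scores.items()}


if __name__ == "__main__":
    print(tally(SCORES))
    for name, gap in margins(SCORES).items():
        print(f"{name}: {gap:+.1f} pts")
```

Under these assumptions the tally comes out to 4 wins, 0 ties, and 2 wins, matching the summary, and the per-benchmark margins show that the largest gap is on Humanity's Last Exam.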