benchmark.space
Head to head: Qwen3.5 27B vs Grok 4

5 shared benchmarks
Qwen3.5 27B: 2 wins, ties: 0, Grok 4: 3 wins

Benchmark               Qwen3.5 27B   Grok 4
AIME                    90.83%        94.0%
GPQA Diamond            85.5%         88.0%
Humanity's Last Exam    24.3%         24.0%
MMLU Pro                86.1%         87.0%
SWE-bench Verified      72.4%         58.6%
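
The win and tie counts can be reproduced from the scores listed above. The following is a minimal sketch, not the site's own method: it assumes the higher percentage wins each benchmark and that equal scores count as a tie. The `scores` dictionary and the `tally` function are hypothetical names introduced for illustration.

```python
# Scores copied from the comparison above as (Qwen3.5 27B, Grok 4), in percent.
scores = {
    "AIME": (90.83, 94.0),
    "GPQA Diamond": (85.5, 88.0),
    "Humanity's Last Exam": (24.3, 24.0),
    "MMLU Pro": (86.1, 87.0),
    "SWE-bench Verified": (72.4, 58.6),
}


def tally(scores):
    """Return (left wins, ties, right wins).

    Assumption: the higher score wins a benchmark; equal scores tie.
    """
    left = ties = right = 0
    for left_score, right_score in scores.values():
        if left_score > right_score:
            left += 1
        elif right_score > left_score:
            right += 1
        else:
            ties += 1
    return left, ties, right


qwen_wins, tie_count, grok_wins = tally(scores)
print(f"Qwen3.5 27B: {qwen_wins} wins, ties: {tie_count}, Grok 4: {grok_wins} wins")
```

Under that assumption, Qwen3.5 27B leads on Humanity's Last Exam and SWE-bench Verified, and Grok 4 leads on AIME, GPQA Diamond, and MMLU Pro.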