Grok 4
6 benchmarks
Benchmark             Rank        Score
AIME                  #21 of 39   94.0%
GPQA Diamond          #15 of 49   88.0%
MMLU Pro              #7 of 29    87.0%
Aider Polyglot        #4 of 7     79.6%
SWE-bench Verified    #39 of 40   58.6%
Humanity's Last Exam  #21 of 24   24.0%