benchmark.space
Anthropic
Claude Sonnet 4.6
7 benchmarks
Benchmark            Rank        Score
AIME                 #20 of 39   94.0%
GPQA Diamond         #11 of 49   89.9%
SWE-bench Verified   #8 of 40    79.6%
MMLU Pro             #21 of 29   79.2%
BrowseComp           #11 of 17   74.0%
OSWorld              #6 of 16    72.5%
Terminal-Bench 2.0   #9 of 14    59.1%