benchmark.space
OpenAI
GPT-5.4 Mini
4 benchmarks
Benchmark             Rank         Score
GPQA Diamond          #16 of 49    88.0%
OSWorld               #7 of 16     72.1%
Terminal-Bench 2.0    #8 of 14     60.0%
SWE-bench Pro         #6 of 13     54.4%