benchmark.space
Head to head: GPT-5.3 Codex vs GPT-5.4 Mini
4 shared benchmarks
GPT-5.3 Codex: 3 wins, 0 ties, GPT-5.4 Mini: 1 win
Benchmark            GPT-5.3 Codex   GPT-5.4 Mini
GPQA Diamond         91.5%           88.0%
OSWorld              64.0%           72.1%
SWE-bench Pro        56.8%           54.4%
Terminal-Bench 2.0   77.3%           60.0%
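
The win and tie counts can be rechecked from the four score pairs. The following is a minimal sketch, not the site's own method: it assumes the first figure in each pair belongs to GPT-5.3 Codex and the second to GPT-5.4 Mini, and that a higher score is better on every benchmark. The names `SCORES` and `tally` are hypothetical and chosen for illustration.

```python
# Scores copied from the comparison above, in percent.
# Assumption: each pair is ordered (GPT-5.3 Codex, GPT-5.4 Mini).
SCORES = {
    "GPQA Diamond": (91.5, 88.0),
    "OSWorld": (64.0, 72.1),
    "SWE-bench Pro": (56.8, 54.4),
    "Terminal-Bench 2.0": (77.3, 60.0),
}


def tally(scores):
    """Return (first_model_wins, ties, second_model_wins).

    Assumption: a higher score is better on every benchmark.
    """
    first_wins = second_wins = ties = 0
    for first, second in scores.values():
        if first > second:
            first_wins += 1
        elif second > first:
            second_wins += 1
        else:
            ties += 1
    return first_wins, ties, second_wins


if __name__ == "__main__":
    codex_wins, ties, mini_wins = tally(SCORES)
    print(f"GPT-5.3 Codex wins: {codex_wins}")
    print(f"Ties: {ties}")
    print(f"GPT-5.4 Mini wins: {mini_wins}")
```

Under those assumptions the result agrees with the summary line: GPT-5.3 Codex leads on GPQA Diamond, SWE-bench Pro, and Terminal-Bench 2.0, while GPT-5.4 Mini leads on OSWorld.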