benchmark.space
Online-Mind2Web leaderboard
1 model tested · Updated 2026-03-05 · Verified sources only
GPT-5.4 leads at 92.8%
1. GPT-5.4 · OpenAI · Blog/OpenAI · 2026-03-05 · 92.8%
Screenshot-based observations only. 21.9 percentage points ahead of ChatGPT Atlas Agent Mode at 70.9%.