benchmark.space
GPT-O3-CUA
1 benchmark
OSWorld: 38.1% (#25 of 27)