AndroidWorld leaderboard
3 models tested · Updated 2025-11-25 · Verified sources only
Surfer 2 leads at 87.1%
| Rank | Model | Organization | Source | Date | Score | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Surfer 2 | H Company | Blog/H Company | 2025-11-25 | 87.1% | SOTA on mobile agent tasks, surpassing human baseline of 80%. |
| 2 | GLM-5V-Turbo | Zhipu AI | X/@VaibhavSisinty + Z.ai launch | 2026-04-01 | 75.7% | First vision-language model to lead AndroidWorld. Beats Claude Opus 4.6 (62.0) by 13.7 points on Android GUI tasks. |
| 3 | Qwen3.5 27B | Alibaba | HuggingFace/Qwen | 2026-02-16 | 64.2% | Mobile agent benchmark. Competitive with specialized agent frameworks. |