VisualWebArena
5 models tested · Updated 2026-04-09 · Verified sources only
Gemini 3 Pro leads at 49.0%
1
Google DeepMind · arxiv/2604.07776 · 2026-04-09
Gemini 3 Pro. Top score on VisualWebArena. Tested as web agent.
49.0%
2
Google DeepMind · arxiv/2604.07776 · 2026-04-09
Tested as web agent. Second-best on VisualWebArena (VWA), after Gemini 3 Pro.
47.9%
3
Qwen · arxiv/2604.07776 · 2026-04-09
Tested as web agent in structured distillation paper.
37.4%
4
Ai2 · arxiv/2604.07776 · 2026-04-09
9B open-weight model trained via structured distillation from Gemini 3 Pro. First strong open-weight VWA result.
33.9%
5
Ai2 · arxiv/2604.07776 · 2026-04-09
4B model trained via structured distillation. Strong generalization to visual web tasks.
30.1%