WebVoyager
4 models tested · Updated 2026-03-26 · Verified sources only
Alumnium MCP + Claude Code leads at 98.5%
1
Alumnium · Blog/Alumnium · 2026-03-26
New WebVoyager SOTA. MCP server + Claude Code general-purpose agent. About $5 total for 610 tasks (under $0.01 per task). Reproducible code available.
98.5%
2
H Company · Blog/H Company · 2025-10-17
Cross-platform computer-use agent. SOTA on WebVoyager at time of release. Separates planning from execution with orchestrator + sub-agents.
97.1%
3
Zhipu AI · X/@VaibhavSisinty + Z.ai launch · 2026-04-01
Vision-language model with native multimodal input. Narrowly ahead of Claude Opus 4.6 (88.0%) on browser navigation tasks.
88.5%
4
Allen AI · Blog/Allen AI · 2026-03-24
Open-weight 8B browser agent. Reaches 94.7% with test-time scaling (pass@4). Outperforms Fara-7B and GPT-4o-based agents.
78.2%
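The pass@4 figure in entry 4 can be read as an "any of k attempts" success rate. The sketch below is a minimal illustration under that assumption: a task counts as solved if at least one of its first k attempts succeeds. The function name `pass_at_k` and the sample outcomes are hypothetical, and the source does not state how Allen AI computed its number, so treat this as one plausible reading rather than the published method.

```python
from typing import Sequence


def pass_at_k(attempts_per_task: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of tasks solved by at least one of the first k attempts."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not attempts_per_task:
        raise ValueError("need at least one task")
    # A task is solved if any of its first k attempts succeeded.
    solved = sum(1 for attempts in attempts_per_task if any(attempts[:k]))
    return solved / len(attempts_per_task)


# Hypothetical outcomes for three tasks, four attempts each.
# These are illustrative values, not benchmark data.
results = [
    [True, False, True, True],
    [False, False, False, True],
    [False, False, False, False],
]

pass_at_1 = pass_at_k(results, 1)  # only the first attempt counts
pass_at_4 = pass_at_k(results, 4)  # any of four attempts counts
print(f"pass@1 = {pass_at_1:.3f}, pass@4 = {pass_at_4:.3f}")
```

Under this reading, a score with k > 1 can never be lower than the single-attempt score on the same tasks, which is why such figures are best compared only against other results reported at the same k.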