BrowseComp
17 models tested · Updated 2026-03-05 · Verified sources only
GPT-5.4 Pro leads at 89.3%
1
OpenAI · Blog/OpenAI · 2026-03-05
Highest BrowseComp score reported. Pro variant with maximum reasoning.
89.3%
2
Anthropic · Blog/Anthropic · 2026-04-07
Reported as SOTA at release. Uses 4.9x fewer tokens than Opus 4.6 while scoring higher.
86.9%
3
Google · Blog/Google · 2026-02-19
Up from 59.2% on Gemini 3 Pro. Strong autonomous web research capability.
85.9%
4
Multi-agent web search coordination across hours-long sessions. 86.8% with harness.
84.0%
5
OpenAI · Blog/OpenAI · 2026-03-05
Up from 65.8% on GPT-5.2. Native computer-use model with 1M context.
82.7%
6
Alibaba · HuggingFace/Qwen · 2026-02-16
With aggressive context-folding strategy. Highest score among the open-weight models listed here.
78.6%
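Several entries above report large gains from context folding or context management. As a rough illustration only, the following is a minimal sketch of what such a loop might look like: when a browsing agent's accumulated page history exceeds a token budget, the oldest entries are collapsed into a summary stub. All names and numbers here (fold_context, the 1000-token budget, the character-based token estimate) are illustrative assumptions, not the strategy any listed model actually uses.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token.
    return max(1, len(text) // 4)

def fold_context(history: list[str], budget: int) -> list[str]:
    """Collapse the oldest half of `history` into a single summary line
    whenever the total estimated token count exceeds `budget`."""
    while sum(estimate_tokens(h) for h in history) > budget and len(history) > 1:
        half = max(1, len(history) // 2)
        folded, history = history[:half], history[half:]
        # Stand-in for an LLM-generated summary of the folded pages.
        history.insert(0, f"[folded {len(folded)} earlier steps]")
    return history

pages = [f"page {i}: " + "x" * 400 for i in range(20)]
kept = fold_context(pages, budget=1000)
assert sum(estimate_tokens(h) for h in kept) <= 1000
```

In a real agent the summary stub would be produced by the model itself, trading recall of early steps for a context window that stays small over hours-long sessions.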
7
Moonshot AI · Blog/Kimi · 2026-01-27
With Agent Swarm. Base score 60.6%; 74.9% with context management. Competitive with GPT-5.4 (82.7%).
78.4%
8
Moonshot AI · Paper/Moonshot AI · 2026-02-01
Multi-agent swarm configuration. 72.7% on WideSearch.
78.4%
9
MiniMax · Blog/MiniMax · 2026-02-12
Strong browsing agent performance. Competitive with Claude Opus 4.6 (84.0%).
76.3%
10
Zhipu AI · Paper/Zhipu AI (arxiv:2602.15763) · 2026-02-11
With context management strategy. Baseline without CM: 62.0%. Highest open-model BrowseComp score reported at the time.
75.9%
11
Anthropic · Blog/Anthropic · 2026-02-17
Single-agent configuration. Corrected after a cheating-detection pipeline update (74.72% adjusted to 74.01%). Multi-agent variant scores 82.07%.
74.0%
12
StepFun · Blog/StepFun · 2026-02-12
Score with the Context Manager agent framework; base score 51.6%. Strong browsing for an open model.
69.0%
13
Zhipu AI · Blog/Z.AI · 2026-04-07
79.3% with the context-management variant, the top reported open-model BrowseComp result; 68.0% without.
68.0%
14
Alibaba · HuggingFace/Qwen · 2026-02-16
Small model (27B parameters) with competitive browsing ability; below frontier but strong for an open-weight model.
61.0%
15
Moonshot AI · Paper/Moonshot AI · 2026-02-01
Single-agent mode. Agent Swarm achieves 78.4%.
60.6%
16
Sarvam AI · HuggingFace/sarvamai · 2026-03-06
India's first domestically trained 105B model. MoE with 10.3B active parameters. Apache 2.0 license.
49.5%
17
Zhipu AI · HuggingFace/Zhipu · 2026-01-15
Browsing capability score for the lightweight flash variant.
42.8%