ARC-AGI-3 leaderboard
4 models tested · Updated 2026-03-24 · Verified sources only
Gemini 3.1 Pro Preview leads at 0.37%.
1. Gemini 3.1 Pro Preview · 0.37%
Google · ARC Prize/arcprize.org · 2026-03-24
Highest score on the new interactive reasoning benchmark. Humans score 100%. ARC-AGI-3 uses turn-based game environments and gives players no instructions.
2. GPT-5.4 · 0.26%
OpenAI · ARC Prize/arcprize.org · 2026-03-24
Second-highest on ARC-AGI-3. All frontier models score below 1% on this new interactive reasoning benchmark.
3. Claude Opus 4.6 · 0.25%
Anthropic · ARC Prize/arcprize.org · 2026-03-24
Third on ARC-AGI-3. The gap between humans (100%) and AI (best 0.37%) on interactive reasoning remains enormous.
4. Grok 4.20 · 0.0%
xAI · ARC Prize Foundation · 2026-03-25
Only frontier model to score exactly 0% on ARC-AGI-3 at launch. It exceeded the action cutoff on every level. Humans score 100%.