ARC-AGI-3
4 models tested · Updated 2026-03-24 · Verified sources only
Gemini 3.1 Pro Preview leads at 0.37%
1
Google · ARC Prize/arcprize.org · 2026-03-24
Highest score on the new interactive reasoning benchmark. Humans score 100%. ARC-AGI-3 uses turn-based game environments with no instructions.
0.37%
2
OpenAI · ARC Prize/arcprize.org · 2026-03-24
Second-highest on ARC-AGI-3. All frontier models score below 1% on this new interactive reasoning benchmark.
0.26%
3
Anthropic · ARC Prize/arcprize.org · 2026-03-24
Third on ARC-AGI-3. The gap between humans (100%) and AI (best 0.37%) on interactive reasoning remains enormous.
0.25%
4
xAI · ARC Prize Foundation · 2026-03-25
Only frontier model to score exactly 0% on ARC-AGI-3 at launch. Exceeded the action cutoff on every level. Humans score 100%.
0.00%
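
The ranking and the human-to-AI gap quoted above follow from simple arithmetic on the four listed scores. A minimal sketch, assuming scores are keyed by vendor (the entries for ranks 2 to 4 do not name a model here) and that the gap is measured in percentage points; the names `scores`, `rank`, and `gap_to_human` are illustrative only:

```python
# Scores in percent, copied from the entries above and keyed by vendor.
# Assumption: the human baseline is the 100% stated in the entries.
HUMAN_BASELINE = 100.0
scores = {"Google": 0.37, "OpenAI": 0.26, "Anthropic": 0.25, "xAI": 0.0}


def rank(scores):
    """Return (vendor, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def gap_to_human(score, baseline=HUMAN_BASELINE):
    """Return the gap to the human baseline in percentage points."""
    return round(baseline - score, 2)


ranking = rank(scores)
best_vendor, best_score = ranking[0]

for position, (vendor, score) in enumerate(ranking, start=1):
    print(f"{position}. {vendor}: {score:.2f}% (gap {gap_to_human(score):.2f} pts)")

# True when every listed model is below the 1% mark mentioned above.
all_below_one_percent = all(score < 1.0 for score in scores.values())
```

On these figures the best listed score leaves a gap of 99.63 percentage points to the human baseline, and every listed model sits below 1%.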