FrontierMath Leaderboard 2026 — Results Across 11 Real AI Models

New FrontierMath record on Tiers 1-3 (undergrad to postdoc math). Also scored 38% on Tier 4 (research-grade) and solved 2 previously unsolved Tier 4 problems.

50.0%

GPT-5.4

OpenAI · OpenAI Blog · 2026-04-23

Tiers 1-3. 4 points below GPT-5.5.

47.6%

Claude Opus 4.7

Anthropic · Blog/OpenAI · 2026-04-16

Tested by OpenAI. Strong math performance, but below GPT-5.5 and DeepSeek V4 Pro.

43.8%

Muse Spark

Meta · X/@EpochAIResearch · 2026-04-08

Independent evaluation by Epoch AI on Tiers 1-3 (undergrad to early postdoc). Behind GPT-5.4 Pro (50%) but competitive with other frontier models.

39.0%

Gemini 3.1 Pro

Google · X/@richa_lq · 2026-04-18

Per @richa_lq (26 likes, 2.3k impressions). OpenAI leads FrontierMath at 50%+ but has access to the full dataset.

36.9%