benchmark.space
MATH-Vision leaderboard
9 models tested · Updated 2026-04-02 · Verified sources only
Gemma 4 31B leads at 85.6%
1
Gemma 4 31B
Google · HuggingFace/Google DeepMind · 2026-04-02
Multimodal math reasoning from images.
85.6%
2
Gemma 4 26B A4B
Google · HuggingFace/Google DeepMind · 2026-04-02
Mixture-of-experts (MoE) model: 25.2B total parameters, 3.8B active.
82.4%
3
STEP3-VL 10B
StepFun · arxiv/2601.09668 · 2026-01-14
10B model scores 75.95% on MATH-Vision, trailing Gemma 4 31B (85.6%) by about 9.7 points at roughly one third of the parameters.
75.95%
4
Vero Q3T-8B
Vero Team · arxiv/2604.04917 · 2026-04-06
Highest MATH-Vision score among all Vero variants. The thinking backbone excels on visual math.
63.5%
5
Gemma 4 E4B
Google · HuggingFace/Google DeepMind · 2026-04-02
4.5B effective parameters.
59.5%
6
Vero Mi-7B
Vero Team · arxiv/2604.04917 · 2026-04-06
+2.6 points over the MiMoVL base model. Surpasses MiMoVL-7B-RL, which uses a proprietary recipe.
59.4%
7
Vero Q3I-8B
Vero Team · arxiv/2604.04917 · 2026-04-06
+5.1 points over its base model. Open RL matches proprietary VLM training pipelines on visual math.
59.0%
8
OpenVLThinkerV2 8B
Research · arxiv/2604.08539 · 2026-04-09
8B model trained with G2RPO. The source describes it as a new open-source SOTA on MATH-Vision that beats larger models, although at 53.4% it ranks 8th of the 9 models listed here.
53.4%
9
Gemma 4 E2B
Google · HuggingFace/Google DeepMind · 2026-04-02
2.3B effective parameters.
52.4%
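
The ranking and the point gaps quoted in the notes can be rechecked with a few lines of Python. This is only a minimal sketch: the scores are copied from the entries listed here, while the `Entry` class and the `rank` helper are hypothetical names introduced for illustration and are not part of the site or of any cited paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical structure for illustration)."""
    model: str
    score: float  # MATH-Vision accuracy in percent, as listed on this page


# Scores copied from the leaderboard entries, in the order they appear.
ENTRIES = [
    Entry("Gemma 4 31B", 85.6),
    Entry("Gemma 4 26B A4B", 82.4),
    Entry("STEP3-VL 10B", 75.95),
    Entry("Vero Q3T-8B", 63.5),
    Entry("Gemma 4 E4B", 59.5),
    Entry("Vero Mi-7B", 59.4),
    Entry("Vero Q3I-8B", 59.0),
    Entry("OpenVLThinkerV2 8B", 53.4),
    Entry("Gemma 4 E2B", 52.4),
]


def rank(entries):
    """Sort by score (descending) and attach each model's gap to the leader."""
    ordered = sorted(entries, key=lambda e: e.score, reverse=True)
    leader_score = ordered[0].score
    return [
        (position, e.model, e.score, round(leader_score - e.score, 2))
        for position, e in enumerate(ordered, start=1)
    ]


for position, model, score, gap in rank(ENTRIES):
    print(f"{position}. {model}: {score}% (gap to leader: {gap} points)")
```

Sorting by score reproduces the listed order, and the gap column makes comparisons such as the distance between STEP3-VL 10B and Gemma 4 31B explicit rather than leaving them to a qualitative label.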