MathVista
3 models tested · Updated 2026-04-09 · Verified sources only
OpenVLThinkerV2 8B leads at 79.5%
1
UCLA · arxiv/2604.08539 · 2026-04-09
Open-weight 8B VLM that surpasses GPT-4o (63.8%) by 15.7 points on MathVista, indicating strong visual math reasoning.
79.5%
2
Vero Team · arxiv/2604.04917 · 2026-04-06
Slightly below the thinking base model (81.4%) but more consistent across all 30 benchmarks.
79.2%
3
Vero Team · arxiv/2604.04917 · 2026-04-06
+1.5 points over the base model. Task-routed rewards give consistent improvement across visual reasoning tasks.
78.7%
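The point margins quoted in these entries can be rechecked with a few lines of arithmetic. The following is a minimal sketch, not part of the leaderboard itself: the scores are copied from the entries above, the keys "rank 2" and "rank 3" are placeholders because those entries list no model name, and the helper `point_gap` is a hypothetical name introduced for illustration.

```python
# Scores copied from the leaderboard entries above (MathVista accuracy, %).
# "rank 2" and "rank 3" are placeholder keys: those entries list no model name.
scores = {
    "OpenVLThinkerV2 8B": 79.5,
    "rank 2": 79.2,
    "rank 3": 78.7,
}
GPT4O_REFERENCE = 63.8  # GPT-4o score quoted in the rank 1 entry


def point_gap(a: float, b: float) -> float:
    """Difference in percentage points, rounded to one decimal to hide float noise."""
    return round(a - b, 1)


# Highest-scoring entry on the board.
leader = max(scores, key=scores.get)

# Margin of the leader over the quoted GPT-4o score.
margin_over_gpt4o = point_gap(scores[leader], GPT4O_REFERENCE)

# How far each entry trails the leader, in percentage points.
gaps_to_leader = {name: point_gap(scores[leader], s) for name, s in scores.items()}

print(f"{leader} leads by {margin_over_gpt4o} points over GPT-4o")
for name, gap in gaps_to_leader.items():
    print(f"{name}: {gap} points behind the leader")
```

Rounding to one decimal matches the precision of the published scores. The top three entries sit within one point of each other, while the gap to the quoted GPT-4o score is far larger.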