MMMU Pro
15 models tested · Updated 2026-04-08 · Verified sources only
GPT-5.4 leads at 81.2%
1. OpenAI (GPT-5.4) · 81.2% · Blog/OpenAI · 2026-03-05. Visual understanding and reasoning. Without tool use, reasoning effort xhigh.
2. Google · 81.0% · Blog/Google · 2025-12-17. Multimodal reasoning at Flash-tier cost. Competitive with Gemini 3 Pro.
3. Google · 80.5% · Official/Google DeepMind · 2026-02-19. Multimodal understanding without tools. Strong visual reasoning.
4. Meta · 80.5% · Blog/Meta AI · 2026-04-08. Tied for third on this list at 80.5%. Strong multimodal showing for Meta Superintelligence Labs' debut model.
5. Alibaba · 79.0% · HuggingFace/Qwen · 2026-02-16. Native vision-language model. Strong multimodal reasoning.
6. Moonshot AI · 78.5% · Blog/Kimi · 2026-01-27. Strong multimodal reasoning for an open-weight model.
7. Google · 76.9% · Model Card/Google · 2026-04-02. Multimodal vision benchmark; up from 49.7% for Gemma 3 27B.
8. Google · 76.8% · Model Card/Google · 2026-03-03. Budget tier ($0.25/1M input tokens) yet competitive on multimodal benchmarks.
9. Alibaba · 75.0% · HuggingFace/Qwen · 2026-02-16. Multimodal understanding. Strong for a small model; competitive with some frontier scores from late 2025.
10. Google · 73.8% · Model Card/Google · 2026-04-02. MoE multimodal model; the 31B dense variant reaches 76.9% (rank 7 above).
11. Alibaba · 70.1% · HuggingFace/Qwen · 2026-03-02. 9B params. Outperforms Gemini 2.5 Flash-Lite (59.7%) on visual reasoning. Strong for its size class.
12. Meta · 59.6% · HuggingFace/Meta · 2026-04-05. Official model card score. 17B active params, 128 experts.
13. Google · 52.6% · Model Card/Google · 2026-04-02. Multimodal understanding from a 4B model. Strong for edge deployment.
14. Meta · 52.2% · HuggingFace/Meta · 2026-04-05. Official model card score. 17B active params, 16 experts, 10M context.
15. Google · 44.2% · Model Card/Google · 2026-04-02. Multimodal reasoning at 2.3B active params. Natively multimodal from pretraining.