MedXPertQA MM
2 models tested · Updated 2026-04-02 · Verified sources only
Gemma 4 31B leads at 61.3%
1 · Google · HuggingFace/Google DeepMind · 2026-04-02 · 61.3%
Medical multimodal QA benchmark.
2 · Google · HuggingFace/Google DeepMind · 2026-04-02 · 58.1%
MoE 25.2B total, 3.8B active.