WeMath
2 models tested · Updated 2026-04-14 · Verified sources only
Qwen3.5 27B leads at 84.0%
1
Qwen · arxiv/2604.08644 · 2026-04-14
Score taken from the EXAONE 4.5 technical report. Best among sub-33B models.
84.0%
2
LG AI Research · HuggingFace/LGAI-EXAONE · 2026-04-14
Beats GPT-5 mini (70.3%) by 8.8 points and Qwen3-VL 32B (71.6%) by 7.5 points.
79.1%