HallusionBench
1 model tested · Updated 2026-04-14 · Verified sources only
EXAONE 4.5 33B leads at 63.7%
1
LG AI Research · HuggingFace/LGAI-EXAONE · 2026-04-14
Beats GPT-5 mini (63.2%). Trails Qwen3.5 27B (70.0%).
63.7%