SimpleQA Verified
1 model tested · Updated 2026-04-08 · Verified sources only
Muse Spark leads at 66.0%
1 · Muse Spark · Meta · 66.0% · X/@EpochAIResearch · 2026-04-08
Independent evaluation by Epoch AI. Factual accuracy benchmark.