benchmark.space
SimpleQA Verified leaderboard
SimpleQA Verified
1 model tested · Updated 2026-04-08 · Verified sources only
Muse Spark leads at 66.0%
1 · Muse Spark · Meta · X/@EpochAIResearch · 2026-04-08 · 66.0%
Independent evaluation by Epoch AI. Factual accuracy benchmark.