Head to head: DeepSeek V4 Flash vs GLM-5.1
7 shared benchmarks: DeepSeek V4 Flash 4 wins, 0 ties, GLM-5.1 3 wins
Benchmark               DeepSeek V4 Flash   GLM-5.1
AIME                    94.8%               95.3%
BrowseComp              73.2%               68.0%
GPQA Diamond            88.1%               86.2%
Humanity's Last Exam    34.8%               31.0%
SWE-bench Pro           52.6%               58.4%
SWE-bench Verified      79.0%               77.8%
Terminal-Bench 2.0      56.9%               69.0%
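
The win/tie counts can be re-derived from the per-benchmark scores. The following is a minimal Python sketch, assuming that a higher percentage is better on every benchmark and that the first score in each pair belongs to DeepSeek V4 Flash. The `scores` dictionary and the `tally` helper are illustrative names, not part of the source page.

```python
# Scores transcribed from the comparison, in percent:
# (DeepSeek V4 Flash, GLM-5.1). Assumes higher is better everywhere.
scores = {
    "AIME": (94.8, 95.3),
    "BrowseComp": (73.2, 68.0),
    "GPQA Diamond": (88.1, 86.2),
    "Humanity's Last Exam": (34.8, 31.0),
    "SWE-bench Pro": (52.6, 58.4),
    "SWE-bench Verified": (79.0, 77.8),
    "Terminal-Bench 2.0": (56.9, 69.0),
}


def tally(pairs):
    """Return (first_model_wins, ties, second_model_wins)."""
    first = sum(a > b for a, b in pairs.values())
    ties = sum(a == b for a, b in pairs.values())
    second = sum(b > a for a, b in pairs.values())
    return first, ties, second


print(tally(scores))  # (4, 0, 3)
```

The tally counts wins only and ignores margins: the 12.1-point gap on Terminal-Bench 2.0 counts the same as the 0.5-point gap on AIME.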