Head to head: Claude Opus 4.6 vs DASD-4B-Thinking
3 shared benchmarks
Claude Opus 4.6: 3 wins
Ties: 0
DASD-4B-Thinking: 0 wins

Benchmark        Claude Opus 4.6    DASD-4B-Thinking
AIME             100.0%             83.3%
GPQA Diamond     91.3%              68.4%
LiveCodeBench    76.0%              69.3%
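
The win and tie counts can be reproduced from the per-benchmark scores. The following is a minimal sketch, not the site's own method: it assumes the score listed before each benchmark name belongs to Claude Opus 4.6 and the score after it to DASD-4B-Thinking, and that a higher score is a win. The `scores` dictionary and the `tally` helper are hypothetical names introduced for illustration.

```python
# Scores in percent, as listed on the page: (Claude Opus 4.6, DASD-4B-Thinking).
# Assumption: the first score of each pair is Claude's, the second is DASD's.
scores = {
    "AIME": (100.0, 83.3),
    "GPQA Diamond": (91.3, 68.4),
    "LiveCodeBench": (76.0, 69.3),
}


def tally(pairs):
    """Return (left_wins, ties, right_wins), treating a higher score as a win."""
    left_wins = sum(1 for left, right in pairs.values() if left > right)
    ties = sum(1 for left, right in pairs.values() if left == right)
    right_wins = sum(1 for left, right in pairs.values() if left < right)
    return left_wins, ties, right_wins


left_wins, ties, right_wins = tally(scores)
print(f"{len(scores)} shared benchmarks: {left_wins} wins, {ties} ties, {right_wins} wins")
```

Under these assumptions the tally matches the summary shown on the page: 3 wins, 0 ties, 0 wins.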