benchmark.space
Llama-3.2-3B-Instruct + DiSCTT
2 benchmarks
GPQA Diamond: #89 of 92, 24.3%
AIME: #94 of 103, 16.3%