benchmark.space
GRPO (Qwen2.5-Math-7B)
1 benchmark
AIME: 25.1% (rank #88 of 103)