Qwen3-Max-Thinking
1 benchmark
Humanity's Last Exam
#2 of 24
58.3%