GLM-4.1V-9B-Thinking
1 benchmark
WebVoyager: 66.8% (#16 of 18)