benchmark.space
YouTube · 2026-04-04
"I do remember with the ARC-AGI puzzles trying frontier models getting strange results. It can't see the starting state accurately. It can't accurately define which boxes are colored what colors."
Joseph Nelson
CEO of Roboflow
ARC-AGI 2