AI Explained on Gemini 3.1 Pro

YouTube · 2026-02-20

"You would be understandably slightly confused to see it being better in all sorts of coding benchmarks, measures of scientific reasoning, and academic reasoning, like GPQA Diamond and Humanity's Last Exam respectively, as well as general pattern recognition, ARC-AGI-2. But yet, in a head-to-head on GDPval, it falls seemingly quite far behind Claude Opus 4.6."

AI Explained

AI analysis YouTube channel

GPQA Diamond · Gemini 3.1 Pro
