YouTube · 2026-04-10
"It saturates all existing ways of testing how good a model is at offensive cyber capabilities. That is to say it scores close to 100%, so those tests can not effectively tell how far its capabilities extend anymore."
Rob Wiblin
Host, 80,000 Hours podcast