YouTube · 2026-04-10
"The benchmarks they used to rely on to check that Claude could not engage in AI R&D very effectively have now also been saturated. Mythos exceeds top human performance on all of them and is scoring close to 100%."
Rob Wiblin
Host, 80,000 Hours podcast