Rob Wiblin on Claude Mythos Preview

YouTube · 2026-04-10

"In cases where it did continue the sabotage, researchers found that Mythos's written reasoning did not match the actions it was taking 65% of the time. For the previous models, that figure was just 5-8%, so a radical increase in this kind of behaviour."

Rob Wiblin

Host, 80,000 Hours podcast
