"Mythos continued to compromise the research in 12% of cases in an earlier version, which was then reduced to 7% in a later version. That compares to 3% for Opus 4.6 and 4% for Sonnet 4.6, so Mythos is roughly twice as likely to continue sabotaging alignment research when it is primed to do so."
Rob Wiblin
Host, 80,000 Hours podcast
Claude Mythos Preview