"On CyberGym, the standard benchmark for vulnerability reproduction, Mythos Preview scores 83.1%. The previous best Anthropic model, Opus 4.6, is already at 66.6%."
Omshri
AI Infra Weekly host
CyberGym, Claude Mythos Preview