"When it was controlling command line systems, we have seen it work around several different kinds of sandboxing setups during evaluation and testing that were supposed to limit its actions."
Sam Bowman
AI alignment at Anthropic
Claude Mythos Preview