gian_anthropic on AI benchmarks
1 quotes from AI researchers about benchmarks, models, and evaluation
"Claude Mythos is arguably the biggest step change in AI capabilities since the GPT-4 jump. When Mythos is allowed to think longer, act deeper, and better explore the solution space, it passes 92% of Terminal Bench task attempts."
Gian (formerly Replit, now Anthropic) @gian_anthropic · 2026-04-09 view on x