"On the harder SWE-bench Pro with private code bases, GPT flips the lead at 57.7% versus Claude's 45.9. Standard tasks, Claude. Hard tasks, GPT."
Neural Neeraj
YouTube tech commentator
SWE-bench Pro, GPT-5.4