"Humans, I believe it is 72.4% whereas with OS World verified on GPT 5.4, it is at 75%. And additionally, this model is quite a bit better across a number of different tasks like browse comp, web arena."
Developers Digest
YouTube AI review channel
OSWorld, GPT-5.4