youtube/@latent_space on AI benchmarks
1 quote from AI researchers about benchmarks, models, and evaluation
"We started with some of the very first versions of Codex CLI with the Codex mini model, which was obviously much less capable than the ones we have today. Onward to GPT-5, 5.1, 5.2, 5.3, 5.4 — going through all these model generations and seeing their quirks and different working styles also meant we had to adapt the code base to change things up when the model was revved."
Ryan Lopopolo @youtube/@latent_space · 2026-04-07