"You should build an on-policy harness which is already well within distribution and you modify it from there. But if you build off-policy it is not that useful."
Ryan Lopopolo
Engineer, OpenAI Frontier team