"I just took a Qwen 8 billion model and I tried to just train it on IIT JEE advanced problems. I ran a benchmark with a base model and then ran it with the SFT and then also with SDPO. SFT was still better than base Qwen and SDPO showed a regression in a lot of areas."
Vipul Sehgal
Paper Club Presenter
Qwen