GPT-4.1-mini + PRISM-MCTS
2 benchmarks
GPQA Diamond: 65.08% (#68 of 92)
AIME: 53.33% (#63 of 103)