GPT-4.1-mini
2 benchmarks
AIME: #52 of 74, 48.67%
GPQA Diamond: #67 of 76, 47.55%