Aider Polyglot leaderboard
7 models tested · Updated 2025-08-07 · Verified sources only
GPT-5 leads at 88.0%.
1. GPT-5: 88.0%
OpenAI · aider.chat/leaderboard · 2025-08-07
Top score on 225 Exercism exercises across 6 languages. Uses the high reasoning setting.
2. o3-pro: 84.9%
OpenAI · aider.chat/leaderboard · 2025-06-10
Second-highest score on Aider Polyglot. High reasoning mode.
3. Gemini 2.5 Pro: 83.1%
Google · aider.chat/leaderboard · 2025-03-25
Third on Aider Polyglot, with a 32k thinking budget.
4. Grok 4: 79.6%
xAI · aider.chat/leaderboard · 2026-04-01
Fourth on the Aider Polyglot leaderboard. Strong multi-language code editing.
5. DeepSeek V3.2-Exp (Reasoner): 74.2%
DeepSeek · aider.chat/leaderboard · 2025-12-01
Best open-weight Aider Polyglot score. 22x cheaper than GPT-5 per run. Evaluated on the aider.chat leaderboard's 225 Exercism exercises.
6. Claude Opus 4.6: 72.0%
Anthropic · aider.chat/leaderboard · 2026-02-05
32k thinking budget. The 16-point gap to GPT-5 suggests that agentic coding is a weakness for Anthropic relative to OpenAI.
7. DeepSeek V3.2-Exp (Chat): 70.2%
DeepSeek · aider.chat/leaderboard · 2025-12-01
Chat-mode variant, 4 points below Reasoner mode. Cost-efficient at $0.88/run.
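
The point gaps quoted in the entries above (for example, Chat mode sitting 4 points below Reasoner mode) can be checked with a few lines of Python. This is only a sketch: the scores are copied by hand from this page, and the `scores` dictionary and the `gap` and `ranked` helpers are illustrative names, not part of any aider tooling.

```python
# Scores copied by hand from the leaderboard entries on this page
# (percent of the 225 Exercism exercises solved).
scores = {
    "GPT-5": 88.0,
    "o3-pro": 84.9,
    "Gemini 2.5 Pro": 83.1,
    "Grok 4": 79.6,
    "DeepSeek V3.2-Exp (Reasoner)": 74.2,
    "Claude Opus 4.6": 72.0,
    "DeepSeek V3.2-Exp (Chat)": 70.2,
}


def gap(a, b):
    """Percentage-point difference scores[a] - scores[b], rounded to one decimal."""
    return round(scores[a] - scores[b], 1)


def ranked():
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    leader, top = ranked()[0]
    for rank, (model, score) in enumerate(ranked(), start=1):
        behind = round(top - score, 1)
        print(f"{rank}. {model}: {score:.1f}% ({behind:.1f} pts behind {leader})")
```

Calling `gap("DeepSeek V3.2-Exp (Reasoner)", "DeepSeek V3.2-Exp (Chat)")` reproduces the 4-point difference mentioned in the Chat entry, and `gap("GPT-5", "Claude Opus 4.6")` gives the distance between the leader and Claude Opus 4.6.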