Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
Tracked models: 296 · Providers: 27 · Benchmarked: 253 · Avg. index: 13.4
| Rank | Model | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|
| 81 | MiniMax M1 40K `minimax-m1-40k`; code, programming, tool use | MiniMax | 18.1 (Programming) | 22.6 | 0.0 | 26.8 | 18.1 | 0.0 | N/A |
| 82 | GPT-4.1 `gpt-4.1-2025-04-14`; multimodal, vision, multi-input reasoning | OpenAI | 17.3 (Programming) | 28.7 | 75.9 | 32.8 | 17.3 | 34.6 | $2 in / $8 out |
| 83 | Kimi K2 Instruct `kimi-k2-instruct`; code, programming, tool use | Moonshot AI | 15.3 (Programming) | 24.4 | 46.1 | 14.8 | 15.3 | 62.1 | $0.5 in / $0.5 out |
| 84 | Devstral Small 1.1 `devstral-small-2507`; code, programming, tool use | Mistral AI | 14.7 (Programming) | 0.0 | 64.8 | 0.0 | 14.7 | 85.3 | $0.1 in / $0.3 out |
| 85 | Claude 3.5 Sonnet `claude-3-5-sonnet-20241022`; multimodal, vision, multi-input reasoning | Anthropic | 12.9 (Programming) | 33.7 | 68.2 | 38.7 | 12.9 | 24.6 | $3 in / $15 out |
| 86 | o3-mini `o3-mini`; code, programming, tool use | OpenAI | 12.2 (Programming) | 25.6 | 70.4 | 11.9 | 12.2 | 41.6 | $1.1 in / $4.4 out |
| 87 | Sarvam-105B `sarvam-105b`; code, programming, tool use | Sarvam AI | 12.1 (Programming) | 42.9 | 0.0 | 17.9 | 12.1 | 0.0 | N/A |
| 88 | GPT-5 nano `gpt-5-nano-2025-08-07`; multimodal, vision, multi-input reasoning | OpenAI | 11.8 (Programming) | 26.3 | 0.0 | 0.0 | 11.8 | 0.0 | N/A |
| 89 | DeepSeek-V3 `deepseek-v3`; code, programming, tool use | DeepSeek | 10.4 (Programming) | 27.3 | 58.0 | 0.0 | 10.4 | 60.5 | $0.27 in / $1.1 out |
| 90 | GPT-5.4 nano `gpt-5.4-nano`; multimodal, vision, multi-input reasoning | OpenAI | 10.0 (Programming) | 45.6 | 76.5 | 9.7 | 10.0 | 57.1 | $0.2 in / $1.25 out |
| 91 | o1-preview `o1-preview`; code, programming, tool use | OpenAI | 9.5 (Programming) | 41.8 | 33.0 | 0.0 | 9.5 | 11.8 | $15 in / $60 out |
| 92 | Claude 3.5 Haiku `claude-3-5-haiku-20241022`; code, programming, tool use | Anthropic | 7.8 (Programming) | 10.8 | 30.5 | 3.0 | 7.8 | 31.8 | $0.8 in / $4 out |
| 93 | DeepSeek-R1-0528 `deepseek-r1-0528`; code, programming, tool use | DeepSeek | 6.6 (Programming) | 50.1 | 14.3 | 0.0 | 6.6 | 35.1 | $0.55 in / $2.19 out |
| 94 | o1 `o1-2024-12-17`; multimodal, vision, multi-input reasoning | OpenAI | 6.5 (Programming) | 42.9 | 19.4 | 44.7 | 6.5 | 4.9 | $15 in / $60 out |
| 95 | GPT-4.5 `gpt-4.5`; multimodal, vision, multi-input reasoning | OpenAI | 6.0 (Programming) | 41.9 | 29.7 | 35.8 | 6.0 | 7.0 | $75 in / $150 out |
| 96 | Sarvam-30B `sarvam-30b`; code, programming, tool use | Sarvam AI | 5.2 (Programming) | 46.4 | 0.0 | 8.2 | 5.2 | 0.0 | N/A |
| 97 | Nemotron 3 Nano (30B A3B) `nemotron-3-nano-30b-a3b`; code, programming, tool use | NVIDIA | 4.4 (Programming) | 45.4 | 66.0 | 3.3 | 4.4 | 90.9 | $0.06 in / $0.24 out |
| 98 | GPT-4o `gpt-4o-2024-08-06`; multimodal, vision, multi-input reasoning | OpenAI | 4.3 (Programming) | 31.5 | 46.7 | 14.9 | 4.3 | 26.8 | $2.5 in / $10 out |
| 99 | Gemini 2.5 Flash-Lite `gemini-2.5-flash-lite`; multimodal, vision, multi-input reasoning | Google | 3.5 (Programming) | 21.4 | 32.8 | 0.0 | 3.5 | 64.1 | $0.1 in / $0.4 out |
| 100 | GPT-4.1 mini `gpt-4.1-mini-2025-04-14`; multimodal, vision, multi-input reasoning | OpenAI | 2.6 (Programming) | 20.7 | 90.6 | 8.9 | 2.6 | 56.8 | $0.4 in / $1.6 out |

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
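
For readers who want to work with these figures outside the page, the following is a minimal sketch of how the price cells could be parsed and a few rows re-sorted by output price. The `Price` type, the `parse_price` helper, and the field names are hypothetical and not part of the leaderboard; the listing does not state the unit behind "in" and "out", so the sketch treats them as plain dollar amounts. The sample rows are copied from the table above.

```python
import re
from typing import NamedTuple, Optional


class Price(NamedTuple):
    # Hypothetical field names; the listing does not state the pricing unit.
    input_usd: float
    output_usd: float


# Matches cells such as "$0.27 in / $1.1 out".
_PRICE_RE = re.compile(r"^\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out$")


def parse_price(cell: str) -> Optional[Price]:
    """Return a Price for a listed price cell, or None for 'N/A' or blank cells."""
    match = _PRICE_RE.match(cell.strip())
    if match is None:
        return None
    return Price(float(match.group(1)), float(match.group(2)))


# (model, programming score, price cell), copied from the table.
rows = [
    ("o3-mini", 12.2, "$1.1 in / $4.4 out"),
    ("DeepSeek-V3", 10.4, "$0.27 in / $1.1 out"),
    ("o1-preview", 9.5, "$15 in / $60 out"),
    ("Sarvam-105B", 12.1, "N/A"),
]

priced = [(name, score, parse_price(cell)) for name, score, cell in rows]

# Keep only rows with a listed price, cheapest output price first.
by_output_price = sorted(
    (row for row in priced if row[2] is not None),
    key=lambda row: row[2].output_usd,
)

for name, score, price in by_output_price:
    print(f"{name}: programming {score}, ${price.output_usd} out")
```

Rows without a listed price are dropped rather than treated as free, since "N/A" here appears to mean that no price is published.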