Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
**294** tracked models · **27** providers · **251** benchmarked · **31.8** avg. index
| Rank | Model | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Grok-4.20 Beta Non-Reasoning (`grok-4.20-beta-0309-non-reasoning`) · multimodal, vision, multi-input, reasoning | xAI | 97.2 | 0.0 | 97.2 | 0.0 | 0.0 | 28.1 | $2 in / $6 out |
| 2 | Grok-4.20 Beta Reasoning (`grok-4.20-beta-0309-reasoning`) · multimodal, vision, multi-input, reasoning | xAI | 97.2 | 0.0 | 97.2 | 0.0 | 0.0 | 28.1 | $2 in / $6 out |
| 3 | Gemini 2.0 Flash (`gemini-2.0-flash`) · multimodal, vision, multi-input, reasoning | Google | 94.1 | 33.4 | 94.1 | 0.0 | 0.0 | 82.7 | $0.1 in / $0.4 out |
| 4 | GPT-4.1 nano (`gpt-4.1-nano-2025-04-14`) · multimodal, vision, multi-input, reasoning | OpenAI | 93.7 | 12.6 | 93.7 | 0.0 | 0.0 | 83.0 | $0.1 in / $0.4 out |
| 5 | Llama 4 Scout (`llama-4-scout`) · multimodal, vision, multi-input, reasoning | Meta | 93.0 | 29.2 | 93.0 | 0.0 | 0.0 | 87.2 | $0.08 in / $0.3 out |
| 6 | Gemini 1.5 Flash (`gemini-1.5-flash`) · multimodal, vision, multi-input, reasoning | Google | 92.1 | 23.2 | 92.1 | 0.0 | 0.0 | 71.9 | $0.15 in / $0.6 out |
| 7 | Gemini 1.5 Flash 8B (`gemini-1.5-flash-8b`) · multimodal, vision, multi-input, reasoning | Google | 92.1 | 10.4 | 92.1 | 0.0 | 0.0 | 88.3 | $0.07 in / $0.3 out |
| 8 | GPT-4.1 mini (`gpt-4.1-mini-2025-04-14`) · multimodal, vision, multi-input, reasoning | OpenAI | 90.9 | 20.8 | 90.9 | 8.9 | 2.6 | 56.8 | $0.4 in / $1.6 out |
| 9 | GPT-5 mini (`gpt-5-mini-2025-08-07`) · multimodal, vision, multi-input, reasoning | OpenAI | 89.7 | 41.9 | 89.7 | 0.0 | 23.7 | 56.3 | $0.25 in / $2 out |
| 10 | Gemini 3.1 Flash-Lite (`gemini-3.1-flash-lite-preview`) · multimodal, vision, multi-input, reasoning | Google | 84.9 | 56.3 | 84.9 | 0.0 | 0.0 | 50.6 | $0.25 in / $1.5 out |
| 11 | Gemini 3 Flash (`gemini-3-flash-preview`) · multimodal, vision, multi-input, reasoning | Google | 84.9 | 71.3 | 84.9 | 42.5 | 66.6 | 38.9 | $0.5 in / $3 out |
| 12 | GPT-5.5 (`gpt-5.5`) · multimodal, vision, multi-input, reasoning | OpenAI | 84.9 | 80.3 | 84.9 | 76.2 | 65.4 | 6.7 | $5 in / $30 out |
| 13 | GPT-5.5 Pro (`gpt-5.5-pro`) · multimodal, vision, multi-input, reasoning | OpenAI | 84.9 | 67.8 | 84.9 | 71.8 | 59.1 | 0.6 | $30 in / $180 out |
| 14 | MiMo-V2-Pro (`mimo-v2-pro`) · code, programming, tool use | Xiaomi | 84.9 | 0.0 | 84.9 | 0.0 | 66.6 | 36.4 | $1 in / $3 out |
| 15 | MiniMax M1 80K (`minimax-m1-80k`) · code, programming, tool use | MiniMax | 84.9 | 24.6 | 84.9 | 20.9 | 19.4 | 41.7 | $0.55 in / $2.2 out |
| 16 | Ministral 3 (8B Reasoning 2512) (`ministral-8b-latest`) · multimodal, vision, multi-input, reasoning | Mistral AI | 84.8 | 31.8 | 84.8 | 0.0 | 0.0 | 92.1 | $0.15 in / $0.15 out |
| 17 | LongCat-Flash-Lite (`longcat-flash-lite`) · code, programming, tool use | Meituan | 83.8 | 24.7 | 83.8 | 29.5 | 25.3 | 83.3 | $0.1 in / $0.4 out |
| 18 | MiMo-V2-Flash (`mimo-v2-flash`) · code, programming, tool use | Xiaomi | 79.8 | 53.7 | 79.8 | 27.2 | 39.3 | 85.9 | $0.1 in / $0.3 out |
| 19 | Ministral 3 (3B Reasoning 2512) (`ministral-3b-latest`) · multimodal, vision, multi-input, reasoning | Mistral AI | 79.7 | 22.1 | 79.7 | 0.0 | 0.0 | 95.8 | $0.1 in / $0.1 out |
| 20 | GPT-5.4 Mini (`gpt-5.4-mini`) · text, text-to-text, language | OpenAI | 77.4 | 57.4 | 77.4 | 27.1 | 26.9 | 32.8 | $0.75 in / $4.5 out |
Page 1 of 15 · 294 models
Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
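The Price column lists separate input and output rates (USD per 1M tokens), so comparing models on cost requires blending the two with an assumed traffic mix. A minimal sketch of that blend; the 3:1 input-to-output token ratio is an assumption for illustration, not part of this leaderboard's methodology:

```python
def blended_price(in_per_m: float, out_per_m: float, in_share: float = 0.75) -> float:
    """Blend input/output prices (USD per 1M tokens) into one figure.

    in_share is the assumed fraction of traffic that is input tokens
    (default 0.75, i.e. a hypothetical 3:1 input-to-output ratio).
    """
    return in_per_m * in_share + out_per_m * (1.0 - in_share)

# Llama 4 Scout at $0.08 in / $0.3 out:
print(round(blended_price(0.08, 0.30), 3))  # → 0.135

# Grok-4.20 Beta at $2 in / $6 out:
print(round(blended_price(2.0, 6.0), 2))  # → 3.0
```

Shifting `in_share` changes the ordering for models with asymmetric pricing (e.g. GPT-5.5 Pro's $30 in / $180 out), which is one reason a single "Value" score can disagree with your own workload's costs.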