Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
294 tracked models · 27 providers · 251 benchmarked · 30.7 average index
| Rank | Model | Provider | Score (Value / Price) | Benchmarks | Inference | Agentic | Programming | Price |
|---|---|---|---|---|---|---|---|---|
| 41 | GPT OSS 120B High (`gpt-oss-120b-high`) · multimodal | OpenAI | 73.2 | 44.9 | 57.3 | 0.0 | 0.0 | $0.1 in / $0.5 out |
| 42 | Pixtral-12B (`pixtral-12b-2409`) · multimodal | Mistral AI | 72.9 | 8.1 | 7.1 | 0.0 | 0.0 | $0.15 in / $0.15 out |
| 43 | Jamba 1.5 Mini (`jamba-1.5-mini`) · text | AI21 Labs | 72.4 | 4.8 | 65.2 | 0.0 | 0.0 | $0.2 in / $0.4 out |
| 44 | GLM-4.7-Flash (`glm-4.7-flash`) · code | Zhipu AI | 72.2 | 38.5 | 29.1 | 12.0 | 21.2 | $0.07 in / $0.4 out |
| 45 | Gemini 1.5 Flash (`gemini-1.5-flash`) · multimodal | Google | 71.9 | 23.2 | 92.1 | 0.0 | 0.0 | $0.15 in / $0.6 out |
| 46 | Llama 3.1 70B Instruct (`llama-3.1-70b-instruct`) · text | Meta | 71.9 | 11.3 | 20.9 | 0.0 | 0.0 | $0.2 in / $0.2 out |
| 47 | Llama 3.3 70B Instruct (`llama-3.3-70b-instruct`) · text | Meta | 71.9 | 19.8 | 20.9 | 0.0 | 0.0 | $0.2 in / $0.2 out |
| 48 | Qwen3 VL 4B Instruct (`qwen3-vl-4b-instruct`) · multimodal | Alibaba Cloud / Qwen Team | 70.6 | 19.7 | 66.8 | 19.5 | 0.0 | $0.1 in / $0.6 out |
| 49 | DeepSeek-V3.2 (Non-thinking) (`deepseek-chat`) · text | DeepSeek | 70.1 | 0.0 | 57.3 | 0.0 | 0.0 | $0.28 in / $0.42 out |
| 50 | DeepSeek-V3.2 (`deepseek-v3.2`) · code | DeepSeek | 70.0 | 58.1 | 52.5 | 16.6 | 45.9 | $0.26 in / $0.38 out |
| 51 | Mercury 2 (`mercury-2`) · code | Inception | 69.2 | 44.6 | 72.5 | 0.0 | 22.3 | $0.25 in / $0.75 out |
| 52 | Grok-4.1 Fast Non-Reasoning (`grok-4-1-fast-non-reasoning`) · multimodal | xAI | 67.2 | 0.0 | 68.2 | 0.0 | 0.0 | $0.2 in / $0.5 out |
| 53 | Grok-4.1 Fast Reasoning (`grok-4-1-fast-reasoning`) · multimodal | xAI | 67.2 | 0.0 | 68.2 | 0.0 | 0.0 | $0.2 in / $0.5 out |
| 54 | Grok 4 Fast (`grok-4-fast`) · multimodal | xAI | 67.2 | 58.0 | 68.2 | 15.4 | 0.0 | $0.2 in / $0.5 out |
| 55 | Grok-4 Fast Non-Reasoning (`grok-4-fast-non-reasoning`) · multimodal | xAI | 67.2 | 0.0 | 68.2 | 0.0 | 0.0 | $0.2 in / $0.5 out |
| 56 | Grok-4 Fast Reasoning (`grok-4-fast-reasoning`) · multimodal | xAI | 67.2 | 0.0 | 68.2 | 0.0 | 0.0 | $0.2 in / $0.5 out |
| 57 | Mistral Small 4 (`mistral-small-latest`) · multimodal | Mistral AI | 66.9 | 34.8 | 55.9 | 0.0 | 0.0 | $0.15 in / $0.6 out |
| 58 | DeepSeek R1 Distill Llama 70B (`deepseek-r1-distill-llama-70b`) · text | DeepSeek | 66.7 | 29.0 | 16.1 | 0.0 | 0.0 | $0.1 in / $0.4 out |
| 59 | GPT-4o mini (`gpt-4o-mini-2024-07-18`) · multimodal | OpenAI | 65.1 | 14.9 | 44.7 | 0.0 | 0.0 | $0.15 in / $0.6 out |
| 60 | Grok-3 Mini (`grok-3-mini`) · multimodal | xAI | 65.0 | 53.4 | 51.9 | 0.0 | 0.0 | $0.3 in / $0.5 out |
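The Price column splits input and output rates, here assumed to be USD per million tokens (the common convention; the table itself does not state the unit). To compare models on a concrete workload, blend the two rates by your expected input/output mix. A minimal sketch, using rates from the table:

```python
# Estimate workload cost from split input/output token prices.
# Assumption: rates are USD per 1M tokens (the usual convention);
# the leaderboard does not state the unit explicitly.

def workload_cost(price_in, price_out, tokens_in, tokens_out):
    """Total USD cost for a workload, given per-1M-token rates."""
    return (price_in * tokens_in + price_out * tokens_out) / 1_000_000

# Example mix: 10M input tokens, 2M output tokens
glm_flash = workload_cost(0.07, 0.4, 10_000_000, 2_000_000)   # GLM-4.7-Flash
gpt4o_mini = workload_cost(0.15, 0.6, 10_000_000, 2_000_000)  # GPT-4o mini

print(f"GLM-4.7-Flash: ${glm_flash:.2f}")   # $1.50
print(f"GPT-4o mini:   ${gpt4o_mini:.2f}")  # $2.70
```

Note that the ranking can flip with the mix: a model with a cheap input rate but an expensive output rate wins on summarization-style workloads and loses on generation-heavy ones.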
Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
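The multi-dimensional evaluation described above can be thought of as a weighted mean over the per-axis sub-scores shown in the table. The weights below are purely illustrative (the leaderboard does not publish its actual weighting), but the sketch shows the shape of such a composite:

```python
# Illustrative composite score: weighted mean of per-axis sub-scores.
# The weights are hypothetical; the leaderboard's real weighting and
# normalization are not published.

WEIGHTS = {"benchmarks": 0.4, "inference": 0.2,
           "agentic": 0.2, "programming": 0.2}

def composite(scores):
    """Weighted mean over whichever axes a model has a score for."""
    total = sum(WEIGHTS[axis] * value for axis, value in scores.items())
    weight = sum(WEIGHTS[axis] for axis in scores)
    return total / weight

# DeepSeek-V3.2's sub-scores, taken from rank 50 in the table
score = composite({"benchmarks": 58.1, "inference": 52.5,
                   "agentic": 16.6, "programming": 45.9})
print(round(score, 2))
```

Dropping absent axes from the denominator (rather than scoring them 0.0) is one design choice; the table's many 0.0 cells suggest the real index may instead treat missing axes as zeros, which would penalize models that were never evaluated on agentic or programming tasks.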