Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
| Tracked models | Providers | Benchmarked | Avg. index |
|---|---|---|---|
| 309 | 27 | 264 | 11.8 |
| Rank | Model | Model ID | Tags | Provider | Score (Agentic) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 61 | GLM-4.7 | glm-4.7 | multimodal, vision, multi-input reasoning | Zhipu AI | 28.0 | 62.3 | 0.0 | 28.0 | 43.6 | 0.0 | N/A |
| 62 | GPT-5 | gpt-5-2025-08-07 | multimodal, vision, multi-input reasoning | OpenAI | 27.5 | 63.8 | 0.0 | 27.5 | 50.6 | 0.0 | N/A |
| 63 | Qwen3 VL 32B Instruct | qwen3-vl-32b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 27.2 | 28.6 | 0.0 | 27.2 | 0.0 | 0.0 | N/A |
| 64 | GPT OSS 120B | gpt-oss-120b | text, inference | OpenAI | 26.8 | 34.9 | 14.6 | 26.8 | 0.0 | 90.5 | $0.09 in / $0.45 out |
| 65 | MiniMax M1 40K | minimax-m1-40k | code, programming, tool use | MiniMax | 26.8 | 21.8 | 0.0 | 26.8 | 16.6 | 0.0 | N/A |
| 66 | Qwen3-235B-A22B-Thinking-2507 | qwen3-235b-a22b-thinking-2507 | text, inference | Alibaba Cloud / Qwen Team | 26.8 | 45.8 | 0.0 | 26.8 | 0.0 | 0.0 | N/A |
| 67 | Qwen3 VL 8B Instruct | qwen3-vl-8b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 26.4 | 8.8 | 41.1 | 26.4 | 0.0 | 87.3 | $0.08 in / $0.5 out |
| 68 | MiMo-V2-Flash | mimo-v2-flash | code, programming, tool use | Xiaomi | 25.3 | 52.4 | 0.0 | 25.3 | 36.8 | 0.0 | N/A |
| 69 | GLM-4.5-Air | glm-4.5-air | code, programming, tool use | Zhipu AI | 24.3 | 26.7 | 0.0 | 24.3 | 18.3 | 0.0 | N/A |
| 70 | Qwen3 VL 8B Thinking | qwen3-vl-8b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 23.3 | 34.1 | 41.1 | 23.3 | 0.0 | 54.4 | $0.18 in / $2.09 out |
| 71 | Qwen3 VL 30B A3B Instruct | qwen3-vl-30b-a3b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 22.9 | 27.0 | 0.0 | 22.9 | 0.0 | 0.0 | N/A |
| 72 | GPT-5.4 Mini | gpt-5.4-mini | text, text-to-text, language | OpenAI | 22.4 | 56.9 | 56.3 | 22.4 | 27.4 | 32.9 | $0.75 in / $4.5 out |
| 73 | MiniMax M1 80K | minimax-m1-80k | code, programming, tool use | MiniMax | 20.9 | 23.4 | 0.0 | 20.9 | 17.4 | 0.0 | N/A |
| 74 | Qwen3 VL 30B A3B Thinking | qwen3-vl-30b-a3b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 20.7 | 34.1 | 0.0 | 20.7 | 0.0 | 0.0 | N/A |
| 75 | o3 | o3-2025-04-16 | multimodal, vision, multi-input reasoning | OpenAI | 19.8 | 45.7 | 0.0 | 19.8 | 29.9 | 0.0 | N/A |
| 76 | Qwen3 VL 4B Instruct | qwen3-vl-4b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 18.8 | 18.9 | 41.1 | 18.8 | 0.0 | 81.0 | $0.1 in / $0.6 out |
| 77 | Qwen3 VL 4B Thinking | qwen3-vl-4b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 18.6 | 21.6 | 41.1 | 18.6 | 0.0 | 73.4 | $0.1 in / $1 out |
| 78 | Sarvam-105B | sarvam-105b | code, programming, tool use | Sarvam AI | 18.3 | 42.1 | 0.0 | 18.3 | 11.1 | 0.0 | N/A |
| 79 | Qwen3-Next-80B-A3B-Instruct | qwen3-next-80b-a3b-instruct | text, inference | Alibaba Cloud / Qwen Team | 17.9 | 28.4 | 0.0 | 17.9 | 0.0 | 0.0 | N/A |
| 80 | Mistral Medium 3.5 | mistral-medium-3-5 | multimodal, vision, multi-input reasoning | Mistral AI | 16.8 | 34.9 | 28.5 | 16.8 | 61.7 | 29.1 | $1.5 in / $7.5 out |

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
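For readers who want to work with these rows programmatically, the following is a minimal sketch of parsing the Price column and re-sorting priced models by the Value column. It is not the leaderboard's own scoring code: the names `Price`, `parse_price`, `ROWS`, and `priced_by_value` are hypothetical, the sample rows are copied from the table, and the pricing unit is not stated in the source, so the numbers are treated as plain dollar amounts.

```python
import re
from typing import NamedTuple, Optional


class Price(NamedTuple):
    """Input and output price as listed in a Price cell (unit not stated in the source)."""
    input_usd: float
    output_usd: float


# Matches cells such as "$0.09 in / $0.45 out" or "$0.1 in / $1 out".
_PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out")


def parse_price(cell: str) -> Optional[Price]:
    """Return the parsed price, or None when the cell is 'N/A' or otherwise unparseable."""
    match = _PRICE_RE.fullmatch(cell.strip())
    if match is None:
        return None
    return Price(float(match.group(1)), float(match.group(2)))


# Sample rows copied from the table: (model, agentic score, value score, price cell).
ROWS = [
    ("GPT OSS 120B", 26.8, 90.5, "$0.09 in / $0.45 out"),
    ("Qwen3 VL 8B Instruct", 26.4, 87.3, "$0.08 in / $0.5 out"),
    ("GLM-4.7", 28.0, 0.0, "N/A"),
    ("Mistral Medium 3.5", 16.8, 29.1, "$1.5 in / $7.5 out"),
]


def priced_by_value(rows):
    """Keep only rows with a listed price and sort them by Value, highest first."""
    priced = [row for row in rows if parse_price(row[3]) is not None]
    return sorted(priced, key=lambda row: row[2], reverse=True)


ranked = [row[0] for row in priced_by_value(ROWS)]
```

Filtering out "N/A" rows before sorting is a choice made for this sketch: every unpriced row in the table also shows a Value of 0.0, so including them would only add ties at the bottom.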