Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
Tracked models: 296 · Providers: 27 · Benchmarked: 253 · Avg. index: 32.2
| Rank | Model | Model ID | Tags | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 281 | Qwen3.5-0.8B | qwen3.5-0.8b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 282 | Qwen3.5-2B | qwen3.5-2b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 | 14.4 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 283 | Qwen3.5-4B | qwen3.5-4b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 | 32.1 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 284 | Qwen3.5-9B | qwen3.5-9b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 | 38.5 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 285 | Qwen3.6-35B-A3B | qwen3.6-35b-a3b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 | 55.3 | 0.0 | 15.5 | 26.0 | 0.0 | N/A |
| 286 | Qwen3.6 Plus | qwen3.6-plus | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 | 71.6 | 0.0 | 48.3 | 63.2 | 0.0 | N/A |
| 287 | Qwen3-Coder 480B A35B Instruct | qwen3-coder-480b-a35b-instruct | code, programming, tool use | Alibaba Cloud / Qwen Team | 0.0 | 0.0 | 0.0 | 50.7 | 35.8 | 0.0 | N/A |
| 288 | Qwen3-Next-80B-A3B-Base | qwen3-next-80b-a3b-base | text, inference | Alibaba Cloud / Qwen Team | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 289 | Qwen3 VL 32B Instruct | qwen3-vl-32b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 | 29.4 | 0.0 | 27.9 | 0.0 | 0.0 | N/A |
| 290 | Qwen3 VL 32B Thinking | qwen3-vl-32b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 | 44.3 | 0.0 | 34.6 | 0.0 | 0.0 | N/A |
| 291 | QwQ-32B | qwq-32b | text, inference | Alibaba Cloud / Qwen Team | 0.0 | 28.8 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 292 | Sarvam-105B | sarvam-105b | code, programming, tool use | Sarvam AI | 0.0 | 42.9 | 0.0 | 17.9 | 12.1 | 0.0 | N/A |
| 293 | Sarvam-30B | sarvam-30b | code, programming, tool use | Sarvam AI | 0.0 | 46.4 | 0.0 | 8.2 | 5.2 | 0.0 | N/A |
| 294 | Seed 2.0 Lite | seed-2.0-lite | multimodal, vision, multi-input reasoning | ByteDance | 0.0 | 57.7 | 0.0 | 0.0 | 49.2 | 0.0 | N/A |
| 295 | Seed 2.0 Pro | seed-2.0-pro | multimodal, vision, multi-input reasoning | ByteDance | 0.0 | 68.1 | 0.0 | 53.8 | 60.4 | 0.0 | N/A |
| 296 | Step3-VL-10B | step3-vl-10b | multimodal, vision, multi-input reasoning | StepFun | 0.0 | 47.4 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
Page 15 of 15 · 296 models
Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.