Skytells
Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.

  • Tracked models: 296
  • Providers: 27
  • Benchmarked: 253
  • Avg. index: 11.5

| Rank | Model | ID | Tags | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 261 | Phi 4 | phi-4 | text, inference | Microsoft | 0.0 (Agentic) | 15.6 | 9.0 | 0.0 | 0.0 | 77.2 | $0.07 in / $0.14 out |
| 262 | Phi 4 Mini | phi-4-mini | text, inference | Microsoft | 0.0 (Agentic) | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 263 | Phi 4 Mini Reasoning | phi-4-mini-reasoning | text, inference | Microsoft | 0.0 (Agentic) | 21.7 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 264 | Phi-4-multimodal-instruct | phi-4-multimodal-instruct | multimodal, vision, multi-input reasoning | Microsoft | 0.0 (Agentic) | 8.8 | 12.3 | 0.0 | 0.0 | 79.9 | $0.05 in / $0.1 out |
| 265 | Phi 4 Reasoning | phi-4-reasoning | text, inference | Microsoft | 0.0 (Agentic) | 23.1 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 266 | Phi 4 Reasoning Plus | phi-4-reasoning-plus | text, inference | Microsoft | 0.0 (Agentic) | 31.5 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 267 | Pixtral-12B | pixtral-12b-2409 | multimodal, vision, multi-input reasoning | Mistral AI | 0.0 (Agentic) | 8.1 | 7.0 | 0.0 | 0.0 | 73.0 | $0.15 in / $0.15 out |
| 268 | Pixtral Large | pixtral-large | multimodal, vision, multi-input reasoning | Mistral AI | 0.0 (Agentic) | 27.8 | 7.0 | 0.0 | 0.0 | 22.3 | $2 in / $6 out |
| 269 | QvQ-72B-Preview | qvq-72b-preview | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 38.2 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 270 | Qwen2.5 14B Instruct | qwen-2.5-14b-instruct | text, inference | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 14.6 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 271 | Qwen2.5 32B Instruct | qwen-2.5-32b-instruct | text, inference | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 18.6 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 272 | Qwen2.5 72B Instruct | qwen-2.5-72b-instruct | text, inference | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 17.8 | 15.0 | 0.0 | 0.0 | 54.6 | $0.35 in / $0.4 out |
| 273 | Qwen2.5 7B Instruct | qwen-2.5-7b-instruct | text, inference | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 7.4 | 71.5 | 0.0 | 0.0 | 77.5 | $0.3 in / $0.3 out |
| 274 | Qwen2.5-Coder 32B Instruct | qwen-2.5-coder-32b-instruct | text, inference | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 0.0 | 21.1 | 0.0 | 0.0 | 81.4 | $0.09 in / $0.09 out |
| 275 | Qwen2.5-Coder 7B Instruct | qwen-2.5-coder-7b-instruct | text, inference | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 276 | Qwen2.5-Omni-7B | qwen2.5-omni-7b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 7.6 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 277 | Qwen2.5 VL 7B Instruct | qwen2.5-vl-7b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 9.6 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 278 | Qwen2 72B Instruct | qwen2-72b-instruct | text, inference | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 12.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 279 | Qwen2 7B Instruct | qwen2-7b-instruct | text, inference | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 2.4 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 280 | Qwen2-VL-72B-Instruct | qwen2-vl-72b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 0.0 (Agentic) | 9.3 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |

Page 14 of 15 · 296 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
