Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency, updated continuously from published evaluation data.

  • Tracked models: 296
  • Providers: 27
  • Benchmarked: 253
  • Avg. index: 32.1

| Rank | Model | Model ID | Tags | Provider | Score (Inference) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 161 | Gemma 3 27B | gemma-3-27b-it | multimodal, vision, multi-input reasoning | Google | 20.3 | 10.7 | 20.3 | 0.0 | 0.0 | 73.9 | $0.1 in / $0.2 out |
| 162 | Gemma 3 4B | gemma-3-4b-it | multimodal, vision, multi-input reasoning | Google | 20.3 | 4.5 | 20.3 | 0.0 | 0.0 | 82.0 | $0.02 in / $0.04 out |
| 163 | Gemma 3n E4B Instructed | gemma-3n-e4b-it | multimodal, vision, multi-input reasoning | Google | 20.3 | 1.3 | 20.3 | 0.0 | 0.0 | 10.3 | $20 in / $40 out |
| 164 | o1 | o1-2024-12-17 | multimodal, vision, multi-input reasoning | OpenAI | 19.4 | 42.9 | 19.4 | 44.7 | 6.5 | 4.9 | $15 in / $60 out |
| 165 | ERNIE 4.5 | ernie-4.5 | text, inference | Baidu | 18.8 | 24.5 | 18.8 | 0.0 | 0.0 | 34.6 | $0.4 in / $4 out |
| 166 | Mistral Large 3 | mistral-large-3-2509 | multimodal, vision, multi-input reasoning | Mistral AI | 18.8 | 9.6 | 18.8 | 0.0 | 0.0 | 29.1 | $2 in / $5 out |
| 167 | DeepSeek R1 Distill Llama 70B | deepseek-r1-distill-llama-70b | text, inference | DeepSeek | 16.6 | 28.8 | 16.6 | 0.0 | 0.0 | 66.6 | $0.1 in / $0.4 out |
| 168 | DeepSeek R1 Distill Qwen 32B | deepseek-r1-distill-qwen-32b | text, inference | DeepSeek | 16.6 | 26.6 | 16.6 | 0.0 | 0.0 | 75.9 | $0.12 in / $0.18 out |
| 169 | Qwen2.5 72B Instruct | qwen-2.5-72b-instruct | text, inference | Alibaba Cloud / Qwen Team | 15.0 | 17.8 | 15.0 | 0.0 | 0.0 | 54.5 | $0.35 in / $0.4 out |
| 170 | DeepSeek-R1 | deepseek-r1 | text, inference | DeepSeek | 14.3 | 0.0 | 14.3 | 0.0 | 0.0 | 35.1 | $0.55 in / $2.19 out |
| 171 | DeepSeek-R1-0528 | deepseek-r1-0528 | code, programming, tool use | DeepSeek | 14.3 | 50.1 | 14.3 | 0.0 | 6.6 | 35.1 | $0.55 in / $2.19 out |
| 172 | Qwen3 32B | qwen3-32b | text, inference | Alibaba Cloud / Qwen Team | 13.3 | 21.4 | 13.3 | 0.0 | 0.0 | 69.8 | $0.1 in / $0.3 out |
| 173 | Phi-4-multimodal-instruct | phi-4-multimodal-instruct | multimodal, vision, multi-input reasoning | Microsoft | 12.3 | 8.8 | 12.3 | 0.0 | 0.0 | 79.8 | $0.05 in / $0.1 out |
| 174 | Llama 3.2 90B Instruct | llama-3.2-90b-instruct | multimodal, vision, multi-input reasoning | Meta | 11.3 | 16.3 | 11.3 | 0.0 | 0.0 | 54.9 | $0.35 in / $0.4 out |
| 175 | Phi-3.5-mini-instruct | phi-3.5-mini-instruct | multimodal, vision, multi-input reasoning | Microsoft | 10.8 | 2.7 | 10.8 | 0.0 | 0.0 | 77.2 | $0.1 in / $0.1 out |
| 176 | Phi 4 | phi-4 | text, inference | Microsoft | 9.0 | 15.6 | 9.0 | 0.0 | 0.0 | 77.2 | $0.07 in / $0.14 out |
| 177 | Ministral 8B Instruct | ministral-8b-instruct-2410 | text, inference | Mistral AI | 7.0 | 0.0 | 7.0 | 0.0 | 0.0 | 76.1 | $0.1 in / $0.1 out |
| 178 | Pixtral-12B | pixtral-12b-2409 | multimodal, vision, multi-input reasoning | Mistral AI | 7.0 | 8.1 | 7.0 | 0.0 | 0.0 | 73.0 | $0.15 in / $0.15 out |
| 179 | Pixtral Large | pixtral-large | multimodal, vision, multi-input reasoning | Mistral AI | 7.0 | 27.8 | 7.0 | 0.0 | 0.0 | 22.4 | $2 in / $6 out |
| 180 | Qwen3-Next-80B-A3B-Instruct | qwen3-next-80b-a3b-instruct | text, inference | Alibaba Cloud / Qwen Team | 6.1 | 29.5 | 6.1 | 17.9 | 0.0 | 51.9 | $0.15 in / $1.5 out |

Page 9 of 15 · 296 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.