Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


  • Tracked models: 296
  • Providers: 27
  • Benchmarked: 253
  • Avg. index: 30.8


| Rank | Model | Model ID | Tags | Provider | Score (Value / Price) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 121 | DeepSeek-R1-0528 | deepseek-r1-0528 | code, programming, tool use | DeepSeek | 35.1 | 50.1 | 14.3 | 0.0 | 6.6 | 35.1 | $0.55 in / $2.19 out |
| 122 | ERNIE 4.5 | ernie-4.5 | text, inference | Baidu | 34.6 | 24.5 | 18.8 | 0.0 | 0.0 | 34.6 | $0.4 in / $4 out |
| 123 | GPT-4.1 | gpt-4.1-2025-04-14 | multimodal, vision, multi-input reasoning | OpenAI | 34.6 | 28.7 | 75.9 | 32.8 | 17.3 | 34.6 | $2 in / $8 out |
| 124 | Kimi K2.6 | kimi-k2.6 | multimodal, vision, multi-input reasoning | Moonshot AI | 33.5 | 68.1 | 66.0 | 44.2 | 81.2 | 33.5 | $0.95 in / $4 out |
| 125 | DeepSeek-V4-Pro-Max | deepseek-v4-pro-max | code, programming, tool use | DeepSeek | 33.0 | 67.7 | 92.3 | 68.6 | 58.3 | 33.0 | $1.74 in / $3.48 out |
| 126 | GPT-5.4 Mini | gpt-5.4-mini | text, text-to-text, language | OpenAI | 32.4 | 56.8 | 76.5 | 23.8 | 28.1 | 32.4 | $0.75 in / $4.5 out |
| 127 | ChatGPT-4o Latest | chatgpt-4o-latest | multimodal, vision, multi-input reasoning | OpenAI | 32.0 | 56.0 | 63.8 | 0.0 | 0.0 | 32.0 | $2.5 in / $10 out |
| 128 | GPT-5.1 | gpt-5.1-2025-11-13 | multimodal, vision, multi-input reasoning | OpenAI | 32.0 | 64.9 | 72.0 | 0.0 | 56.2 | 32.0 | $1.25 in / $10 out |
| 129 | GPT-5.1 Instant | gpt-5.1-instant-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 32.0 | 64.9 | 72.0 | 0.0 | 56.2 | 32.0 | $1.25 in / $10 out |
| 130 | Claude 3.5 Haiku | claude-3-5-haiku-20241022 | code, programming, tool use | Anthropic | 31.8 | 10.8 | 30.5 | 3.0 | 7.8 | 31.8 | $0.8 in / $4 out |
| 131 | Qwen3 Max | qwen3-max | code, programming, tool use | Alibaba Cloud / Qwen Team | 31.3 | 29.8 | 55.2 | 0.0 | 35.8 | 31.3 | $0.5 in / $5 out |
| 132 | GLM-5 | glm-5 | code, programming, tool use | Zhipu AI | 30.6 | 0.0 | 23.0 | 47.8 | 63.8 | 30.6 | $1 in / $3.2 out |
| 133 | GLM-5.1 | glm-5.1 | code, programming, tool use | Zhipu AI | 30.2 | 66.8 | 46.1 | 51.5 | 60.2 | 30.2 | $1.4 in / $4.4 out |
| 134 | o1-mini | o1-mini | text, inference | OpenAI | 30.1 | 25.7 | 61.3 | 0.0 | 0.0 | 30.1 | $3 in / $12 out |
| 135 | Mistral Large 3 | mistral-large-3-2509 | multimodal, vision, multi-input reasoning | Mistral AI | 29.1 | 9.6 | 18.8 | 0.0 | 0.0 | 29.1 | $2 in / $5 out |
| 136 | GPT-5.1 Medium | gpt-5.1-medium-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 28.9 | 63.6 | 61.9 | 0.0 | 0.0 | 28.9 | $1.25 in / $10 out |
| 137 | GPT-5 Medium | gpt-5-medium-2025-08-07 | multimodal, vision, multi-input reasoning | OpenAI | 28.9 | 56.7 | 61.9 | 0.0 | 0.0 | 28.9 | $1.25 in / $10 out |
| 138 | Grok-4.20 Beta Non-Reasoning | grok-4.20-beta-0309-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 27.7 | 0.0 | 97.2 | 0.0 | 0.0 | 27.7 | $2 in / $6 out |
| 139 | Grok-4.20 Beta Reasoning | grok-4.20-beta-0309-reasoning | multimodal, vision, multi-input reasoning | xAI | 27.7 | 0.0 | 97.2 | 0.0 | 0.0 | 27.7 | $2 in / $6 out |
| 140 | o3 | o3-2025-04-16 | multimodal, vision, multi-input reasoning | OpenAI | 27.7 | 46.0 | 38.9 | 19.6 | 30.2 | 27.7 | $2 in / $8 out |

Page 7 of 15 · 296 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.