Skytells
Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency. Rankings are updated continuously from published evaluation data.


  • Tracked models: 296
  • Providers: 27
  • Benchmarked: 253
  • Avg. index: 32.1


| Rank | Model | Model ID | Tags | Provider | Score (Inference) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 101 | Grok-3 | grok-3 | multimodal, vision, multi-input reasoning | xAI | 52.7 | 59.3 | 52.7 | 0.0 | 0.0 | 22.7 | $3 in / $15 out |
| 102 | Grok-3 Mini | grok-3-mini | multimodal, vision, multi-input reasoning | xAI | 52.7 | 53.1 | 52.7 | 0.0 | 0.0 | 65.6 | $0.3 in / $0.5 out |
| 103 | LongCat-Flash-Chat | longcat-flash-chat | code, programming, tool use | Meituan | 52.7 | 27.9 | 52.7 | 49.2 | 39.1 | 57.9 | $0.3 in / $1.2 out |
| 104 | LongCat-Flash-Thinking-2601 | longcat-flash-thinking-2601 | code, programming, tool use | Meituan | 52.7 | 55.7 | 52.7 | 29.4 | 37.1 | 57.9 | $0.3 in / $1.2 out |
| 105 | Nova Micro | nova-micro | text, inference | Amazon | 52.7 | 9.1 | 52.7 | 0.0 | 0.0 | 91.3 | $0.03 in / $0.14 out |
| 106 | GLM-4.7 | glm-4.7 | multimodal, vision, multi-input reasoning | Zhipu AI | 52.2 | 62.4 | 52.2 | 27.6 | 43.8 | 40.7 | $0.6 in / $2.2 out |
| 107 | MiniMax M2.7 | minimax-m2.7 | code, programming, tool use | MiniMax | 52.2 | 0.0 | 52.2 | 44.9 | 40.1 | 54.9 | $0.3 in / $1.2 out |
| 108 | GPT-5.4 | gpt-5.4 | text, text-to-text, language | OpenAI | 51.5 | 75.9 | 51.5 | 61.8 | 63.9 | 18.3 | $2.5 in / $15 out |
| 109 | GPT-5.1 Codex | gpt-5.1-codex | multimodal, vision, multi-input reasoning | OpenAI | 49.0 | 0.0 | 49.0 | 0.0 | 50.0 | 25.1 | $1.25 in / $10 out |
| 110 | GPT-5.1 Codex High | gpt-5.1-codex-high | multimodal, vision, multi-input reasoning | OpenAI | 49.0 | 61.0 | 49.0 | 0.0 | 0.0 | 25.1 | $1.25 in / $10 out |
| 111 | GPT-5.2 Codex | gpt-5.2-codex | multimodal, vision, multi-input reasoning | OpenAI | 49.0 | 0.0 | 49.0 | 0.0 | 44.1 | 19.6 | $1.75 in / $14 out |
| 112 | GPT-5.3 Codex | gpt-5.3-codex | text, text-to-text, coding | OpenAI | 49.0 | 0.0 | 49.0 | 0.0 | 52.2 | 19.6 | $1.75 in / $14 out |
| 113 | Grok-4.1 Thinking | grok-4.1-thinking-2025-11-17 | multimodal, vision, multi-input reasoning | xAI | 48.5 | 0.0 | 48.5 | 0.0 | 0.0 | 17.8 | $3 in / $15 out |
| 114 | Grok Code Fast 1 | grok-code-fast-1 | code, programming, tool use | xAI | 47.7 | 0.0 | 47.7 | 0.0 | 38.8 | 49.7 | $0.2 in / $1.5 out |
| 115 | GPT-4o | gpt-4o-2024-08-06 | multimodal, vision, multi-input reasoning | OpenAI | 46.7 | 31.5 | 46.7 | 14.9 | 4.3 | 26.8 | $2.5 in / $10 out |
| 116 | DeepSeek-V2.5 | deepseek-v2.5 | code, programming, tool use | DeepSeek | 46.5 | 0.0 | 46.5 | 0.0 | 0.9 | 79.7 | $0.14 in / $0.28 out |
| 117 | GLM-5.1 | glm-5.1 | code, programming, tool use | Zhipu AI | 46.1 | 66.8 | 46.1 | 51.5 | 60.2 | 30.2 | $1.4 in / $4.4 out |
| 118 | Kimi K2 Instruct | kimi-k2-instruct | code, programming, tool use | Moonshot AI | 46.1 | 24.4 | 46.1 | 14.8 | 15.3 | 62.1 | $0.5 in / $0.5 out |
| 119 | GPT-4o | gpt-4o-2024-05-13 | multimodal, vision, multi-input reasoning | OpenAI | 45.4 | 22.3 | 45.4 | 0.0 | 0.0 | 26.5 | $2.5 in / $10 out |
| 120 | GPT-4o mini | gpt-4o-mini-2024-07-18 | multimodal, vision, multi-input reasoning | OpenAI | 45.4 | 14.8 | 45.4 | 0.0 | 0.0 | 65.1 | $0.15 in / $0.6 out |

Page 6 of 15 · 296 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.