Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


Tracked models: 294
Providers: 27
Benchmarked: 251
Avg. index: 11.4


| Rank | Model (ID) | Tags | Provider | Score (Agentic) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|
| 21 | MiniMax M2.5 (minimax-m2.5) | code, programming, tool use | MiniMax | 53.0 | 0.0 | 73.9 | 53.0 | 56.3 | 57.7 | $0.3 in / $1.2 out |
| 22 | Qwen3.5-122B-A10B (qwen3.5-122b-a10b) | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 51.6 | 64.8 | 66.8 | 51.6 | 41.5 | 38.1 | $0.4 in / $3.2 out |
| 23 | GLM-5 (glm-5) | code, programming, tool use | Zhipu AI | 51.3 | 0.0 | 22.1 | 51.3 | 65.3 | 30.2 | $1 in / $3.2 out |
| 24 | MiniMax M2.7 (minimax-m2.7) | code, programming, tool use | MiniMax | 50.8 | 0.0 | 52.8 | 50.8 | 35.9 | 55.0 | $0.3 in / $1.2 out |
| 25 | Qwen3-Coder 480B A35B Instruct (qwen3-coder-480b-a35b-instruct) | code, programming, tool use | Alibaba Cloud / Qwen Team | 50.7 | 0.0 | 0.0 | 50.7 | 36.6 | 0.0 | N/A |
| 26 | GPT-5.2 (gpt-5.2-2025-12-11) | multimodal, vision, multi-input reasoning | OpenAI | 50.3 | 76.9 | 71.4 | 50.3 | 72.4 | 26.4 | $1.75 in / $14 out |
| 27 | Claude Sonnet 4.6 (claude-sonnet-4-6) | multimodal, vision, multi-input reasoning | Anthropic | 49.6 | 66.1 | 30.1 | 49.6 | 68.9 | 13.2 | $3 in / $15 out |
| 28 | Kimi K2.5 (kimi-k2.5) | multimodal, vision, multi-input reasoning | Moonshot AI | 49.5 | 68.0 | 66.8 | 49.5 | 48.5 | 38.1 | $0.6 in / $3 out |
| 29 | Claude Sonnet 4 (claude-sonnet-4-20250514) | multimodal, vision, multi-input reasoning | Anthropic | 49.4 | 41.0 | 0.0 | 49.4 | 44.9 | 0.0 | N/A |
| 30 | Qwen3.6 Plus (qwen3.6-plus) | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 49.3 | 71.9 | 0.0 | 49.3 | 62.2 | 0.0 | N/A |
| 31 | LongCat-Flash-Chat (longcat-flash-chat) | code, programming, tool use | Meituan | 49.2 | 28.1 | 51.9 | 49.2 | 39.4 | 57.7 | $0.3 in / $1.2 out |
| 32 | Claude 3.7 Sonnet (claude-3-7-sonnet-20250219) | multimodal, vision, multi-input reasoning | Anthropic | 49.0 | 43.7 | 30.1 | 49.0 | 40.1 | 13.2 | $3 in / $15 out |
| 33 | Qwen3.5-27B (qwen3.5-27b) | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 47.5 | 61.9 | 66.8 | 47.5 | 42.4 | 43.9 | $0.3 in / $2.4 out |
| 34 | Kimi K2.6 (kimi-k2.6) | multimodal, vision, multi-input reasoning | Moonshot AI | 45.3 | 68.5 | 66.8 | 45.3 | 81.0 | 33.3 | $0.95 in / $4 out |
| 35 | Step-3.5-Flash (step-3.5-flash) | code, programming, tool use | StepFun | 45.3 | 62.3 | 63.2 | 45.3 | 53.0 | 82.1 | $0.1 in / $0.4 out |
| 36 | o1 (o1-2024-12-17) | multimodal, vision, multi-input reasoning | OpenAI | 44.7 | 43.1 | 41.9 | 44.7 | 6.7 | 11.7 | $15 in / $60 out |
| 37 | Qwen3.5-35B-A3B (qwen3.5-35b-a3b) | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 44.3 | 57.2 | 66.8 | 44.3 | 34.4 | 46.4 | $0.25 in / $2 out |
| 38 | Claude Opus 4.5 (claude-opus-4-5-20251101) | multimodal, vision, multi-input reasoning | Anthropic | 44.2 | 56.3 | 30.1 | 44.2 | 74.2 | 10.6 | $5 in / $25 out |
| 39 | Gemini 3 Flash (gemini-3-flash-preview) | multimodal, vision, multi-input reasoning | Google | 42.5 | 71.3 | 84.9 | 42.5 | 66.6 | 38.9 | $0.5 in / $3 out |
| 40 | Qwen3-Next-80B-A3B-Thinking (qwen3-next-80b-a3b-thinking) | text, inference | Alibaba Cloud / Qwen Team | 41.7 | 44.9 | 6.1 | 41.7 | 0.0 | 51.9 | $0.15 in / $1.5 out |

Page 2 of 15 · 294 models


Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
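
As a rough illustration of what a multi-dimensional ranking can look like, the sketch below combines three of the sub-scores listed on this page into one weighted average and sorts by it. The page does not publish its formula, so the weights, the `ModelScores` type, and the `composite` function are hypothetical and chosen only for illustration. The sample figures are the Benchmarks, Inference, and Value sub-scores shown for three of the listed models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelScores:
    """Sub-scores for one model (hypothetical container, not a Skytells type)."""
    name: str
    benchmarks: float
    inference: float
    value: float


# ASSUMPTION: illustrative weights only; the leaderboard's real weighting is not published.
WEIGHTS = {"benchmarks": 0.5, "inference": 0.25, "value": 0.25}


def composite(m: ModelScores, weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted average of the sub-scores, normalised by the sum of the weights."""
    total = sum(weights.values())
    weighted = (
        weights["benchmarks"] * m.benchmarks
        + weights["inference"] * m.inference
        + weights["value"] * m.value
    )
    return weighted / total


# Sub-scores copied from the rows for ranks 26, 28, and 35 on this page.
rows = [
    ModelScores("GPT-5.2", 76.9, 71.4, 26.4),
    ModelScores("Kimi K2.5", 68.0, 66.8, 38.1),
    ModelScores("Step-3.5-Flash", 62.3, 63.2, 82.1),
]

# Highest composite first.
ranked = sorted(rows, key=composite, reverse=True)

for m in ranked:
    print(f"{m.name}: {composite(m):.1f}")
```

Changing the weights changes the order, which is one reason a composite ranking can differ from any single third-party benchmark.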
