Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


Tracked models: 294
Providers: 27
Benchmarked: 251
Avg. index: 13.2


Ranks 41–60. The Score column shows each model's Programming score.

| Rank | Model | Model ID | Tags | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|------|-------|----------|------|----------|-------|------------|-----------|---------|-------------|-------|-------|
| 41 | DeepSeek-V3.2-Speciale | deepseek-v3.2-speciale | code, programming, tool use | DeepSeek | 45.9 | 54.5 | 0.0 | 9.7 | 45.9 | 0.0 | N/A |
| 42 | Claude Sonnet 4 | claude-sonnet-4-20250514 | multimodal, vision, multi-input reasoning | Anthropic | 44.9 | 41.0 | 0.0 | 49.4 | 44.9 | 0.0 | N/A |
| 43 | Qwen3.6-27B | qwen3.6-27b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 44.6 | 59.8 | 0.0 | 0.0 | 44.6 | 0.0 | N/A |
| 44 | GLM-4.7 | glm-4.7 | multimodal, vision, multi-input reasoning | Zhipu AI | 44.5 | 63.2 | 52.8 | 28.2 | 44.5 | 40.6 | $0.6 in / $2.2 out |
| 45 | MiniMax M2 | minimax-m2 | code, programming, tool use | MiniMax | 42.8 | 32.2 | 55.9 | 41.4 | 42.8 | 52.3 | $0.3 in / $1.2 out |
| 46 | Qwen3.5-27B | qwen3.5-27b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 42.4 | 61.9 | 66.8 | 47.5 | 42.4 | 43.9 | $0.3 in / $2.4 out |
| 47 | Qwen3.5-122B-A10B | qwen3.5-122b-a10b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 41.5 | 64.8 | 66.8 | 51.6 | 41.5 | 38.1 | $0.4 in / $3.2 out |
| 48 | Muse Spark | muse-spark | multimodal, vision, multi-input reasoning | Meta | 41.3 | 71.0 | 0.0 | 67.3 | 41.3 | 0.0 | N/A |
| 49 | GLM-4.5 | glm-4.5 | code, programming, tool use | Zhipu AI | 40.6 | 34.3 | 0.0 | 36.4 | 40.6 | 0.0 | N/A |
| 50 | DeepSeek-V3.2-Exp | deepseek-v3.2-exp | code, programming, tool use | DeepSeek | 40.5 | 52.7 | 0.0 | 28.8 | 40.5 | 0.0 | N/A |
| 51 | GPT-5.2 Codex | gpt-5.2-codex | multimodal, vision, multi-input reasoning | OpenAI | 40.4 | 0.0 | 48.6 | 0.0 | 40.4 | 19.5 | $1.75 in / $14 out |
| 52 | Claude 3.7 Sonnet | claude-3-7-sonnet-20250219 | multimodal, vision, multi-input reasoning | Anthropic | 40.1 | 43.7 | 30.1 | 49.0 | 40.1 | 13.2 | $3 in / $15 out |
| 53 | Grok Code Fast 1 | grok-code-fast-1 | code, programming, tool use | xAI | 39.7 | 0.0 | 47.2 | 0.0 | 39.7 | 49.5 | $0.2 in / $1.5 out |
| 54 | LongCat-Flash-Chat | longcat-flash-chat | code, programming, tool use | Meituan | 39.4 | 28.1 | 51.9 | 49.2 | 39.4 | 57.7 | $0.3 in / $1.2 out |
| 55 | MiMo-V2-Flash | mimo-v2-flash | code, programming, tool use | Xiaomi | 39.3 | 53.7 | 79.8 | 27.2 | 39.3 | 85.9 | $0.1 in / $0.3 out |
| 56 | LongCat-Flash-Thinking-2601 | longcat-flash-thinking-2601 | code, programming, tool use | Meituan | 38.0 | 56.3 | 51.9 | 30.8 | 38.0 | 57.7 | $0.3 in / $1.2 out |
| 57 | Qwen3-Coder 480B A35B Instruct | qwen3-coder-480b-a35b-instruct | code, programming, tool use | Alibaba Cloud / Qwen Team | 36.6 | 0.0 | 0.0 | 50.7 | 36.6 | 0.0 | N/A |
| 58 | Qwen3 Max | qwen3-max | code, programming, tool use | Alibaba Cloud / Qwen Team | 36.6 | 30.0 | 55.9 | 0.0 | 36.6 | 31.7 | $0.5 in / $5 out |
| 59 | MiniMax M2.7 | minimax-m2.7 | code, programming, tool use | MiniMax | 35.9 | 0.0 | 52.8 | 50.8 | 35.9 | 55.0 | $0.3 in / $1.2 out |
| 60 | Qwen3.5-35B-A3B | qwen3.5-35b-a3b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 34.4 | 57.2 | 66.8 | 44.3 | 34.4 | 46.4 | $0.25 in / $2 out |

Page 3 of 15 · 294 models
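
The pagination figures are consistent with the ranks shown on this page, and a short sketch makes the arithmetic explicit. The page size of 20 is an assumption inferred from ranks 41 to 60 appearing on page 3; the page itself does not state it, and the function names are illustrative only.

```python
import math

TOTAL_MODELS = 294  # "294 models", as stated on the page
PAGE_SIZE = 20      # assumption: inferred from ranks 41-60 appearing on page 3


def page_count(total: int, page_size: int) -> int:
    """Number of pages needed to list `total` items."""
    return math.ceil(total / page_size)


def rank_range(page: int, page_size: int, total: int) -> tuple[int, int]:
    """First and last rank shown on a 1-indexed page, clamped to `total`."""
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total)
    return first, last


print(page_count(TOTAL_MODELS, PAGE_SIZE))        # 15
print(rank_range(3, PAGE_SIZE, TOTAL_MODELS))     # (41, 60)
print(rank_range(15, PAGE_SIZE, TOTAL_MODELS))    # (281, 294)
```

Under that assumption, the last page holds only 14 models (ranks 281 to 294).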


Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
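
For readers who want to work with the listed prices programmatically, the following is a minimal sketch of parsing price cells such as `$0.6 in / $2.2 out` and ordering priced models by output cost. The function names and the handful of sample rows are illustrative only (the rows are copied from this page), the page does not state the pricing unit, and this is not the leaderboard's own scoring method.

```python
import re
from typing import Optional, Tuple

# Matches cells like "$0.6 in / $2.2 out" or "$1.75 in / $14 out".
PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out")


def parse_price(cell: str) -> Optional[Tuple[float, float]]:
    """Return (input_price, output_price), or None when the cell is 'N/A'."""
    match = PRICE_RE.fullmatch(cell.strip())
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Sample rows copied from this page: (model, programming score, price cell).
ROWS = [
    ("GLM-4.7", 44.5, "$0.6 in / $2.2 out"),
    ("Muse Spark", 41.3, "N/A"),
    ("GPT-5.2 Codex", 40.4, "$1.75 in / $14 out"),
    ("MiMo-V2-Flash", 39.3, "$0.1 in / $0.3 out"),
]


def priced_rows_by_output_cost(rows):
    """Drop rows without a listed price and sort the rest by output price."""
    parsed = [(name, score, parse_price(cell)) for name, score, cell in rows]
    priced = [row for row in parsed if row[2] is not None]
    return sorted(priced, key=lambda row: row[2][1])


for name, score, (price_in, price_out) in priced_rows_by_output_cost(ROWS):
    print(f"{name}: programming {score}, ${price_in} in / ${price_out} out")
```

Rows whose price is `N/A` are skipped rather than treated as free, since a missing price says nothing about cost.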
