Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.

  • Tracked models: 294
  • Providers: 27
  • Benchmarked: 251
  • Avg. index: 27.4

| Rank | Model | Model ID | Tags | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | Claude Sonnet 4.6 | claude-sonnet-4-6 | multimodal, vision, multi-input reasoning | Anthropic | 66.1 | 66.1 | 30.1 | 49.6 | 68.9 | 13.2 | $3 in / $15 out |
| 22 | GPT-5.1 | gpt-5.1-2025-11-13 | multimodal, vision, multi-input reasoning | OpenAI | 65.0 | 65.0 | 71.4 | 0.0 | 57.2 | 31.9 | $1.25 in / $10 out |
| 23 | GPT-5.1 Instant | gpt-5.1-instant-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 65.0 | 65.0 | 71.4 | 0.0 | 57.2 | 31.9 | $1.25 in / $10 out |
| 24 | GPT-5.1 Thinking | gpt-5.1-thinking-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 65.0 | 65.0 | 55.1 | 0.0 | 57.2 | 27.0 | $1.25 in / $10 out |
| 25 | Qwen3.5-122B-A10B | qwen3.5-122b-a10b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 64.8 | 64.8 | 66.8 | 51.6 | 41.5 | 38.1 | $0.4 in / $3.2 out |
| 26 | GPT-5 | gpt-5-2025-08-07 | multimodal, vision, multi-input reasoning | OpenAI | 64.4 | 64.4 | 0.0 | 29.0 | 51.7 | 0.0 | N/A |
| 27 | GPT-5.1 Medium | gpt-5.1-medium-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 63.6 | 63.6 | 61.6 | 0.0 | 0.0 | 29.0 | $1.25 in / $10 out |
| 28 | GLM-4.7 | glm-4.7 | multimodal, vision, multi-input reasoning | Zhipu AI | 63.2 | 63.2 | 52.8 | 28.2 | 44.5 | 40.6 | $0.6 in / $2.2 out |
| 29 | GPT-5 High | gpt-5-high-2025-08-07 | multimodal, vision, multi-input reasoning | OpenAI | 63.2 | 63.2 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 30 | Step-3.5-Flash | step-3.5-flash | code, programming, tool use | StepFun | 62.3 | 62.3 | 63.2 | 45.3 | 53.0 | 82.1 | $0.1 in / $0.4 out |
| 31 | Qwen3.5-27B | qwen3.5-27b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 61.9 | 61.9 | 66.8 | 47.5 | 42.4 | 43.9 | $0.3 in / $2.4 out |
| 32 | GPT-5.1 Codex High | gpt-5.1-codex-high | multimodal, vision, multi-input reasoning | OpenAI | 61.0 | 61.0 | 48.6 | 0.0 | 0.0 | 25.1 | $1.25 in / $10 out |
| 33 | Qwen3.6-27B | qwen3.6-27b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 59.8 | 59.8 | 0.0 | 0.0 | 44.6 | 0.0 | N/A |
| 34 | ERNIE 5.0 | ernie-5.0 | multimodal, vision, multi-input reasoning | Baidu | 59.7 | 59.7 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 35 | Grok-3 | grok-3 | multimodal, vision, multi-input reasoning | xAI | 59.5 | 59.5 | 51.9 | 0.0 | 0.0 | 22.6 | $3 in / $15 out |
| 36 | Qwen3.5-397B-A17B | qwen3.5-397b-a17b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 58.6 | 58.6 | 66.8 | 35.6 | 60.9 | 35.3 | $0.6 in / $3.6 out |
| 37 | DeepSeek-V3.2 | deepseek-v3.2 | code, programming, tool use | DeepSeek | 58.1 | 58.1 | 52.5 | 16.6 | 45.9 | 70.0 | $0.26 in / $0.38 out |
| 38 | Seed 2.0 Lite | seed-2.0-lite | multimodal, vision, multi-input reasoning | ByteDance | 58.1 | 58.1 | 0.0 | 0.0 | 50.3 | 0.0 | N/A |
| 39 | Grok 4 Fast | grok-4-fast | multimodal, vision, multi-input reasoning | xAI | 58.0 | 58.0 | 68.2 | 15.4 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 40 | GPT-5.4 Mini | gpt-5.4-mini | text, text-to-text, language | OpenAI | 57.4 | 57.4 | 77.4 | 27.1 | 26.9 | 32.8 | $0.75 in / $4.5 out |

Page 2 of 15 · 294 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
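
The leaderboard lists prices as strings such as "$0.26 in / $0.38 out", or "N/A" when no price is listed. For readers who want to compare rows numerically, the following is a minimal Python sketch of how those strings could be parsed. The function names `parse_price` and `estimate_cost` are hypothetical, and the page does not state the pricing unit, so the per-million-token default is an assumption rather than something the leaderboard confirms.

```python
import re
from typing import Optional, Tuple

# Matches leaderboard price strings such as "$0.26 in / $0.38 out".
_PRICE_RE = re.compile(r"^\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out$")


def parse_price(text: str) -> Optional[Tuple[float, float]]:
    """Return (input_price, output_price) in dollars, or None for "N/A"."""
    text = text.strip()
    if text == "N/A":
        return None
    match = _PRICE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized price string: {text!r}")
    return float(match.group(1)), float(match.group(2))


def estimate_cost(text: str, input_tokens: int, output_tokens: int,
                  tokens_per_unit: int = 1_000_000) -> Optional[float]:
    """Estimate the dollar cost of a workload from a price string.

    ASSUMPTION: prices are quoted per `tokens_per_unit` tokens. The page
    does not state the unit, so the default here is a guess.
    """
    price = parse_price(text)
    if price is None:
        return None  # no listed price, so no estimate
    in_price, out_price = price
    return (input_tokens * in_price + output_tokens * out_price) / tokens_per_unit
```

Rows without a listed price return `None` instead of a zero cost, so they are not mistaken for free models when sorting or filtering.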