Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.


© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


  • Tracked models: 294
  • Providers: 27
  • Benchmarked: 251
  • Avg. index: 34.7


| Rank | Model | Model ID | Tags | Provider | Score (overall) | Benchmarks | Inference | Agentic | Programming | Value | Price (USD, in / out) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | Kimi K2.5 | kimi-k2.5 | multimodal, vision, multi-input reasoning | Moonshot AI | 56.1 | 68.0 | 66.8 | 49.5 | 48.5 | 38.1 | $0.6 in / $3 out |
| 42 | GPT-5.1 Thinking | gpt-5.1-thinking-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 55.7 | 65.0 | 55.1 | 0.0 | 57.2 | 27.0 | $1.25 in / $10 out |
| 43 | GLM-5.1 | glm-5.1 | code, programming, tool use | Zhipu AI | 55.2 | 67.1 | 46.6 | 54.4 | 58.3 | 30.6 | $1.4 in / $4.4 out |
| 44 | Grok-3 Mini | grok-3-mini | multimodal, vision, multi-input reasoning | xAI | 55.1 | 53.4 | 51.9 | 0.0 | 0.0 | 65.0 | $0.3 in / $0.5 out |
| 45 | GLM-5V-Turbo | glm-5v-turbo | multimodal, vision, multi-input reasoning | Zhipu AI | 54.9 | 0.0 | 0.0 | 54.9 | 0.0 | 0.0 | N/A |
| 46 | Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | multimodal, vision, multi-input reasoning | Anthropic | 54.7 | 53.3 | 30.1 | 71.8 | 74.6 | 13.2 | $3 in / $15 out |
| 47 | Seed 2.0 Lite | seed-2.0-lite | multimodal, vision, multi-input reasoning | ByteDance | 54.6 | 58.1 | 0.0 | 0.0 | 50.3 | 0.0 | N/A |
| 48 | MiMo-V2-Omni | mimo-v2-omni | multimodal, vision, multi-input reasoning | Xiaomi | 54.5 | 0.0 | 59.2 | 0.0 | 55.6 | 44.7 | $0.4 in / $2 out |
| 49 | MiniMax M2.1 | minimax-m2.1 | code, programming, tool use | MiniMax | 54.3 | 42.7 | 73.9 | 56.6 | 50.6 | 57.7 | $0.3 in / $1.2 out |
| 50 | GPT-5 Codex | gpt-5-codex-2025-09-15 | code, programming, tool use | OpenAI | 54.3 | 0.0 | 0.0 | 0.0 | 54.3 | 0.0 | N/A |
| 51 | Qwen3.5-122B-A10B | qwen3.5-122b-a10b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 54.1 | 64.8 | 66.8 | 51.6 | 41.5 | 38.1 | $0.4 in / $3.2 out |
| 52 | ChatGPT-4o Latest | chatgpt-4o-latest | multimodal, vision, multi-input reasoning | OpenAI | 54.1 | 56.6 | 63.5 | 0.0 | 0.0 | 32.0 | $2.5 in / $10 out |
| 53 | GPT OSS 20B High | gpt-oss-20b-high | text, inference | OpenAI | 53.9 | 53.9 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 54 | GPT OSS 120B High | gpt-oss-120b-high | multimodal, vision, multi-input reasoning | OpenAI | 53.8 | 44.9 | 57.3 | 0.0 | 0.0 | 73.2 | $0.1 in / $0.5 out |
| 55 | Qwen3-235B-A22B-Instruct-2507 | qwen3-235b-a22b-instruct-2507 | text, inference | Alibaba Cloud / Qwen Team | 53.6 | 42.9 | 66.8 | 0.0 | 0.0 | 62.8 | $0.15 in / $0.8 out |
| 56 | Qwen3.5-27B | qwen3.5-27b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 53.1 | 61.9 | 66.8 | 47.5 | 42.4 | 43.9 | $0.3 in / $2.4 out |
| 57 | Qwen3.6-27B | qwen3.6-27b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 53.1 | 59.8 | 0.0 | 0.0 | 44.6 | 0.0 | N/A |
| 58 | GPT-5 Medium | gpt-5-medium-2025-08-07 | multimodal, vision, multi-input reasoning | OpenAI | 53.1 | 56.9 | 61.6 | 0.0 | 0.0 | 29.0 | $1.25 in / $10 out |
| 59 | Ministral 3 (3B Reasoning 2512) | ministral-3b-latest | multimodal, vision, multi-input reasoning | Mistral AI | 52.8 | 22.1 | 79.7 | 0.0 | 0.0 | 95.8 | $0.1 in / $0.1 out |
| 60 | Qwen3.5-397B-A17B | qwen3.5-397b-a17b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 52.6 | 58.6 | 66.8 | 35.6 | 60.9 | 35.3 | $0.6 in / $3.6 out |

Page 3 of 15 · 294 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.