Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.

  • Tracked models: 294
  • Providers: 27
  • Benchmarked: 251
  • Avg. index: 13.2

Rank | Model | Model ID | Tags | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price
---- | ----- | -------- | ---- | -------- | ----- | ---------- | --------- | ------- | ----------- | ----- | -----
1 | Claude Mythos Preview | claude-mythos-preview | multimodal, vision, multi-input reasoning | Anthropic | 84.2 (Programming) | 80.0 | 0.0 | 70.0 | 84.2 | 1.7 | $25 in / $125 out
2 | Claude Opus 4.7 | claude-opus-4-7 | multimodal, vision, multi-input reasoning | Anthropic | 81.2 (Programming) | 76.8 | 42.8 | 69.2 | 81.2 | 10.6 | $5 in / $25 out
3 | Kimi K2.6 | kimi-k2.6 | multimodal, vision, multi-input reasoning | Moonshot AI | 81.0 (Programming) | 68.5 | 66.8 | 45.3 | 81.0 | 33.3 | $0.95 in / $4 out
4 | Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | multimodal, vision, multi-input reasoning | Anthropic | 74.6 (Programming) | 53.3 | 30.1 | 71.8 | 74.6 | 13.2 | $3 in / $15 out
5 | Claude Opus 4.5 | claude-opus-4-5-20251101 | multimodal, vision, multi-input reasoning | Anthropic | 74.2 (Programming) | 56.3 | 30.1 | 44.2 | 74.2 | 10.6 | $5 in / $25 out
6 | Claude Opus 4.6 | claude-opus-4-6 | multimodal, vision, multi-input reasoning | Anthropic | 73.3 (Programming) | 79.5 | 42.8 | 60.7 | 73.3 | 10.6 | $5 in / $25 out
7 | GPT-5.2 | gpt-5.2-2025-12-11 | multimodal, vision, multi-input reasoning | OpenAI | 72.4 (Programming) | 76.9 | 71.4 | 50.3 | 72.4 | 26.4 | $1.75 in / $14 out
8 | Claude Sonnet 4.6 | claude-sonnet-4-6 | multimodal, vision, multi-input reasoning | Anthropic | 68.9 (Programming) | 66.1 | 30.1 | 49.6 | 68.9 | 13.2 | $3 in / $15 out
9 | Gemini 3 Flash | gemini-3-flash-preview | multimodal, vision, multi-input reasoning | Google | 66.6 (Programming) | 71.3 | 84.9 | 42.5 | 66.6 | 38.9 | $0.5 in / $3 out
10 | MiMo-V2-Pro | mimo-v2-pro | code, programming, tool use | Xiaomi | 66.6 (Programming) | 0.0 | 84.9 | 0.0 | 66.6 | 36.4 | $1 in / $3 out
11 | Gemini 3.1 Pro | gemini-3.1-pro-preview | multimodal, vision, multi-input reasoning | Google | 65.5 (Programming) | 74.3 | 66.8 | 72.3 | 65.5 | 22.1 | $2.5 in / $15 out
12 | GPT-5.5 | gpt-5.5 | multimodal, vision, multi-input reasoning | OpenAI | 65.4 (Programming) | 80.3 | 84.9 | 76.2 | 65.4 | 6.7 | $5 in / $30 out
13 | GLM-5 | glm-5 | code, programming, tool use | Zhipu AI | 65.3 (Programming) | 0.0 | 22.1 | 51.3 | 65.3 | 30.2 | $1 in / $3.2 out
14 | Claude Opus 4.1 | claude-opus-4-1-20250805 | multimodal, vision, multi-input reasoning | Anthropic | 62.9 (Programming) | 48.1 | 30.1 | 66.8 | 62.9 | 7.0 | $15 in / $75 out
15 | Kimi K2-Thinking-0905 | kimi-k2-thinking-0905 | code, programming, tool use | Moonshot AI | 62.5 (Programming) | 69.3 | 0.0 | 53.5 | 62.5 | 0.0 | N/A
16 | Qwen3.6 Plus | qwen3.6-plus | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 62.2 (Programming) | 71.9 | 0.0 | 49.3 | 62.2 | 0.0 | N/A
17 | GPT-5.4 | gpt-5.4 | text, text-to-text, language | OpenAI | 62.1 (Programming) | 76.3 | 51.1 | 63.8 | 62.1 | 18.2 | $2.5 in / $15 out
18 | Seed 2.0 Pro | seed-2.0-pro | multimodal, vision, multi-input reasoning | ByteDance | 61.8 (Programming) | 68.2 | 0.0 | 54.7 | 61.8 | 0.0 | N/A
19 | Qwen3.5-397B-A17B | qwen3.5-397b-a17b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 60.9 (Programming) | 58.6 | 66.8 | 35.6 | 60.9 | 35.3 | $0.6 in / $3.6 out
20 | GPT-5.5 Pro | gpt-5.5-pro | multimodal, vision, multi-input reasoning | OpenAI | 59.1 (Programming) | 67.8 | 84.9 | 71.8 | 59.1 | 0.6 | $30 in / $180 out

Page 1 of 15 · 294 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
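
The note above mentions cost-per-output, but the page does not define it or state the unit behind the "in" and "out" prices. As a rough illustration only, the sketch below parses a price cell in the format shown on this page and computes a single blended figure. The helper names `parse_price` and `blended_price` and the 75/25 input/output split are hypothetical choices for this example, not the leaderboard's actual method.

```python
import re
from typing import Optional, Tuple

# Matches price cells as they appear on this page, e.g. "$0.95 in / $4 out".
PRICE_RE = re.compile(r"^\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out$")


def parse_price(cell: str) -> Optional[Tuple[float, float]]:
    """Return (input_price, output_price), or None for cells such as 'N/A'."""
    match = PRICE_RE.match(cell.strip())
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def blended_price(cell: str, input_share: float = 0.75) -> Optional[float]:
    """Weighted average of input and output price.

    input_share is an assumed fraction of input tokens in a workload;
    the 0.75 default is an arbitrary illustration, not a published figure.
    """
    parsed = parse_price(cell)
    if parsed is None:
        return None
    input_price, output_price = parsed
    return input_share * input_price + (1.0 - input_share) * output_price


if __name__ == "__main__":
    for cell in ["$5 in / $25 out", "$0.95 in / $4 out", "N/A"]:
        print(cell, "->", blended_price(cell))
```

A blended figure like this makes models with very different input and output prices comparable on one axis, but the result depends on the assumed token mix, so it should be recomputed for the workload in question.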
