Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model, ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency. Rankings are updated continuously from published evaluation data.

  • Tracked models: 294
  • Providers: 27
  • Benchmarked: 251
  • Avg. index: 27.4

| Rank | Model | Model ID | Tags | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|------|-------|----------|------|----------|-------|------------|-----------|---------|-------------|-------|-------|
| 1 | GPT-5.5 | gpt-5.5 | multimodal, vision, multi-input reasoning | OpenAI | 80.3 | 80.3 | 84.9 | 76.2 | 65.4 | 6.7 | $5 in / $30 out |
| 2 | Claude Mythos Preview | claude-mythos-preview | multimodal, vision, multi-input reasoning | Anthropic | 80.0 | 80.0 | 0.0 | 70.0 | 84.2 | 1.7 | $25 in / $125 out |
| 3 | Claude Opus 4.6 | claude-opus-4-6 | multimodal, vision, multi-input reasoning | Anthropic | 79.5 | 79.5 | 42.8 | 60.7 | 73.3 | 10.6 | $5 in / $25 out |
| 4 | GPT-5.2 | gpt-5.2-2025-12-11 | multimodal, vision, multi-input reasoning | OpenAI | 76.9 | 76.9 | 71.4 | 50.3 | 72.4 | 26.4 | $1.75 in / $14 out |
| 5 | Claude Opus 4.7 | claude-opus-4-7 | multimodal, vision, multi-input reasoning | Anthropic | 76.8 | 76.8 | 42.8 | 69.2 | 81.2 | 10.6 | $5 in / $25 out |
| 6 | GPT-5.4 | gpt-5.4 | text, text-to-text, language | OpenAI | 76.3 | 76.3 | 51.1 | 63.8 | 62.1 | 18.2 | $2.5 in / $15 out |
| 7 | Gemini 3.1 Pro | gemini-3.1-pro-preview | multimodal, vision, multi-input reasoning | Google | 74.3 | 74.3 | 66.8 | 72.3 | 65.5 | 22.1 | $2.5 in / $15 out |
| 8 | Gemini 3 Pro | gemini-3-pro-preview | multimodal, vision, multi-input reasoning | Google | 73.3 | 73.3 | 0.0 | 63.8 | 57.4 | 0.0 | N/A |
| 9 | Grok-4 Heavy | grok-4-heavy | multimodal, vision, multi-input reasoning | xAI | 72.4 | 72.4 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 10 | Qwen3.6 Plus | qwen3.6-plus | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 71.9 | 71.9 | 0.0 | 49.3 | 62.2 | 0.0 | N/A |
| 11 | Gemini 3 Flash | gemini-3-flash-preview | multimodal, vision, multi-input reasoning | Google | 71.3 | 71.3 | 84.9 | 42.5 | 66.6 | 38.9 | $0.5 in / $3 out |
| 12 | Muse Spark | muse-spark | multimodal, vision, multi-input reasoning | Meta | 71.0 | 71.0 | 0.0 | 67.3 | 41.3 | 0.0 | N/A |
| 13 | Kimi K2-Thinking-0905 | kimi-k2-thinking-0905 | code, programming, tool use | Moonshot AI | 69.3 | 69.3 | 0.0 | 53.5 | 62.5 | 0.0 | N/A |
| 14 | GPT-5.1 High | gpt-5.1-high-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 68.7 | 68.7 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 15 | Kimi K2.6 | kimi-k2.6 | multimodal, vision, multi-input reasoning | Moonshot AI | 68.5 | 68.5 | 66.8 | 45.3 | 81.0 | 33.3 | $0.95 in / $4 out |
| 16 | Seed 2.0 Pro | seed-2.0-pro | multimodal, vision, multi-input reasoning | ByteDance | 68.2 | 68.2 | 0.0 | 54.7 | 61.8 | 0.0 | N/A |
| 17 | Kimi K2.5 | kimi-k2.5 | multimodal, vision, multi-input reasoning | Moonshot AI | 68.0 | 68.0 | 66.8 | 49.5 | 48.5 | 38.1 | $0.6 in / $3 out |
| 18 | GPT-5.5 Pro | gpt-5.5-pro | multimodal, vision, multi-input reasoning | OpenAI | 67.8 | 67.8 | 84.9 | 71.8 | 59.1 | 0.6 | $30 in / $180 out |
| 19 | GPT-5.2 Pro | gpt-5.2-pro-2025-12-11 | multimodal, vision, multi-input reasoning | OpenAI | 67.3 | 67.3 | 31.3 | 56.4 | 0.0 | 2.5 | $21 in / $168 out |
| 20 | GLM-5.1 | glm-5.1 | code, programming, tool use | Zhipu AI | 67.1 | 67.1 | 46.6 | 54.4 | 58.3 | 30.6 | $1.4 in / $4.4 out |

Page 1 of 15 · 294 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
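
Price cells on the leaderboard take the form "$5 in / $30 out", or "N/A" when no price is listed. The page does not state the unit, so the sketch below assumes USD per one million tokens, which is a common convention but is not confirmed here. It shows one way to parse a price cell and estimate the cost of a single request. The names `Price`, `parse_price`, and `request_cost` are hypothetical helpers introduced for illustration and are not part of any Skytells API.

```python
import re
from typing import NamedTuple, Optional

# ASSUMPTION: prices are quoted in USD per one million tokens.
# The leaderboard page does not state the unit; adjust if it differs.
TOKENS_PER_PRICE_UNIT = 1_000_000


class Price(NamedTuple):
    """Input and output price for one model, in USD per price unit."""
    input_usd: float
    output_usd: float


# Matches cells such as "$5 in / $30 out" or "$1.75 in / $14 out".
_PRICE_RE = re.compile(r"^\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out$")


def parse_price(cell: str) -> Optional[Price]:
    """Parse a price cell; return None for 'N/A' or any unrecognized text."""
    match = _PRICE_RE.match(cell.strip())
    if match is None:
        return None
    return Price(float(match.group(1)), float(match.group(2)))


def request_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under the unit assumption above."""
    total = input_tokens * price.input_usd + output_tokens * price.output_usd
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Price cells copied from the leaderboard rows.
    cells = {
        "GPT-5.5": "$5 in / $30 out",
        "GLM-5.1": "$1.4 in / $4.4 out",
        "Grok-4 Heavy": "N/A",
    }
    for model, cell in cells.items():
        price = parse_price(cell)
        if price is None:
            print(f"{model}: no price listed")
        else:
            # Hypothetical workload: 10,000 input tokens, 2,000 output tokens.
            print(f"{model}: ${request_cost(price, 10_000, 2_000):.4f} per request")
```

Returning `None` for "N/A" keeps unpriced models out of any cost comparison instead of treating them as free.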