Skytells
Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.

  • Tracked models: 294
  • Providers: 27
  • Benchmarked: 251
  • Avg. index: 11.4

| Rank | Model | ID | Tags | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-5.5 | gpt-5.5 | multimodal, vision, multi-input reasoning | OpenAI | 76.2 (Agentic) | 80.3 | 84.9 | 76.2 | 65.4 | 6.7 | $5 in / $30 out |
| 2 | Gemini 3.1 Pro | gemini-3.1-pro-preview | multimodal, vision, multi-input reasoning | Google | 72.3 (Agentic) | 74.3 | 66.8 | 72.3 | 65.5 | 22.1 | $2.5 in / $15 out |
| 3 | Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | multimodal, vision, multi-input reasoning | Anthropic | 71.8 (Agentic) | 53.3 | 30.1 | 71.8 | 74.6 | 13.2 | $3 in / $15 out |
| 4 | GPT-5.5 Pro | gpt-5.5-pro | multimodal, vision, multi-input reasoning | OpenAI | 71.8 (Agentic) | 67.8 | 84.9 | 71.8 | 59.1 | 0.6 | $30 in / $180 out |
| 5 | Claude Mythos Preview | claude-mythos-preview | multimodal, vision, multi-input reasoning | Anthropic | 70.0 (Agentic) | 80.0 | 0.0 | 70.0 | 84.2 | 1.7 | $25 in / $125 out |
| 6 | Claude Opus 4.7 | claude-opus-4-7 | multimodal, vision, multi-input reasoning | Anthropic | 69.2 (Agentic) | 76.8 | 42.8 | 69.2 | 81.2 | 10.6 | $5 in / $25 out |
| 7 | Muse Spark | muse-spark | multimodal, vision, multi-input reasoning | Meta | 67.3 (Agentic) | 71.0 | 0.0 | 67.3 | 41.3 | 0.0 | N/A |
| 8 | Claude Opus 4.1 | claude-opus-4-1-20250805 | multimodal, vision, multi-input reasoning | Anthropic | 66.8 (Agentic) | 48.1 | 30.1 | 66.8 | 62.9 | 7.0 | $15 in / $75 out |
| 9 | Gemini 3 Pro | gemini-3-pro-preview | multimodal, vision, multi-input reasoning | Google | 63.8 (Agentic) | 73.3 | 0.0 | 63.8 | 57.4 | 0.0 | N/A |
| 10 | GPT-5.4 | gpt-5.4 | text, text-to-text, language | OpenAI | 63.8 (Agentic) | 76.3 | 51.1 | 63.8 | 62.1 | 18.2 | $2.5 in / $15 out |
| 11 | Claude Opus 4.6 | claude-opus-4-6 | multimodal, vision, multi-input reasoning | Anthropic | 60.7 (Agentic) | 79.5 | 42.8 | 60.7 | 73.3 | 10.6 | $5 in / $25 out |
| 12 | Claude Opus 4 | claude-opus-4-20250514 | multimodal, vision, multi-input reasoning | Anthropic | 57.9 (Agentic) | 37.8 | 0.0 | 57.9 | 49.5 | 0.0 | N/A |
| 13 | Qwen3 VL 235B A22B Instruct | qwen3-vl-235b-a22b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 56.7 (Agentic) | 37.1 | 66.8 | 56.7 | 0.0 | 49.4 | $0.3 in / $1.5 out |
| 14 | MiniMax M2.1 | minimax-m2.1 | code, programming, tool use | MiniMax | 56.6 (Agentic) | 42.7 | 73.9 | 56.6 | 50.6 | 57.7 | $0.3 in / $1.2 out |
| 15 | GPT-5.2 Pro | gpt-5.2-pro-2025-12-11 | multimodal, vision, multi-input reasoning | OpenAI | 56.4 (Agentic) | 67.3 | 31.3 | 56.4 | 0.0 | 2.5 | $21 in / $168 out |
| 16 | GLM-5V-Turbo | glm-5v-turbo | multimodal, vision, multi-input reasoning | Zhipu AI | 54.9 (Agentic) | 0.0 | 0.0 | 54.9 | 0.0 | 0.0 | N/A |
| 17 | Seed 2.0 Pro | seed-2.0-pro | multimodal, vision, multi-input reasoning | ByteDance | 54.7 (Agentic) | 68.2 | 0.0 | 54.7 | 61.8 | 0.0 | N/A |
| 18 | GLM-5.1 | glm-5.1 | code, programming, tool use | Zhipu AI | 54.4 (Agentic) | 67.1 | 46.6 | 54.4 | 58.3 | 30.6 | $1.4 in / $4.4 out |
| 19 | Claude Haiku 4.5 | claude-haiku-4-5-20251001 | multimodal, vision, multi-input reasoning | Anthropic | 54.2 (Agentic) | 32.9 | 61.2 | 54.2 | 57.2 | 37.7 | $1 in / $5 out |
| 20 | Kimi K2-Thinking-0905 | kimi-k2-thinking-0905 | code, programming, tool use | Moonshot AI | 53.5 (Agentic) | 69.3 | 0.0 | 53.5 | 62.5 | 0.0 | N/A |

Page 1 of 15 · 294 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
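
Because the leaderboard scores each model on several dimensions, the ordering changes depending on which column you sort by. The following is a minimal Python sketch, not part of the Skytells platform or its API, that re-ranks a handful of rows from this page by a chosen dimension and estimates the cost of a workload from the listed price string. The `Row` class and the helper names (`parse_price`, `rank_by`, `estimate_cost`) are hypothetical, and the page does not state the pricing unit: the sketch assumes prices are US dollars per one million tokens, which you should confirm before relying on the numbers.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Row:
    """One leaderboard row (hypothetical structure for illustration)."""
    model: str
    benchmarks: float
    inference: float
    agentic: float
    programming: float
    value: float
    price: str  # as displayed on the page, e.g. "$5 in / $30 out" or "N/A"


# A few rows copied from the leaderboard on this page.
ROWS = [
    Row("GPT-5.5", 80.3, 84.9, 76.2, 65.4, 6.7, "$5 in / $30 out"),
    Row("Gemini 3.1 Pro", 74.3, 66.8, 72.3, 65.5, 22.1, "$2.5 in / $15 out"),
    Row("Claude Sonnet 4.5", 53.3, 30.1, 71.8, 74.6, 13.2, "$3 in / $15 out"),
    Row("MiniMax M2.1", 42.7, 73.9, 56.6, 50.6, 57.7, "$0.3 in / $1.2 out"),
    Row("Muse Spark", 71.0, 0.0, 67.3, 41.3, 0.0, "N/A"),
]

# Matches strings of the form "$<input> in / $<output> out".
PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out")


def parse_price(text: str) -> Optional[Tuple[float, float]]:
    """Return (input_price, output_price), or None when no price is listed."""
    match = PRICE_RE.fullmatch(text.strip())
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def rank_by(rows: List[Row], column: str) -> List[Row]:
    """Sort rows from highest to lowest on the named score column."""
    return sorted(rows, key=lambda row: getattr(row, column), reverse=True)


def estimate_cost(row: Row, input_tokens: int, output_tokens: int,
                  tokens_per_unit: int = 1_000_000) -> Optional[float]:
    """Estimate workload cost in dollars.

    ASSUMPTION: prices are per `tokens_per_unit` tokens (one million by
    default). The page does not state the unit.
    """
    price = parse_price(row.price)
    if price is None:
        return None
    input_price, output_price = price
    return (input_tokens * input_price + output_tokens * output_price) / tokens_per_unit


if __name__ == "__main__":
    for column in ("agentic", "programming", "value"):
        top = rank_by(ROWS, column)[0]
        print(f"Top by {column}: {top.model}")
    print(estimate_cost(ROWS[0], input_tokens=2_000_000, output_tokens=1_000_000))
```

Among these five rows, the leader differs by column: GPT-5.5 on Agentic, Claude Sonnet 4.5 on Programming, and MiniMax M2.1 on Value. Rows whose price is listed as N/A return `None` from `estimate_cost` rather than a guessed figure.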