Skytells
HomeModelsCLIChangelog
  • Home
  • Models
  • CLI
  • Changelog
Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

Get Started

  • Console
  • Learn
  • Documentation
  • API Reference
  • Pricing
  • ModelsNew

Platform

  • Cloud AgentsNew
  • AI Solutions
  • Infrastructure
  • Edge Network
  • Trust Center
  • CLI

Resources

  • Blog
  • Changelog
  • AI Leaderboard
  • Research
  • Status

Company

  • About
  • Careers
  • Legal
  • Privacy Policy

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.

Explore full leaderboardBrowse model catalog

309

Tracked models

27

Providers

264

Benchmarked

11.8

Avg. index

OverallBenchmarksInferenceAgenticProgrammingValue / Price

309 models

RankModelProviderScoreBenchmarksInferenceAgenticProgrammingValuePrice
1

Claude Opus 4.8

claude-opus-4-8

multimodalvisionmulti-input reasoning
Anthropic

80.0

Agentic

75.231.580.082.06.3$5 in / $25 out
2

Gemini 3.5 Flash

gemini-3.5-flash

multimodalvisionmulti-input reasoning
Google

74.4

Agentic

62.889.274.430.526.6
3

Claude Sonnet 4.5

claude-sonnet-4-5-20250929

multimodalvisionmulti-input reasoning
Anthropic

71.8

Agentic

51.914.671.874.69.3
4

GPT-5.5 Pro

gpt-5.5-pro

multimodalvisionmulti-input reasoning
OpenAI

71.8

Agentic

67.80.071.860.10.0N/A
5

Claude Mythos Preview

claude-mythos-preview

multimodalvisionmulti-input reasoning
Anthropic

70.2

Agentic

80.00.070.284.20.0
6

GPT-5.5

gpt-5.5

multimodalvisionmulti-input reasoning
OpenAI

70.2

Agentic

80.493.770.261.61.9$5 in / $30 out
7

Gemini 3.1 Pro

gemini-3.1-pro-preview

multimodalvisionmulti-input reasoning
Google

68.9

Agentic

73.859.468.966.018.5
8

Claude Opus 4.1

claude-opus-4-1-20250805

multimodalvisionmulti-input reasoning
Anthropic

67.4

Agentic

46.40.067.462.00.0
9

Muse Spark

muse-spark

multimodalvisionmulti-input reasoning
MMeta

64.1

Agentic

69.90.064.139.10.0N/A
10

Claude Opus 4.7

claude-opus-4-7

multimodalvisionmulti-input reasoning
Anthropic

63.8

Agentic

76.631.563.879.96.3
11

Qwen3.7 Max

qwen3.7-max

multimodalvisionmulti-input reasoning
AAlibaba Cloud / Qwen Team

61.7

Agentic

66.172.261.781.535.4$1.25 in / $3.75 out
12

DeepSeek-V4-Pro-Max

deepseek-v4-pro-max

codeprogrammingtool use
DeepSeek

61.3

Agentic

67.489.261.358.634.2
13

Gemini 3 Pro

gemini-3-pro-preview

multimodalvisionmulti-input reasoning
Google

60.7

Agentic

72.00.060.754.60.0
14

Claude Opus 4.6

claude-opus-4-6

multimodalvisionmulti-input reasoning
Anthropic

57.8

Agentic

78.231.557.872.86.3
15

Kimi K2.6

kimi-k2.6

texttext-to-textlanguage
Moonshot AI

57.6

Agentic

67.041.157.675.436.7
16

Claude Opus 4

claude-opus-4-20250514

multimodalvisionmulti-input reasoning
Anthropic

57.4

Agentic

36.40.057.447.80.0
17

Nova 2 Pro

nova-2-pro

multimodalvisionmulti-input reasoning
AAmazon

57.2

Agentic

46.80.057.250.60.0N/A
18

GPT-5.4

gpt-5.4

texttext-to-textlanguage
OpenAI

56.2

Agentic

75.338.956.260.614.1
19

Qwen3 VL 235B A22B Instruct

qwen3-vl-235b-a22b-instruct

multimodalvisionmulti-input reasoning
AAlibaba Cloud / Qwen Team

55.8

Agentic

36.141.155.80.062.0
20

GLM-5V-Turbo

glm-5v-turbo

multimodalvisionmulti-input reasoning
ZZhipu AI

54.9

Agentic

0.00.054.90.00.0N/A
1

Claude Opus 4.8

Anthropic

80.0

$5 in / $25 out

2

Gemini 3.5 Flash

Google

74.4

$1.5 in / $9 out

3

Claude Sonnet 4.5

Anthropic

71.8

$3 in / $15 out

Page 1 of 16 · 309 models

Next

Want benchmark charts, model comparison, and pricing analytics?

Sign in to access the full interactive leaderboard with deep benchmark breakdowns and model comparison tools.

Open full leaderboard

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.

$1.5 in / $9 out
$3 in / $15 out
N/A
$2.5 in / $15 out
N/A
$5 in / $25 out
$1.74 in / $3.48 out
N/A
$5 in / $25 out
$0.95 in / $4 out
N/A
$2.5 in / $15 out
$0.3 in / $1.49 out
4

GPT-5.5 Pro

OpenAI

71.8

N/A

5

Claude Mythos Preview

Anthropic

70.2

N/A

6

GPT-5.5

OpenAI

70.2

$5 in / $30 out

7

Gemini 3.1 Pro

Google

68.9

$2.5 in / $15 out

8

Claude Opus 4.1

Anthropic

67.4

N/A

9
M

Muse Spark

Meta

64.1

N/A

10

Claude Opus 4.7

Anthropic

63.8

$5 in / $25 out

11
A

Qwen3.7 Max

Alibaba Cloud / Qwen Team

61.7

$1.25 in / $3.75 out

12

DeepSeek-V4-Pro-Max

DeepSeek

61.3

$1.74 in / $3.48 out

13

Gemini 3 Pro

Google

60.7

N/A

14

Claude Opus 4.6

Anthropic

57.8

$5 in / $25 out

15

Kimi K2.6

Moonshot AI

57.6

$0.95 in / $4 out

16

Claude Opus 4

Anthropic

57.4

N/A

17
A

Nova 2 Pro

Amazon

57.2

N/A

18

GPT-5.4

OpenAI

56.2

$2.5 in / $15 out

19
A

Qwen3 VL 235B A22B Instruct

Alibaba Cloud / Qwen Team

55.8

$0.3 in / $1.49 out

20
Z

GLM-5V-Turbo

Zhipu AI

54.9

N/A