Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.

  • Tracked models: 309
  • Providers: 27
  • Benchmarked: 264
  • Avg. index: 29.3

| Rank | Model | Model ID | Tags | Provider | Score (overall) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Mythos Preview | claude-mythos-preview | multimodal, vision, multi-input reasoning | Anthropic | 78.1 | 80.0 | 0.0 | 70.2 | 84.2 | 0.0 | N/A |
| 2 | Grok-4 Heavy | grok-4-heavy | multimodal, vision, multi-input reasoning | xAI | 72.0 | 72.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 3 | GPT-5.1 High | gpt-5.1-high-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 68.3 | 68.3 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 4 | GPT-5.5 | gpt-5.5 | multimodal, vision, multi-input reasoning | OpenAI | 68.1 | 80.4 | 93.7 | 70.2 | 61.6 | 1.9 | $5 in / $30 out |
| 5 | GPT-5.5 Pro | gpt-5.5-pro | multimodal, vision, multi-input reasoning | OpenAI | 66.8 | 67.8 | 0.0 | 71.8 | 60.1 | 0.0 | N/A |
| 6 | Grok-4.1 Fast Non-Reasoning | grok-4-1-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 66.6 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 7 | Grok-4.1 Fast Reasoning | grok-4-1-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 66.6 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 8 | Grok-4 Fast Non-Reasoning | grok-4-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 66.6 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 9 | Grok-4 Fast Reasoning | grok-4-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 66.6 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 10 | Qwen3.7 Max | qwen3.7-max | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 66.3 | 66.1 | 72.2 | 61.7 | 81.5 | 35.4 | $1.25 in / $3.75 out |
| 11 | DeepSeek-V4-Pro-Max | deepseek-v4-pro-max | code, programming, tool use | DeepSeek | 64.2 | 67.4 | 89.2 | 61.3 | 58.6 | 34.2 | $1.74 in / $3.48 out |
| 12 | Claude Opus 4.8 | claude-opus-4-8 | multimodal, vision, multi-input reasoning | Anthropic | 64.0 | 75.2 | 31.5 | 80.0 | 82.0 | 6.3 | $5 in / $25 out |
| 13 | MiMo-V2-Pro | mimo-v2-pro | code, programming, tool use | Xiaomi | 63.7 | 0.0 | 0.0 | 0.0 | 63.7 | 0.0 | N/A |
| 14 | Gemini 3 Pro | gemini-3-pro-preview | multimodal, vision, multi-input reasoning | Google | 63.2 | 72.0 | 0.0 | 60.7 | 54.6 | 0.0 | N/A |
| 15 | Gemini 3.1 Pro | gemini-3.1-pro-preview | multimodal, vision, multi-input reasoning | Google | 63.1 | 73.8 | 59.4 | 68.9 | 66.0 | 18.5 | $2.5 in / $15 out |
| 16 | DeepSeek-V3.2 (Non-thinking) | deepseek-chat | text, inference | DeepSeek | 63.1 | 0.0 | 53.0 | 0.0 | 0.0 | 79.3 | $0.28 in / $0.42 out |
| 17 | GPT-5 High | gpt-5-high-2025-08-07 | multimodal, vision, multi-input reasoning | OpenAI | 62.7 | 62.7 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 18 | Nova 2 Sonic | nova-2-sonic | multimodal, vision, multi-input reasoning | Amazon | 62.4 | 0.0 | 72.2 | 0.0 | 0.0 | 46.8 | $0.33 in / $2.75 out |
| 19 | Gemini 3.1 Flash-Lite | gemini-3.1-flash-lite-preview | multimodal, vision, multi-input reasoning | Google | 61.8 | 55.3 | 72.2 | 0.0 | 0.0 | 63.3 | $0.25 in / $1.5 out |
| 20 | DeepSeek-V4-Flash-Max | deepseek-v4-flash-max | code, programming, tool use | DeepSeek | 61.6 | 58.3 | 89.2 | 47.6 | 44.2 | 98.7 | $0.14 in / $0.28 out |

Page 1 of 16 · 309 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
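
The ranking note says the overall score combines several dimensions, but the page does not publish the formula or the weights. As an illustration only, here is a minimal sketch of one way such a composite could be computed: a weighted mean over the dimensions that have data. The function name `composite_score`, the equal weights, and the rule that a 0.0 entry means "not evaluated" are all assumptions made for this example, not the leaderboard's documented method.

```python
from typing import Mapping, Optional

# Hypothetical equal weights; the leaderboard's real weights are not published.
DEFAULT_WEIGHTS = {
    "benchmarks": 1.0,
    "inference": 1.0,
    "agentic": 1.0,
    "programming": 1.0,
    "value": 1.0,
}


def composite_score(
    scores: Mapping[str, float],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> Optional[float]:
    """Weighted mean over the dimensions that have data.

    Assumption: a score of 0.0 means "not evaluated" and is skipped
    instead of being counted as a real zero.
    Returns None when no dimension has data.
    """
    total = 0.0
    weight_sum = 0.0
    for name, value in scores.items():
        weight = weights.get(name, 0.0)
        if value > 0.0 and weight > 0.0:
            total += weight * value
            weight_sum += weight
    if weight_sum == 0.0:
        return None
    return round(total / weight_sum, 1)


# Dimension scores copied from the Grok-4 Heavy row of the leaderboard.
grok_4_heavy = {
    "benchmarks": 72.0,
    "inference": 0.0,
    "agentic": 0.0,
    "programming": 0.0,
    "value": 0.0,
}
grok_4_heavy_overall = composite_score(grok_4_heavy)
```

When only one dimension has data, the result equals that dimension whatever the weights are, which is consistent with the Grok-4 Heavy row (72.0 overall, 72.0 benchmarks). Rows with several measured dimensions would need the real weights, so this sketch should not be expected to reproduce their overall scores.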