Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.


© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


294 tracked models · 27 providers · 251 benchmarked · 31.8 avg. index


294 models

| Rank | Model | Model ID | Provider | Tags | Score | Benchmarks | Inference | Agentic | Programming | Value | Price (in / out) |
|------|-------|----------|----------|------|-------|------------|-----------|---------|-------------|-------|------------------|
| 1 | Grok-4.20 Beta Non-Reasoning | grok-4.20-beta-0309-non-reasoning | xAI | multimodal, vision, multi-input reasoning | 97.2 | 0.0 | 97.2 | 0.0 | 0.0 | 28.1 | $2 / $6 |
| 2 | Grok-4.20 Beta Reasoning | grok-4.20-beta-0309-reasoning | xAI | multimodal, vision, multi-input reasoning | 97.2 | 0.0 | 97.2 | 0.0 | 0.0 | 28.1 | $2 / $6 |
| 3 | Gemini 2.0 Flash | gemini-2.0-flash | Google | multimodal, vision, multi-input reasoning | 94.1 | 33.4 | 94.1 | 0.0 | 0.0 | 82.7 | $0.1 / $0.4 |
| 4 | GPT-4.1 nano | gpt-4.1-nano-2025-04-14 | OpenAI | multimodal, vision, multi-input reasoning | 93.7 | 12.6 | 93.7 | 0.0 | 0.0 | 83.0 | $0.1 / $0.4 |
| 5 | Llama 4 Scout | llama-4-scout | Meta | multimodal, vision, multi-input reasoning | 93.0 | 29.2 | 93.0 | 0.0 | 0.0 | 87.2 | $0.08 / $0.3 |
| 6 | Gemini 1.5 Flash | gemini-1.5-flash | Google | multimodal, vision, multi-input reasoning | 92.1 | 23.2 | 92.1 | 0.0 | 0.0 | 71.9 | $0.15 / $0.6 |
| 7 | Gemini 1.5 Flash 8B | gemini-1.5-flash-8b | Google | multimodal, vision, multi-input reasoning | 92.1 | 10.4 | 92.1 | 0.0 | 0.0 | 88.3 | $0.07 / $0.3 |
| 8 | GPT-4.1 mini | gpt-4.1-mini-2025-04-14 | OpenAI | multimodal, vision, multi-input reasoning | 90.9 | 20.8 | 90.9 | 8.9 | 2.6 | 56.8 | $0.4 / $1.6 |
| 9 | GPT-5 mini | gpt-5-mini-2025-08-07 | OpenAI | multimodal, vision, multi-input reasoning | 89.7 | 41.9 | 89.7 | 0.0 | 23.7 | 56.3 | $0.25 / $2 |
| 10 | Gemini 3.1 Flash-Lite | gemini-3.1-flash-lite-preview | Google | multimodal, vision, multi-input reasoning | 84.9 | 56.3 | 84.9 | 0.0 | 0.0 | 50.6 | $0.25 / $1.5 |
| 11 | Gemini 3 Flash | gemini-3-flash-preview | Google | multimodal, vision, multi-input reasoning | 84.9 | 71.3 | 84.9 | 42.5 | 66.6 | 38.9 | $0.5 / $3 |
| 12 | GPT-5.5 | gpt-5.5 | OpenAI | multimodal, vision, multi-input reasoning | 84.9 | 80.3 | 84.9 | 76.2 | 65.4 | 6.7 | $5 / $30 |
| 13 | GPT-5.5 Pro | gpt-5.5-pro | OpenAI | multimodal, vision, multi-input reasoning | 84.9 | 67.8 | 84.9 | 71.8 | 59.1 | 0.6 | $30 / $180 |
| 14 | MiMo-V2-Pro | mimo-v2-pro | Xiaomi | code, programming, tool use | 84.9 | 0.0 | 84.9 | 0.0 | 66.6 | 36.4 | $1 / $3 |
| 15 | MiniMax M1 80K | minimax-m1-80k | MiniMax | code, programming, tool use | 84.9 | 24.6 | 84.9 | 20.9 | 19.4 | 41.7 | $0.55 / $2.2 |
| 16 | Ministral 3 (8B Reasoning 2512) | ministral-8b-latest | Mistral AI | multimodal, vision, multi-input reasoning | 84.8 | 31.8 | 84.8 | 0.0 | 0.0 | 92.1 | $0.15 / $0.15 |
| 17 | LongCat-Flash-Lite | longcat-flash-lite | Meituan | code, programming, tool use | 83.8 | 24.7 | 83.8 | 29.5 | 25.3 | 83.3 | $0.1 / $0.4 |
| 18 | MiMo-V2-Flash | mimo-v2-flash | Xiaomi | code, programming, tool use | 79.8 | 53.7 | 79.8 | 27.2 | 39.3 | 85.9 | $0.1 / $0.3 |
| 19 | Ministral 3 (3B Reasoning 2512) | ministral-3b-latest | Mistral AI | multimodal, vision, multi-input reasoning | 79.7 | 22.1 | 79.7 | 0.0 | 0.0 | 95.8 | $0.1 / $0.1 |
| 20 | GPT-5.4 Mini | gpt-5.4-mini | OpenAI | text, text-to-text, language | 77.4 | 57.4 | 77.4 | 27.1 | 26.9 | 32.8 | $0.75 / $4.5 |

All 20 models on this page lead on the Inference dimension; prices are listed as input / output rates.
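The "in / out" prices above can be turned into a rough per-request cost estimate. A minimal sketch, assuming the prices follow the usual per-million-token convention (the page itself does not state the unit):

```python
def request_cost(price_in, price_out, tokens_in, tokens_out):
    """Dollar cost of one request, assuming prices are quoted per 1M tokens."""
    return (price_in * tokens_in + price_out * tokens_out) / 1_000_000

# GPT-5.5 ($5 in / $30 out) vs. Llama 4 Scout ($0.08 in / $0.3 out)
# for a 2,000-token prompt with a 500-token completion:
print(request_cost(5.00, 30.00, 2_000, 500))   # 0.025
print(request_cost(0.08, 0.30, 2_000, 500))    # 0.00031
```

Under that assumption, the listed rates translate into a roughly 80x per-request cost gap for this prompt shape, which is the kind of spread the Value column rewards.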
Page 1 of 15 · 294 models


Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
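The aggregation formula behind the multi-dimensional evaluation is not published. Purely as an illustration of the mechanics, a weighted composite over the five dimensions could look like the following; the weights here are hypothetical, chosen for the sketch, and are not Skytells' actual ones:

```python
# Hypothetical weights: the page does not disclose how dimensions combine.
WEIGHTS = {"benchmarks": 0.30, "inference": 0.30, "agentic": 0.15,
           "programming": 0.15, "value": 0.10}

def composite_index(scores):
    """Weighted mean of per-dimension scores (each on a 0-100 scale)."""
    return sum(w * scores.get(dim, 0.0) for dim, w in WEIGHTS.items())

# GPT-5.5's per-dimension scores from the leaderboard:
gpt55 = {"benchmarks": 80.3, "inference": 84.9, "agentic": 76.2,
         "programming": 65.4, "value": 6.7}
print(round(composite_index(gpt55), 2))  # 71.47
```

Note that this does not reproduce the published Score column (which on this page tracks each model's Inference score); it only shows how a multi-dimensional index can be collapsed to one number.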
