Updated May 2, 2026

AI Power Rankings

Independent · Frontier AI lab index
Methodology · v1.1

How the Index is built

The Index measures structural AI power: a company's current frontier capability plus the ingredients that produce future capability. A panel of frontier evaluator models scores fifteen labs across ten areas on a 1.0–10.0 scale. The per-area panel mean is the raw figure; the weighted overall described below is the published default.

1. Framing

A single ranking has to optimize for something. We deliberately reject two narrower framings:

  • Raw technological supremacy would over-index on Model Quality and ignore the ingredients that decide who's still leading in 18 months.
  • Long-term commercial dominance would reward current revenue and distribution at the expense of frontier capability — which is what actually moves this market.

Model Quality is the headline number; Research, Compute, Talent, and Data are the inputs that decide where that number goes next. Product, Distribution, and Business matter — but as multipliers on capability, not as substitutes for it.

2. The three tiers

Areas are grouped into three structural tiers totaling 100%.

Core Engine

56%

Foundational prerequisites for frontier AI. A company without strength here cannot compete at the highest level no matter how strong its commercial position.

Area · Weight · Why this weight
🧠 Model Quality · 18% · The headline number. If the models aren't intelligent, nothing downstream matters. Slightly below 20% because Model Quality is a snapshot; R&I is the leading indicator that produces tomorrow's snapshot.
🔬 Research & Innovation · 12% · Promoted out of the accelerants bucket. Frontier breakthroughs (transformers, RLHF, MoE, reasoning) all came from research, and the labs that lead got there by doing the research first. Weighting R&I at 5% rewards copying over originating.
Compute & Infrastructure · 14% · Hard gate on what you can train. Whoever controls the GPUs/TPUs and energy controls the next generation of models.
🏰 Data & Moats · 12% · Proprietary data, flywheels, and structural lock-in are major differentiators. Slightly below 15% because synthetic data, distillation, and shared web-scale pretraining have eroded the data moat faster than expected.

Delivery

28%

Translating raw capability into reach and execution. Critical, but downstream of the Core Engine.

Area · Weight · Why this weight
🏗️ Product & Platform · 10% · How well raw intelligence becomes usable, reliable tools and APIs. Execution at the product layer is where capability becomes power for users and developers.
👥 Talent & Org · 10% · Compounds everything else. Top-tier talent density and leadership's ability to execute and pivot decide whether a Core Engine advantage gets converted into product.
🌐 Distribution & Reach · 8% · Existing OS / browser / cloud distribution is a real structural advantage. It's an amplifier, not a generator, which is why it sits below Talent and Product.

Accelerants & Stabilizers

16%

Important multipliers and indicators, but lagging or narrative-driven rather than structurally decisive in the current AI arms race.

Area · Weight · Why this weight
💰 Business & Market · 5% · Current revenue matters less than one might expect: the industry tolerates massive capex without near-term profitability, and penalizing companies for that would distort the ranking.
🛡️ Safety & Alignment · 6% · Doesn't generate raw power today, but enterprise trust is starting to gate distribution. Slightly above 5% to reflect that emerging dynamic.
🚀 Momentum & Execution · 5% · Captures the present narrative and shipping velocity. A snapshot, not a moat; narratives shift quickly.
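
As a minimal sketch (the dictionary structure and key names are ours, not a published schema), the tiers and weights above can be encoded and sanity-checked in a few lines of Python:

# Illustrative encoding of the v1.1 tiers and weights, in percentage points.
WEIGHTS = {
    "Core Engine": {
        "Model Quality": 18,
        "Research & Innovation": 12,
        "Compute & Infrastructure": 14,
        "Data & Moats": 12,
    },
    "Delivery": {
        "Product & Platform": 10,
        "Talent & Org": 10,
        "Distribution & Reach": 8,
    },
    "Accelerants & Stabilizers": {
        "Business & Market": 5,
        "Safety & Alignment": 6,
        "Momentum & Execution": 5,
    },
}

# Tier subtotals should come out to 56 / 28 / 16 and sum to 100.
tier_totals = {tier: sum(areas.values()) for tier, areas in WEIGHTS.items()}
assert sum(tier_totals.values()) == 100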

3. How the math works

Each company's overall score is a weighted average:

overall = Σ (area_score × area_weight) / 100

Worked example — a hypothetical company:

Area · Score · Weight · Contribution
Model Quality · 9.0 · 18% · 1.62
Research & Innovation · 8.5 · 12% · 1.02
Compute & Infrastructure · 8.0 · 14% · 1.12
Data & Moats · 7.0 · 12% · 0.84
Talent & Org · 9.0 · 10% · 0.90
Product & Platform · 8.0 · 10% · 0.80
Distribution & Reach · 6.0 · 8% · 0.48
Business & Market · 7.0 · 5% · 0.35
Momentum & Execution · 8.0 · 5% · 0.40
Safety & Alignment · 8.0 · 6% · 0.48
Weighted overall · 8.01

A 10.0 in every area still produces a 10.0 weighted overall — the ceiling is preserved.
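
The worked example can be reproduced with a few lines of Python. This is a minimal sketch: the (score, weight) pairs are the hypothetical company above, and the variable names are ours.

# Each area maps to (score, weight-in-percent) for the hypothetical company.
scores = {
    "Model Quality": (9.0, 18),
    "Research & Innovation": (8.5, 12),
    "Compute & Infrastructure": (8.0, 14),
    "Data & Moats": (7.0, 12),
    "Talent & Org": (9.0, 10),
    "Product & Platform": (8.0, 10),
    "Distribution & Reach": (6.0, 8),
    "Business & Market": (7.0, 5),
    "Momentum & Execution": (8.0, 5),
    "Safety & Alignment": (8.0, 6),
}

# overall = Σ (area_score × area_weight) / 100
overall = sum(score * weight for score, weight in scores.values()) / 100
print(round(overall, 2))  # prints 8.01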

4. The 1.0–10.0 rubric

Evaluators score each area on the same anchored scale:

9–10 · World-leading; clear #1 or #2 in this area
7–8 · Very strong; top-tier competitor
5–6 · Competitive; solid but not differentiated
3–4 · Below average; notable gaps
1–2 · Minimal presence or capability in this area
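
For completeness, a score can be mapped back to its rubric band with a small helper. This is an illustrative sketch, not part of the published methodology; it assumes a fractional score such as 8.5 counts toward the lower band.

def rubric_band(score: float) -> str:
    # Anchors follow the 1.0–10.0 rubric above.
    if score >= 9.0:
        return "World-leading"
    if score >= 7.0:
        return "Very strong"
    if score >= 5.0:
        return "Competitive"
    if score >= 3.0:
        return "Below average"
    return "Minimal presence"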

5. Risk-adjusted (optional)

Safety carries a modest weight in the main ranking by design. To prevent companies with serious trust issues from ranking too highly purely on capability and distribution, an optional risk-adjusted overall can be computed alongside the main weighted total:

if safety_score < 4.0:    risk_adjusted = weighted_total × 0.90
elif safety_score < 5.0:  risk_adjusted = weighted_total × 0.95
else:                     risk_adjusted = weighted_total

The risk-adjusted score is exposed as a separate output, not as the headline: folding risk into the power number is useful as a secondary view, but it should not be the default.
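
In runnable form, the same rule reads as follows; this is a sketch, and the function name is ours.

def risk_adjusted(weighted_total: float, safety_score: float) -> float:
    # Check the lower threshold first so the stronger penalty applies.
    if safety_score < 4.0:
        return weighted_total * 0.90
    if safety_score < 5.0:
        return weighted_total * 0.95
    return weighted_total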

6. How the weights were set

Weights were developed through structured back-and-forth between three frontier evaluator models, with a human arbitrating the major calls.

v1 (Claude + Gemini). Gemini proposed the 3-tier grouping. The notable disagreement was R&I: Gemini argued 5% (open research gets copied), Claude argued 12% (leaders originate, followers copy). 12% won; the structural framing was kept.

v1.1 (ChatGPT input). ChatGPT proposed a more output-weighted scheme (Business 12%, Momentum 9%, Talent 5%). The framing dispute was resolved in favor of the structural framing, since output-weighting embeds incumbent bias. Three of its threads were absorbed: Talent 11% → 10%, Product 9% → 10%, and a Safety risk modifier (Section 5) rather than a change to Safety's headline weight.

Grok and additional evaluator models were not consulted for v1.1.

7. Open questions

  • R&I vs. Model Quality overlap. Both reward capability. If evaluators flag this as double-counting, we may rebalance.
  • Safety drift. If enterprise trust becomes a stronger gate on distribution, Safety may need to move up.
  • Compute commoditization. If access becomes meaningfully more even, Compute may need to come down.
  • Annual re-evaluation. The industry shifts fast enough that 2026 weights may not fit 2027.