A data benchmark is a promise. The promise is that the number published this week was measured the same way as the number published last week, will be measured the same way as the number published next week, and can be verified by anyone willing to look at the sources. This page describes exactly how that promise is kept — and where its limits are.
The commitments.
Before any methodology detail, four commitments define how the Compute Price Index is produced. These commitments are not aspirational. They are the standard against which every published Index issue can be audited.
The panel.
The Compute Price Index is calculated from a panel of nineteen GPU cloud providers selected to represent the most economically significant participants in the public-rate AI compute market. The panel is organized into five tiers, each capturing a structurally distinct segment of the market.
Sixteen providers (Tiers 1 through 4) publish on-demand GPU rental rates in dollars-per-GPU-hour. These are measured directly. Three providers (Tier 5) price their service in dollars-per-million-tokens for inference workloads. For these, the Index publishes a derived per-GPU-hour equivalent using a stated throughput assumption. The two categories are kept methodologically distinct in every issue.
The direct rental procedure.
For Tiers 1 through 4, the Compute Price Index is produced through a five-step weekly procedure. Each step is auditable — a subscriber who wants to verify the methodology can reproduce it.
Step one — Hardware definition
The Index tracks five specific hardware configurations, not general product categories. H100 SXM means the Nvidia H100 SXM 80GB variant. H100 PCIe means the Nvidia H100 PCIe 80GB variant. H200 means the Nvidia H200 SXM 141GB variant. B200 means the Nvidia B200 SXM variant. A100 80GB means the Nvidia A100 80GB SXM variant. Where providers offer multiple regional or networking sub-variants, we capture the most representative configuration and disclose the choice in coverage notes.
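The five definitions above amount to a fixed lookup from Index label to exact variant. A minimal sketch, assuming a plain dictionary encoding (the structure is illustrative, not an artifact of the Index itself):

```python
# Illustrative mapping of Index hardware labels to the exact variant each
# label means, per the Step 1 definitions. The dict itself is a
# hypothetical encoding, not an official artifact of the Index.
TRACKED_HARDWARE = {
    "H100 SXM":  "Nvidia H100 SXM 80GB",
    "H100 PCIe": "Nvidia H100 PCIe 80GB",
    "H200":      "Nvidia H200 SXM 141GB",
    "B200":      "Nvidia B200 SXM",
    "A100 80GB": "Nvidia A100 80GB SXM",
}
```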
Step two — Rate capture
Every Monday, each tracked provider's public pricing page is reviewed. For each tracked hardware configuration, the published on-demand hourly rate is recorded. The data point captured is always the same: per-GPU, per-hour, on-demand, US region, single-instance, without committed-use discount applied. The provider URL and capture timestamp are logged alongside every rate.
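A capture record along these lines holds everything Step 2 logs. The field names and sample values are hypothetical — a sketch of the shape of the data, not the Index's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class RateCapture:
    provider: str             # panel provider name
    hardware: str             # tracked configuration, e.g. "H100 SXM"
    usd_per_gpu_hour: float   # on-demand, US region, no discount applied
    source_url: str           # public pricing page the rate was read from
    captured_at: datetime     # capture timestamp, logged with every rate

# Hypothetical Monday observation (provider, rate, and URL are invented).
obs = RateCapture(
    provider="ExampleCloud",
    hardware="H100 SXM",
    usd_per_gpu_hour=2.49,
    source_url="https://example.com/pricing",
    captured_at=datetime(2025, 1, 6, 14, 0, tzinfo=timezone.utc),
)
```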
Step three — Normalization
Providers publish pricing in different formats — per-GPU, per-node (commonly 8 GPUs), or per-instance. The Index normalizes all rates to a per-GPU-per-hour basis. Where a provider offers only node-level pricing, the node rate is divided by the GPU count in the node, and this conversion is documented. Where a provider bundles storage or networking into the GPU rate, the bundled rate is used and the bundling is disclosed.
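The Step 3 conversion is a single division. A minimal sketch, assuming a simple `basis` flag to distinguish the published formats:

```python
def normalize_to_per_gpu(rate: float, basis: str, gpus_per_unit: int = 1) -> float:
    """Normalize a published rate to per-GPU-per-hour.

    basis: "gpu" if the rate is already per-GPU; "node" or "instance"
    if it covers a multi-GPU unit (commonly 8 GPUs per node).
    """
    if basis == "gpu":
        return rate
    if basis in ("node", "instance"):
        if gpus_per_unit < 1:
            raise ValueError("gpus_per_unit must be at least 1")
        return rate / gpus_per_unit
    raise ValueError(f"unknown pricing basis: {basis!r}")

# A hypothetical 8-GPU node at $24.00/hour normalizes to $3.00 per GPU-hour.
per_gpu = normalize_to_per_gpu(24.00, "node", gpus_per_unit=8)  # 3.0
```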
Step four — Calculation
The Index value for each hardware configuration is the median rate across the providers reporting that hardware in that week. The median is used rather than the mean because the publicly quoted market includes a substantial spread between hyperscalers and aggregators, and a mean would be distorted by that spread. The observed range — minimum and maximum across the panel — is reported alongside the median, along with the count of providers reporting that hardware in the week.
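Step 4 reduces to a median plus a range. A sketch with invented panel rates, showing why the median resists the hyperscaler-aggregator spread:

```python
from statistics import median

def weekly_index(rates: list[float]) -> dict:
    """Median, observed range, and provider count for one hardware
    configuration in one week (rates already normalized per-GPU-hour)."""
    if not rates:
        raise ValueError("no providers reported this hardware this week")
    return {
        "median": median(rates),
        "min": min(rates),
        "max": max(rates),
        "providers": len(rates),
    }

# Five hypothetical panel rates: the high-priced outlier drags the mean
# up toward 3.96, while the median stays at the middle observation, 2.49.
panel = [1.99, 2.25, 2.49, 3.10, 9.98]
result = weekly_index(panel)  # {'median': 2.49, 'min': 1.99, 'max': 9.98, 'providers': 5}
```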
Step five — Publication
The weekly Index publishes the median, range, provider count, week-over-week delta, and year-over-year delta (once sufficient historical data exists) for each tracked hardware configuration. Every issue includes the full panel composition for that week, any panel changes from the prior week, and any methodology notes relevant to interpretation.
The inference-implied procedure.
For Tier 5 — the three token-priced inference providers — the Index publishes a derived per-GPU-hour rate. The derivation is mechanical and transparent.
The derived rate is computed as:

implied $/GPU-hour = token price × throughput × (3,600 ÷ 1,000,000)

where token price is the published price per million tokens, throughput is the assumed tokens-per-second rate for the reference model on the reference hardware, and the factor 3,600 ÷ 1,000,000 converts tokens-per-second throughput into millions of tokens generated per GPU-hour.
The reference model and throughput assumption are stated values that we publish and can update. As of launch, the reference model is Llama 3.1 70B in FP16 precision, and the throughput assumption is 1,000 tokens per second on H100 SXM — values consistent with widely published inference benchmarks. If we change either, we disclose the change in the issue in which it takes effect.
Token prices typically differ between input tokens (prompts) and output tokens (completions). The Index uses a blended rate — a simple average of input and output prices — as the representative token price. Real-world workloads vary in their input/output ratio; the blended rate is a reference point, not a workload-specific calculation. This is disclosed in every issue containing inference-implied data.
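Putting the blended rate and the stated assumptions together, the inference-implied derivation is a one-line calculation. The token prices below are invented for illustration; the throughput default follows the launch assumption stated above:

```python
def implied_gpu_hour_rate(price_in: float, price_out: float,
                          tokens_per_second: float = 1_000.0) -> float:
    """Inference-implied $/GPU-hour from published token prices.

    price_in / price_out: $ per million input / output tokens.
    tokens_per_second: the stated throughput assumption (1,000 tok/s
    for Llama 3.1 70B FP16 on H100 SXM as of launch).
    """
    blended = (price_in + price_out) / 2  # simple average, per the methodology
    millions_of_tokens_per_hour = tokens_per_second * 3_600 / 1_000_000
    return blended * millions_of_tokens_per_hour

# Hypothetical provider at $0.40/M input and $0.80/M output:
# blended $0.60/M x 3.6 M tokens per GPU-hour -> roughly $2.16 per GPU-hour.
rate = implied_gpu_hour_rate(0.40, 0.80)
```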
The normalization rules.
These rules determine whether any given rate observation is valid, comparable, and includable in the Index. They apply to every Tier 1 through 4 capture without exception.
| Rule | Definition |
|---|---|
| On-demand only | Rates reflect the hourly on-demand tier as published. Committed-use, reserved, spot, and interruptible rates are not included in the calculation. |
| Per-GPU basis | All rates normalized to per-GPU-per-hour. Node-level rates are divided by GPU count. Instance-level rates with non-standard GPU counts are documented individually. |
| US region | Where providers offer region-specific pricing, US East (or US region where East is unavailable) is used. Non-US-based providers are included at their default published rate, with the region noted. |
| No discount applied | Volume discounts, commitment discounts, reserved pricing, negotiated enterprise rates, and promotional rates are excluded. The Index tracks the stated rate a new customer would pay today. |
| Single-instance rental | Rates reflect a single instance of the tracked hardware, not multi-node clusters or bare-metal provisioning at scale. |
| Published rates only | Rates quoted privately, through sales channels only, or behind login walls are excluded. If a provider does not publish on-demand rates publicly, they are not included in the direct rental calculation. |
| Median calculation | The Index value is the median across the provider panel. Mean is not reported as a primary metric because it is distorted by the spread between hyperscalers and aggregators; the observed range is reported instead. |
| Weekly cadence | Rates are captured every Monday, US Eastern time. The published Index reflects the rates observed that Monday. Mid-week price changes are picked up at the following Monday's capture. |
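One way to read the rules above is as a per-observation filter applied before any rate enters the median. A hypothetical sketch (the record fields are invented, not the Index's actual schema):

```python
def is_includable(capture: dict) -> bool:
    """Screen one rate observation against the Tier 1-4 inclusion rules.
    Field names are hypothetical; each check mirrors a row of the table."""
    return (
        capture["pricing_tier"] == "on-demand"   # no reserved, spot, or interruptible
        and capture["basis"] == "per-gpu"        # already normalized per Step 3
        and not capture["discount_applied"]      # stated new-customer rate only
        and capture["publicly_published"]        # no sales-only or login-walled quotes
    )

ok = is_includable({"pricing_tier": "on-demand", "basis": "per-gpu",
                    "discount_applied": False, "publicly_published": True})   # True
excluded = is_includable({"pricing_tier": "spot", "basis": "per-gpu",
                          "discount_applied": False, "publicly_published": True})  # False
```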
What the Index does not cover.
A responsible benchmark states its limits. The Compute Price Index measures a specific slice of the AI compute market and does not claim to measure the market as a whole.
The Index does not include providers who don't publish on-demand rates. A growing share of GPU compute is sold through enterprise contracts where pricing is confidential, customer-specific, and not directly comparable across the market. Providers like Nscale, IREN, IBM Cloud's enterprise tier, Fluidstack, and similar operators are economically significant — sometimes operating at hyperscaler scale — but their pricing is not measurable through public sources. We acknowledge them in our editorial coverage but exclude them from the Index calculation rather than misrepresent confidential quotes as observable rates.
The Index does not reflect committed-use, reserved, or spot pricing. Reserved contracts typically price 30 to 60 percent below on-demand rates depending on commitment terms. Spot or interruptible capacity can price 60 to 80 percent below on-demand. These tiers vary too much by buyer terms to support a stable median, and they are not the rate against which most marginal compute purchases are made.
The Index does not measure the realized rate enterprise buyers actually pay. Large customers negotiate against the published rate but typically pay less. The realized rate is calculable from public financials of listed providers (CoreWeave, Nebius, IREN) by dividing reported revenue by deployed GPU count — and we cover that calculation in editorial analysis — but it is a different metric than what the Index tracks.
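The realized-rate calculation mentioned above can be sketched in a few lines. Every number here is hypothetical rather than any provider's actual figure, and the hours-in-quarter and utilization assumptions are ours:

```python
def realized_rate(quarterly_revenue_usd: float, deployed_gpus: int,
                  hours_in_quarter: float = 91 * 24) -> float:
    """Rough realized $/GPU-hour implied by public financials: reported
    revenue divided by deployed GPUs and the hours in the period.
    Assumes all revenue is GPU rental and every GPU is billed every
    hour, so this is a coarse estimate, not an audited figure."""
    return quarterly_revenue_usd / (deployed_gpus * hours_in_quarter)

# Hypothetical: $500M quarterly revenue across 100,000 deployed GPUs
# implies roughly $2.29 per realized GPU-hour.
r = realized_rate(500_000_000, 100_000)
```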
The Index measures the on-demand, publicly quoted, single-instance rate. This is the single most transparent and continuously measurable price signal in the AI compute market. It is the reference point against which every other tier is priced. It is what the Index tracks.
Methodology changes.
The provider panel, hardware coverage, normalization rules, and inference-implied assumptions are reviewed quarterly. Any change is disclosed in the Index issue in which it takes effect, with the reason for the change stated plainly. Changes that materially affect historical comparability are accompanied by a re-publication of prior values under the new methodology, so that week-over-week comparisons remain valid.
The methodology itself is not revised in response to specific observations. If a week's measured rates are surprising, we investigate the observation — not the methodology. The methodology exists to produce credible measurement regardless of what the measurement shows.