Akamai Inference Cloud

The first global-scale AI inference grid — built on NVIDIA's AI Grid reference architecture, delivered through MobileRider

Inference at the edge, not in a distant data center

Akamai Inference Cloud combines NVIDIA RTX PRO™ Servers (featuring NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, NVIDIA BlueField-3® DPUs, and NVIDIA AI Enterprise software) with Akamai's distributed cloud computing infrastructure and its global edge network of more than 4,400 locations worldwide.

It is the first global-scale implementation of NVIDIA's AI Grid reference architecture, designed to spread inference work across data centers, regional cloud sites, and edge locations — reducing latency and improving cost efficiency for workloads that require real-time, consistent responses.

Why a grid instead of a region

Centralized GPU clusters are essential for training models, but they are too slow, too remote, and too rigid for the inference phase — the actual execution of AI in real-time environments. Akamai's own research found that 64% of organizations now require end-to-end response times under 250 milliseconds, while 50% of deployments fail to meet latency demands at peak load. Read the State of AI Inference findings.

The grid answers this with a workload-aware orchestrator that brokers AI requests across compute tiers based on demand and location — routing each inference to the optimal point in the network rather than backhauling everything to a single region.
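To make the routing idea concrete, here is a minimal sketch of how a request might be brokered by demand and location. It is illustrative only and is not Akamai's orchestrator: the `Site` fields, the tier labels, the load model in `estimated_latency_ms`, and the sample numbers are all assumptions made for this example. The 250 ms default budget comes from the survey figure above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A hypothetical compute location; every field is an assumption."""
    name: str
    tier: str            # illustrative labels: "edge", "regional", "core"
    rtt_ms: float        # network round trip from the requester (location)
    utilization: float   # fraction of GPU capacity in use, 0.0 to 1.0 (demand)
    service_ms: float    # model execution time on an idle GPU at this site


def estimated_latency_ms(site: Site) -> float:
    """Network time plus service time, inflated as the site fills up.

    Dividing by the remaining headroom is a crude stand-in for queueing
    delay; the floor of 0.05 avoids division by zero on a saturated site.
    """
    headroom = max(1.0 - site.utilization, 0.05)
    return site.rtt_ms + site.service_ms / headroom


def route(sites: list[Site], budget_ms: float = 250.0) -> tuple[Site, bool]:
    """Return the site with the lowest estimated latency, and whether
    that estimate fits inside the latency budget."""
    if not sites:
        raise ValueError("no candidate sites")
    best = min(sites, key=estimated_latency_ms)
    return best, estimated_latency_ms(best) <= budget_ms


# Made-up sample data: the nearest site is busy, so it is not the best choice.
SITES = [
    Site("edge-a", "edge", rtt_ms=10.0, utilization=0.9, service_ms=40.0),
    Site("regional-b", "regional", rtt_ms=35.0, utilization=0.5, service_ms=40.0),
    Site("core-c", "core", rtt_ms=120.0, utilization=0.2, service_ms=30.0),
]

best, within_budget = route(SITES)
print(best.name, within_budget)
```

In this sample the closest edge site is heavily loaded, so the sketch sends the request to the regional site instead. That is the trade-off a single-region deployment cannot make, since it has only one place to send every request.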

What runs on the grid

  • Agentic and conversational AI — real-time multistep reasoning close to users in every region
  • Live media processing — transcoding, AI upscaling, object detection, and dynamic ad insertion in a single workflow
  • In-game AI — non-player character interactions at the speed gameplay demands
  • Fraud detection and safety systems — decisions in milliseconds, processed where data originates
  • Physical AI and computer vision — camera and sensor streams processed at the edge, respecting data sovereignty

The proof

Benchmarks show RTX PRO 6000 Blackwell on Akamai Cloud delivers up to 1.63x higher inference throughput than the H100, achieving 24,240 TPS per server at 100 concurrent requests — see the full benchmark methodology and results. In production beta, Harmonic processed 300 images in under a minute with GPU memory use below 10% — read the case study.
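For a rough sense of scale, the headline figures can be broken down with simple arithmetic. This is a back-of-the-envelope sketch, not part of the benchmark: it assumes TPS means tokens per second summed across all concurrent requests, and that the 1.63x figure applies to this same configuration. Because the claim is "up to" 1.63x, the implied H100 number is an estimate and not a measured result.

```python
# Figures quoted in the text above.
AGGREGATE_TPS = 24_240       # per server, at 100 concurrent requests
CONCURRENT_REQUESTS = 100
SPEEDUP_VS_H100 = 1.63       # "up to" 1.63x, so treat results as approximate

# Assumption: throughput is shared evenly across the concurrent requests.
per_request_tps = AGGREGATE_TPS / CONCURRENT_REQUESTS

# Assumption: the 1.63x ratio was measured in this same configuration.
implied_h100_tps = AGGREGATE_TPS / SPEEDUP_VS_H100

print(f"Per-request share: {per_request_tps:.1f} TPS")
print(f"Implied H100 aggregate: about {implied_h100_tps:,.0f} TPS")
```

The full benchmark methodology is the authority on how these numbers were measured. The sketch only shows how they relate to each other under the stated assumptions.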

Access through MobileRider

As an Akamai Preferred Partner, MobileRider provisions Blackwell GPU instances on Akamai Inference Cloud at the same price as buying direct — with 24/7/365 white-glove management, deployment support, and a decade of experience running mission-critical media workloads on Akamai's network. See transparent pricing or find the right GPU for your workload.

Put your inference on the grid

NVIDIA RTX PRO™ 6000 Blackwell on Akamai Inference Cloud — limited availability, provisioned by MobileRider.

Get access