Akamai deploys Nvidia Blackwell GPUs to build distributed AI inference platform

Infrastructure will run AI workloads across Akamai’s global network to reduce latency and support enterprise-scale inference.


Akamai Technologies is deploying thousands of Nvidia Blackwell GPUs across its distributed cloud infrastructure to build a global platform designed for AI inference workloads.

The deployment will enable enterprises to run AI applications closer to end users by routing workloads to compute resources distributed across Akamai’s global network.
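
The routing idea can be sketched in a few lines of Python. The snippet below is purely illustrative and is not Akamai tooling: the region list, probe URLs and measure_rtt helper are all hypothetical stand-ins for a latency-aware scheduler.

    import time
    import urllib.request

    # Hypothetical probe endpoints for candidate regions; not real Akamai URLs.
    REGIONS = {
        "us-east": "https://infer-us-east.example.com/healthz",
        "eu-west": "https://infer-eu-west.example.com/healthz",
        "ap-south": "https://infer-ap-south.example.com/healthz",
    }

    def measure_rtt(url, timeout=1.0):
        """Round-trip time of a lightweight health probe, in seconds (inf on failure)."""
        start = time.monotonic()
        try:
            urllib.request.urlopen(url, timeout=timeout).read()
        except OSError:
            return float("inf")
        return time.monotonic() - start

    def pick_region():
        """Send the workload to whichever region answers fastest from this client."""
        rtts = {name: measure_rtt(url) for name, url in REGIONS.items()}
        return min(rtts, key=rtts.get)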

The deployment will also support model fine-tuning and post-training optimisation on the same infrastructure.

The platform aims to reduce the latency and data-egress costs that arise when AI workloads are routed through centralised data centres.

The initiative reflects a broader shift in the AI industry, where the focus is expanding from model training in large AI factories to inference at scale across distributed environments.

Akamai said the infrastructure will support a unified platform for AI research, development and deployment while enabling enterprises to run real-time AI applications closer to where data is generated.

The company’s architecture treats its global network as a distributed compute layer, enabling rapid AI inference for applications that require immediate responses.

These include use cases such as autonomous delivery systems, smart grid operations, surgical robotics and real-time fraud detection.
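
What these use cases share is a hard latency budget: if the model cannot answer in time, the application has to degrade gracefully rather than wait. A minimal client-side sketch, in which the endpoint URL, payload shape and fallback behaviour are all hypothetical:

    import json
    import urllib.request

    LATENCY_BUDGET_S = 0.05  # e.g. 50 ms for a real-time fraud-scoring call

    def score_transaction(payload):
        """Call a nearby inference endpoint; fall back to rules if the budget is blown."""
        req = urllib.request.Request(
            "https://infer-nearest.example.com/v1/score",  # hypothetical endpoint
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        try:
            with urllib.request.urlopen(req, timeout=LATENCY_BUDGET_S) as resp:
                return json.load(resp)
        except OSError:
            # Deadline missed or endpoint unreachable: deterministic fallback.
            return {"score": 0.0, "source": "rules-fallback"}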

The platform is built on Nvidia AI infrastructure: RTX PRO Servers equipped with RTX PRO 6000 Blackwell Server Edition GPUs and BlueField-3 DPUs.

These systems will run on Akamai’s distributed cloud computing infrastructure and edge network, which spans more than 4,400 locations worldwide.

The infrastructure is designed to deliver predictable, high-performance inference by processing AI workloads on dedicated GPU clusters while enabling localised fine-tuning of large language models.

Could reduce latency and AI inference costs

Localised optimisation allows organisations to adapt AI models using proprietary datasets while addressing data privacy and regional compliance requirements.

Enterprises will also be able to perform post-training optimisation of foundation models to improve accuracy and performance for specific workloads.
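
One common form of such post-training optimisation is parameter-efficient fine-tuning. The sketch below attaches a LoRA adapter to a small open model using the Hugging Face transformers and peft libraries; the model choice and hyperparameters are illustrative assumptions and say nothing about Akamai's actual stack.

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    # Placeholder base model; any small causal LM works for the sketch.
    base = AutoModelForCausalLM.from_pretrained("gpt2")

    # LoRA trains a small set of adapter weights instead of the full model,
    # which keeps fine-tuning cheap enough to run on regional GPU capacity.
    config = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["c_attn"],  # GPT-2's fused attention projection
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base, config)
    model.print_trainable_parameters()  # typically well under 1% of base weights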

The deployment builds on Akamai’s broader push into AI infrastructure following the launch of its inference cloud platform in 2025.

The company said the platform enables developers and platform engineers to build and run AI applications closer to end users and devices.

By running inference workloads across distributed infrastructure, Akamai claims the architecture can deliver up to 2.5 times lower latency than traditional centralised deployments.

The company also said organisations could reduce AI inference costs by as much as 86 percent when compared with hyperscale cloud infrastructure.
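
Taken at face value, the two figures translate straightforwardly; the baseline latency and monthly spend below are invented examples, not quoted numbers.

    centralised_latency_ms = 200      # hypothetical baseline round trip
    centralised_cost_usd = 100_000    # hypothetical monthly inference spend

    distributed_latency_ms = centralised_latency_ms / 2.5           # "up to 2.5x lower"
    distributed_cost_usd = round(centralised_cost_usd * (1 - 0.86)) # "as much as 86% less"

    print(distributed_latency_ms)  # 80.0 ms
    print(distributed_cost_usd)    # 14000 USD/month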

Akamai reported strong demand for its initial Blackwell GPU deployment and said it will continue expanding GPU capacity as part of its distributed cloud strategy.