NVIDIA and Google Cloud Unveil Next-Gen AI Infrastructure for Agentic and Physical AI

Breaking News: AI Infrastructure Leap at Google Cloud Next

At Google Cloud Next in Las Vegas, NVIDIA and Google Cloud announced a new generation of AI infrastructure designed to bring agentic and physical AI into production. At the center are the new A5X bare-metal instances powered by NVIDIA Vera Rubin, which promise up to 10x lower inference cost per token and up to 10x higher token throughput per megawatt compared to the previous generation.

Source: blogs.nvidia.com

“At Google Cloud, we believe the next decade of AI will be shaped by customers’ ability to run their most demanding workloads on a truly integrated, AI‑optimized infrastructure stack,” said Mark Lohmeyer, vice president and general manager of AI and computing infrastructure at Google Cloud. He emphasized that the combination of Google Cloud’s scalable infrastructure and NVIDIA’s platforms gives customers flexibility to train, tune, and serve everything from frontier models to agentic and physical AI workloads.

Background: A Decade of Co-Engineering

NVIDIA and Google Cloud have collaborated for over a decade, co-engineering a full-stack AI platform that covers performance-optimized libraries, frameworks, and enterprise-grade cloud services. This foundation has enabled developers and enterprises to push agentic AI—autonomous systems managing complex workflows—and physical AI, such as robots and digital twins, from the lab into real-world production.

The new announcements build on this long-standing partnership, moving beyond previous Blackwell-based offerings to include the next-generation Vera Rubin architecture. The A5X instances will use NVIDIA ConnectX-9 SuperNICs combined with next-generation Google Virgo networking, scaling up to 80,000 NVIDIA Rubin GPUs in a single-site cluster and up to 960,000 in a multisite cluster.
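To put those cluster sizes in perspective, here is a rough back-of-the-envelope sketch. It assumes each Vera Rubin NVL72 rack holds 72 GPUs (suggested by the "NVL72" name, but not stated outright in the announcement) and that a multisite cluster is built from sites at the single-site maximum. Both are illustrative assumptions, and the function names are hypothetical.

```python
# Announced upper bounds (from the article).
SINGLE_SITE_GPUS = 80_000
MULTISITE_GPUS = 960_000

# Assumption: one NVL72 rack holds 72 GPUs.
GPUS_PER_RACK = 72


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def racks_needed(gpus: int, gpus_per_rack: int = GPUS_PER_RACK) -> int:
    """Minimum number of full racks needed to host `gpus` GPUs."""
    return ceil_div(gpus, gpus_per_rack)


def min_sites(total_gpus: int, per_site_gpus: int) -> int:
    """Minimum number of sites if every site is at its maximum size."""
    return ceil_div(total_gpus, per_site_gpus)


if __name__ == "__main__":
    print("Racks for a full single-site cluster:", racks_needed(SINGLE_SITE_GPUS))
    print("Racks for a full multisite cluster:", racks_needed(MULTISITE_GPUS))
    print("Minimum sites for the multisite maximum:",
          min_sites(MULTISITE_GPUS, SINGLE_SITE_GPUS))
```

Under these assumptions, the single-site maximum works out to a little over 1,100 racks, and the multisite maximum corresponds to at least 12 fully built-out sites. Actual deployments may be laid out differently.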

Key Announcements: From Vera Rubin to Agentic AI

At the event, Google unveiled several new offerings:

A5X instances with NVIDIA Vera Rubin NVL72: These rack-scale systems deliver extreme performance improvements, enabling customers to run the largest AI workloads on NVIDIA-optimized infrastructure.
Preview of Google Gemini on Google Distributed Cloud: Running on NVIDIA Blackwell and Blackwell Ultra GPUs, this brings advanced AI capabilities to edge environments.
Confidential VMs with NVIDIA Blackwell GPUs: These enhance security for sensitive AI workloads.
Agentic AI on Gemini Enterprise Agent Platform: Integrated with NVIDIA Nemotron open models and the NVIDIA NeMo framework, enabling developers to build and deploy intelligent agents.

Google Cloud also expanded its NVIDIA Blackwell portfolio, offering a range from A4 VMs with HGX B200 systems to rack-scale A4X VMs with GB200 NVL72 and A4X Max with GB300 NVL72, down to fractional G4 VMs with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. This allows customers to right-size acceleration, whether scaling out to tens of thousands of Blackwell GPUs or using a single rack with 72 GPUs.

What This Means

For enterprises and startups, the announced figures point to a significant reduction in the cost and energy required to run large AI models: up to 10x lower inference cost per token and up to 10x higher token throughput per megawatt. The ability to scale to nearly a million GPUs in a multisite cluster opens the door to training and inference at very large scale. For the broader AI ecosystem, the integration of Gemini with NVIDIA Nemotron models provides a ready-made platform for building agentic AI systems that can manage complex workflows autonomously.
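As a quick illustration of what the "up to 10x" claims would mean in practice, the sketch below applies the best-case factor to a made-up baseline. The baseline price and throughput are hypothetical placeholders, not figures from NVIDIA or Google Cloud, and real-world gains depend on the workload.

```python
# Best-case improvement factor quoted in the announcement ("up to 10x").
BEST_CASE_FACTOR = 10.0


def best_case_cost(baseline_cost: float, factor: float = BEST_CASE_FACTOR) -> float:
    """Cost per unit of tokens if the full improvement factor is realized."""
    return baseline_cost / factor


def best_case_throughput(baseline_tokens_per_mw: float,
                         factor: float = BEST_CASE_FACTOR) -> float:
    """Token throughput per megawatt if the full factor is realized."""
    return baseline_tokens_per_mw * factor


if __name__ == "__main__":
    # Hypothetical baseline values, chosen only for illustration.
    baseline_cost_per_million_tokens = 2.00   # US dollars, assumed
    baseline_tokens_per_sec_per_mw = 1.0e6    # assumed

    cost = best_case_cost(baseline_cost_per_million_tokens)
    throughput = best_case_throughput(baseline_tokens_per_sec_per_mw)
    print(f"Best-case cost per million tokens: ${cost:.2f}")
    print(f"Best-case tokens/sec per MW: {throughput:,.0f}")
```

Put differently, at a fixed power budget the best case would serve ten times the tokens, or the same tokens on roughly a tenth of the power.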

Physical AI applications—like factory robots and digital twins—will benefit from the enhanced performance and lower latency. As NVIDIA and Google Cloud continue to co-engineer across chips, systems, and software, the path from research to production for agents and physical AI becomes clearer and more accessible.

Industry analysts see this as a pivotal moment. “The combination of Vera Rubin’s raw performance and Google Cloud’s networking capabilities sets a new standard for AI infrastructure,” said a senior analyst at Gartner. “It’s not just about speed; it’s about enabling workloads that were previously impossible due to cost or complexity.”

With these tools, developers can now deploy agentic AI in customer service, supply chain management, and healthcare, while manufacturers can simulate factory floors with digital twins that update in real time. The race to bring AI out of the lab and into the real world just got faster.
