Nvidia CEO says AI agents mark new era, unveils Vera CPU

Reporter Elaine Lin / TVBS News staff
Release time: 2026/06/01 16:00
Last update time: 2026/06/01 18:21

Nvidia Vera CPU: Built for AI Agents, Not Just AI (TVBS News)

TAIPEI (TVBS News) — Nvidia CEO Jensen Huang (黃仁勳) said at GTC Taipei on Monday (June 1) that the company's next-generation computing platform, "Vera Rubin," powered by NVLink 6 in the NVL72 rack configuration, is designed not just to run AI, but to power "AI agents," signaling what he described as a new era of intelligent autonomous systems. He also outlined four core advantages of the new "Vera CPU."

Huang said AI agents represent the final breakthrough in computer science, moving beyond answering questions to acting as autonomous entities capable of using tools, accessing databases, and executing complex workloads. However, this shift poses fundamental challenges to traditional computing architectures.

"All CPUs in the past were built for humans, but humans and agents have fundamentally different economics and usage models," Huang said. "Humans operate in a world measured in seconds and can tolerate waiting, while agents are extremely impatient and live in a world of nanoseconds. When an agent is executing a task, even a moment of delay hinders its progression to the next step. In the agent era, legacy CPU architectures have become a critical bottleneck limiting GPU utilization, directly affecting token throughput and user experience."

Huang said Vera Rubin is in full production (TVBS News)

A CPU Built for AI Agents: Four Key Advantages of Nvidia's Vera

Custom Olympus Cores: Featuring a 10-wide decode unit, advanced neural branch predictor, and deep out-of-order execution, the custom Armv9.2 Olympus cores deliver up to 50% higher IPC than Nvidia Grace, drastically reducing latency for control-heavy agentic workloads.

High-Bandwidth Memory Subsystem: The Vera CPU delivers up to 1.2 TB/s of LPDDR5X memory bandwidth and achieves 40% lower peak memory latency than x86 CPUs, keeping cores fed through tool calls, sandboxed execution, data retrieval, and orchestration.

Scalable Coherency Fabric (SCF): The CPU uses Nvidia's monolithic mesh SCF to connect all 88 cores and a unified cache, delivering predictable latency and 50% faster core-to-core data movement than CPUs that fragment compute across dies.

Exceptional Energy Efficiency: With SOCAMM LPDDR5X memory consuming under 30 W (compared to over 100 W for DDR5) and a configurable 250–450 W TDP, Vera reduces CPU and memory power while maintaining the bandwidth needed for agentic inference and reinforcement learning.

Huang noted that previous systems required extensive cabling, whereas the new Vera Rubin platform is a tightly integrated co-design with no internal cables. Components are connected via a central PCB, cutting full-rack hardware assembly and maintenance time from two hours to just five minutes while significantly improving system reliability and resilience.
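
For readers who want to put the quoted figures in perspective, the following is a rough back-of-the-envelope sketch in Python. It uses only numbers stated in this article. The even split of bandwidth across cores, the decimal unit conversion (1 TB = 1000 GB), and all function and constant names are illustrative assumptions, not Nvidia specifications or measurements.

```python
# Figures quoted in the article. They are vendor-stated bounds
# ("up to", "under", "over"), not independent measurements.
PEAK_MEM_BANDWIDTH_TBPS = 1.2  # "up to 1.2 TB/s" of LPDDR5X bandwidth
CORE_COUNT = 88                # cores connected by the coherency fabric
SOCAMM_POWER_W = 30            # "under 30 W", treated as an upper bound
DDR5_POWER_W = 100             # "over 100 W", treated as a lower bound
OLD_ASSEMBLY_MIN = 120         # "two hours"
NEW_ASSEMBLY_MIN = 5           # "five minutes"


def per_core_bandwidth_gbps(total_tbps: float, cores: int) -> float:
    """Peak bandwidth per core in GB/s.

    Assumes an even split across cores and 1 TB = 1000 GB; real
    workloads will not divide bandwidth this neatly.
    """
    return total_tbps * 1000 / cores


def min_memory_power_saving_w(ddr5_w: float, socamm_w: float) -> float:
    """Smallest saving implied by the quoted bounds, in watts."""
    return ddr5_w - socamm_w


def assembly_speedup(old_minutes: float, new_minutes: float) -> float:
    """How many times faster the new assembly time is."""
    return old_minutes / new_minutes


if __name__ == "__main__":
    bw = per_core_bandwidth_gbps(PEAK_MEM_BANDWIDTH_TBPS, CORE_COUNT)
    saving = min_memory_power_saving_w(DDR5_POWER_W, SOCAMM_POWER_W)
    speedup = assembly_speedup(OLD_ASSEMBLY_MIN, NEW_ASSEMBLY_MIN)
    print(f"Peak bandwidth per core: about {bw:.1f} GB/s")
    print(f"Memory power saving: at least {saving:.0f} W")
    print(f"Assembly time: {speedup:.0f}x faster")
```

Under these assumptions, the quoted numbers work out to roughly 13.6 GB/s of peak memory bandwidth per core, a memory power saving of at least 70 W, and a 24-fold reduction in rack assembly time.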