
NVIDIA has launched Vera Rubin, a multi-rack, pod-scale supercomputer designed to power agentic AI systems that can autonomously reason, plan, and act.
AI development is moving beyond simple response generation toward “agentic” systems that observe, reason, plan, and use tools. Such systems must manage vast context, combine working and long-term memory, and coordinate specialized sub-agents spun up on demand, which significantly increases computational and architectural demands.
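To make the observe, plan, and tool-use cycle concrete, here is a minimal sketch of an agent loop with working and long-term memory. It is a toy illustration only: `ToyAgent`, `search`, and `summarize` are hypothetical names invented for this example, the "planning" step is a placeholder for what a real system would delegate to a model, and nothing here reflects NVIDIA software.

```python
from dataclasses import dataclass, field
from typing import Callable

# A tool (or sub-agent) is modeled as a function from text to text.
Tool = Callable[[str], str]


@dataclass
class ToyAgent:
    """Hypothetical toy agent; all names are illustrative assumptions."""
    tools: dict[str, Tool]
    working_memory: list[str] = field(default_factory=list)               # per-task context
    long_term_memory: dict[str, list[str]] = field(default_factory=dict)  # persists across tasks

    def plan(self, goal: str) -> list[str]:
        # Reuse a remembered plan if one exists; otherwise apply every
        # tool to the goal in order (a stand-in for model-driven planning).
        if goal in self.long_term_memory:
            return self.long_term_memory[goal]
        return [f"{name} {goal}" for name in self.tools]

    def run(self, goal: str) -> list[str]:
        self.working_memory = [f"goal: {goal}"]      # observe
        steps = self.plan(goal)                      # reason and plan
        for step in steps:
            name, _, arg = step.partition(" ")
            result = self.tools[name](arg)           # use a tool / sub-agent
            self.working_memory.append(result)       # fold the result into context
        self.long_term_memory[goal] = steps          # remember the plan for next time
        return self.working_memory


def search(topic: str) -> str:
    return f"notes on {topic}"


def summarize(topic: str) -> str:
    return f"summary of {topic}"


agent = ToyAgent(tools={"search": search, "summarize": summarize})
trace = agent.run("rack cooling")
print(trace)
```

Even in this toy form, every step adds to the working context and every tool call is a separate unit of work, which hints at why context size and orchestration dominate the cost of agentic workloads.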
Vera Rubin is described as the first multi-rack, pod-scale supercomputer built specifically for agentic AI workloads. It combines five connected rack-scale systems into a unified platform designed to handle large-scale reasoning, orchestration, and memory-intensive processing.
Seven new chips make up Vera Rubin, manufactured on TSMC’s 3-nanometer process with CoWoS packaging and HBM4 memory from Micron, SK hynix, and Samsung. A single compute board carries 6 trillion transistors and over 18,000 components.
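Each NVL72 rack holds 18 compute trays and nine hot-swappable NVLink switch trays, for a total of 1.3 million components. Inside each compute tray, the superchips, ConnectX-9 SuperNICs, and BlueField-4 DPUs mate in place without cables for resiliency. Liquid-cooled bus bars carry over 5,000 amps, which NVIDIA equates to 20 electric cars at full acceleration. Switching is handled by Spectrum-X Ethernet Photonics, described as the first Ethernet switch with 200-gigabit co-packaged optics.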
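The Vera Rubin NVL72 system handles prompt and context understanding, reasoning, planning, and high-throughput token generation, while the Vera CPU rack, with 256 CPUs in a single liquid-cooled rack, orchestrates models, shuffles memory, and launches tools. Complementing this, Groq 3 LPX systems deliver ultra-low-latency inference with 40 petabytes per second of SRAM bandwidth, and Vera BlueField-4 STX provides accelerated storage processing.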
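For a sense of scale, the quoted LPX figures can be broken down per tray and per chip. The keynote transcript puts 256 LPUs across 16 trays in one LPX system; the short calculation below is a back-of-the-envelope sketch, not an official specification. It assumes the 40 PB/s figure is the aggregate for those 256 LPUs, that bandwidth is split evenly, and that 1 PB = 1,000 TB.

```python
# Figures quoted in the announcement and keynote transcript.
LPUS_PER_LPX = 256          # LPUs in one LPX system
TRAYS_PER_LPX = 16          # trays in one LPX system
SRAM_BW_PB_PER_S = 40       # SRAM bandwidth in petabytes per second

# Assumption: the 40 PB/s is the aggregate across all 256 LPUs,
# divided evenly, with decimal units (1 PB = 1,000 TB).
TB_PER_PB = 1000

lpus_per_tray = LPUS_PER_LPX // TRAYS_PER_LPX                    # 16
bw_per_tray_pb_s = SRAM_BW_PB_PER_S / TRAYS_PER_LPX              # 2.5
bw_per_lpu_tb_s = SRAM_BW_PB_PER_S * TB_PER_PB / LPUS_PER_LPX    # 156.25

print(f"LPUs per tray:          {lpus_per_tray}")
print(f"SRAM bandwidth per tray: {bw_per_tray_pb_s} PB/s")
print(f"SRAM bandwidth per LPU:  {bw_per_lpu_tb_s} TB/s")
```

Under those assumptions, each tray holds 16 LPUs and each LPU accounts for roughly 156 TB/s of on-chip SRAM bandwidth.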
Production spans 150 supply chain partners across Taiwan, involving millions of square feet of factory floor and hundreds of sites. Microsoft, Dell, and CoreWeave have already stood up Vera Rubin NVL72 engineering racks, while Groq 3 LPX systems are being built at Foxconn and Quanta.
Vera Rubin represents a major leap in AI infrastructure, combining extreme-scale hardware and global manufacturing coordination to support the next generation of autonomous, agent-driven computing systems.
Large language models generate answers. Now AI agents can do work. But processing agentic AI is a whole different kind of problem. Agents observe, reason, plan, use tools. They manage massive context, juggling working memory and long-term memory. They spin up sub-agents, specialists on demand.
NVIDIA Vera Rubin is a multi-rack, pod-scale system built to process agentic AI and is now in full production. The manufacturing automation and orchestration across the supply chain is a miracle to witness.
Our journey started when we launched the first AI supercomputer, NVIDIA DGX-1. Over the next decade, we pushed every chip and system to the limit, from Pascal and the first NVLink to Grace Blackwell, the first rack-scale AI supercomputer, and now Vera Rubin, the first multi-rack, pod-scale supercomputer built for the agentic age.
It starts at TSMC. The seven new chips that make up Vera Rubin take shape through hundreds of processing steps: 3-nanometer process, CoWoS packaging, HBM4 memory from Micron, SK hynix, and Samsung. The Vera Rubin compute board: 6 trillion transistors with over 18,000 components on one board.
Vera Rubin NVL72 does the thinking: prompt and context understanding, reasoning, and planning. Next, a new modular compute tray, streamlined with a new PCB midplane design. Superchips, ConnectX-9 SuperNICs, and BlueField-4 DPUs all mate in place with no cables, for resiliency at AI factory scale. 18 compute trays, nine hot-swappable NVLink switch trays, new high-efficiency manifolds, liquid-cooled bus bars carrying over 5,000 amps, the equivalent of 20 electric cars at full acceleration. Together, 1.3 million components form this third-generation MGX rack design.
Congratulations to Microsoft for their operational Vera Rubin NVL72 engineering rack. Congratulations to Dell and CoreWeave as well for standing up their Vera Rubin NVL72 engineering rack.
Then the Vera CPU rack: 256 CPUs in a single liquid-cooled rack, orchestrating the models, shuffling memory, launching tools.
At Foxconn and Quanta, Groq 3 LPX takes shape: 256 Groq 3 LPUs across 16 trays, 40 petabytes per second of SRAM bandwidth for ultra-low latency. While NVL72 generates tokens at the highest throughput, Groq LPX generates them at the lowest latency.
Vera BlueField-4 STX, where AI keeps its memory: storage processing accelerated by BlueField-4, connecting memory, storage, and in-silicon security.
And NVIDIA Spectrum-X Ethernet Photonics, the world's first Ethernet switch with 200-gigabit co-packaged optics: TSMC's COUPE process, chip-scale packaging, and ultra-high-powered laser dies on indium phosphide.
Vera Rubin: five connected rack-scale systems, a supercomputer for AI agents. 150 supply chain partners across Taiwan. Millions of square feet of factory floor, hundreds of sites. Chips, packages, systems, and data centers pushed to the limits of size, power, and scale. This is what we call extreme co-design.
We did this with Taiwan. Together, we reinvented computing for the age of AI. Taiwan was with us at the beginning and here today as we bring Vera Rubin to the world. Thank you, Taiwan.