The NVIDIA GB200 NVL72 is a rack-scale, liquid-cooled exascale computer designed for real-time trillion-parameter large language model (LLM) inference and massive-scale AI training. It integrates 36 Grace CPUs and 72 Blackwell GPUs into a single NVLink domain, functioning as a unified, massive GPU for high-performance computing (HPC) workloads.
- Architecture: NVIDIA Blackwell with 36 Grace CPUs and 72 Blackwell GPUs; each Grace CPU is coupled to its GPUs over NVLink-C2C, and all 72 GPUs share one NVLink domain.
- Interconnect: Fifth-generation NVIDIA NVLink providing 130 TB/s aggregate GPU communication bandwidth.
- Performance: Delivers up to 1,440 PFLOPS FP4 Tensor Core performance and 720 PFLOPS FP8.
- Memory: 13.4 TB HBM3E GPU memory with 576 TB/s aggregate memory bandwidth.
- CPU Specs: 2,592 Arm Neoverse V2 cores with up to 17 TB LPDDR5X system memory and up to 18.4 TB/s memory bandwidth.
- Cooling: Integrated liquid-cooling system designed to reduce carbon footprint and increase compute density.
- Networking: Support for NVIDIA Quantum-X800 InfiniBand and Spectrum-X800 Ethernet platforms.
- Inference: Up to 30x faster real-time inference for trillion-parameter LLMs compared with the same number of NVIDIA H100 GPUs.
- Training: Up to 4x faster training for large-scale mixture-of-experts (MoE) models compared with the same number of NVIDIA H100 GPUs.
- Management: NVIDIA Mission Control for full-stack intelligence and infrastructure resilience.
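The rack-level figures above can be turned into rough per-GPU numbers, and into a back-of-envelope decode-rate bound for a trillion-parameter model. The sketch below is illustrative only: the FP4 weight size, the memory-bandwidth-bound token model (every generated token streams all weights once, ignoring KV cache, activations, and NVLink traffic), and the batch-1 framing are assumptions, not NVIDIA benchmark methodology.

```python
# Back-of-envelope sketch derived from the spec-sheet figures above.
# The serving model (weights streamed once per token, FP4 = 0.5 bytes
# per weight, batch size 1) is an assumption for illustration.

NUM_GPUS = 72

# Rack-level figures from the spec list
FP4_PFLOPS = 1440        # FP4 Tensor Core performance, rack total
HBM_TB = 13.4            # total HBM3e capacity
HBM_BW_TBS = 576         # aggregate HBM bandwidth
NVLINK_BW_TBS = 130      # aggregate NVLink bandwidth

per_gpu_fp4 = FP4_PFLOPS / NUM_GPUS       # PFLOPS per GPU
per_gpu_hbm = HBM_TB / NUM_GPUS * 1000    # GB per GPU
per_gpu_hbm_bw = HBM_BW_TBS / NUM_GPUS    # TB/s per GPU

print(f"Per GPU: {per_gpu_fp4:.0f} PFLOPS FP4, "
      f"{per_gpu_hbm:.0f} GB HBM3e, {per_gpu_hbm_bw:.0f} TB/s")

# Memory-bound decode estimate: each generated token must stream all
# weights through HBM once, so aggregate bandwidth caps token rate.
params = 1e12                 # hypothetical 1T-parameter model
bytes_per_weight = 0.5        # FP4
weight_bytes = params * bytes_per_weight
tokens_per_s = (HBM_BW_TBS * 1e12) / weight_bytes
print(f"Upper-bound decode rate (batch=1): ~{tokens_per_s:.0f} tokens/s")
```

Under these assumptions the rack works out to roughly 20 PFLOPS FP4 and 8 TB/s of HBM bandwidth per GPU, which is why the single-NVLink-domain design matters: a 1T-parameter model's weights fit comfortably in the shared 13.4 TB of HBM without falling back to slower inter-node links.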