NVIDIA Wins Every MLPerf Training v5.1 Benchmark

By TechPowerUp
Publication Date: 2025-11-12 17:31:00

In the age of AI reasoning, training smarter, more capable models is critical to scaling intelligence. Delivering the performance this requires takes breakthroughs across GPUs, CPUs, NICs, scale-up and scale-out networking, system architectures, and a large body of software and algorithms. In MLPerf Training v5.1, the latest round in a long-running series of industry-standard tests of AI training performance, NVIDIA swept all seven tests, delivering the fastest time to train across large language models (LLMs), image generation, recommender systems, computer vision and graph neural networks.

NVIDIA was also the only platform to submit results on every test, underscoring the rich programmability of NVIDIA GPUs, and the maturity and versatility of its CUDA software stack.

NVIDIA Blackwell Ultra Doubles Down
The GB300 NVL72 rack-scale system, powered by the NVIDIA Blackwell Ultra GPU architecture, made its debut in MLPerf Training this round, following a record-setting showing in the most recent MLPerf Inference round.

Compared with the prior-generation Hopper architecture, the Blackwell Ultra-based GB300 NVL72 delivered more than 4x the performance on Llama 3.1 405B pretraining and nearly 5x the performance on Llama 2 70B LoRA fine-tuning, using the same number of GPUs.

These gains were fueled by Blackwell Ultra’s architectural improvements—including new Tensor Cores that offer 15 petaflops of NVFP4 AI compute, twice the attention-layer compute and 279 GB of HBM3e memory—as…
