By Muhammad Zuhair
Publication Date: 2026-02-16 17:00:00
NVIDIA’s Blackwell Ultra is the current-generation compute option for hyperscalers, and in newer benchmarks, the GB300 NVL72 shows immense performance in low-latency and long-context workloads.
NVIDIA’s Blackwell Ultra AI Racks Now Feature Top-Tier Agentic Performance, Driven By NVLink Upgrades
The AI industry has evolved across multiple layers since its original boom back in 2022, and right now, we are seeing a major shift towards agentic computing, driven by applications and wrappers built on frontier models. For infrastructure providers like NVIDIA, this makes it increasingly important to have ample memory bandwidth and compute performance onboard to meet the latency requirements of agentic frameworks, and with Blackwell Ultra, Team Green has done just that. In a new blog post, NVIDIA tested Blackwell Ultra on SemiAnalysis’s InferenceMAX benchmark, and the numbers are astonishing.
NVIDIA’s first infographic emphasizes a figure called “token/watt”, which is probably one of the most important numbers to watch in the current hyperscaler buildout. The company has focused on both raw performance and throughput optimizations, which is why, with GB300 NVL72, NVIDIA reports a 50x increase in throughput per megawatt compared to Hopper GPUs. NVIDIA’s comparison uses the best possible ‘deployed state’ for each architecture.
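To make the metric concrete, here is a minimal sketch of how throughput per megawatt and a relative gain between two systems could be computed. The function names and every number in it are hypothetical placeholders chosen for round arithmetic; they are not NVIDIA's measurements or InferenceMAX results.

```python
def tokens_per_second_per_megawatt(tokens_per_second: float, power_watts: float) -> float:
    """Normalize throughput by power draw: tokens/s delivered per megawatt."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    # Multiply before dividing so the watts-to-megawatts conversion stays exact
    # for these round inputs.
    return tokens_per_second * 1_000_000 / power_watts


def relative_gain(candidate: float, baseline: float) -> float:
    """How many times higher the candidate's metric is than the baseline's."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return candidate / baseline


# Hypothetical placeholder figures, NOT measured data for any real system.
baseline = tokens_per_second_per_megawatt(tokens_per_second=10_000, power_watts=100_000)
candidate = tokens_per_second_per_megawatt(tokens_per_second=600_000, power_watts=120_000)

gain = relative_gain(candidate, baseline)
print(f"baseline:  {baseline:,.0f} tokens/s per MW")
print(f"candidate: {candidate:,.0f} tokens/s per MW")
print(f"gain:      {gain:.1f}x")
```

With these made-up inputs, the candidate draws 20% more power but delivers 60 times the throughput, so the per-megawatt gain works out to 50x. The point is that the metric rewards throughput growth only to the extent that it outpaces power growth.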
If you are curious about how the throughput-per-megawatt gains are so phenomenal, well, NVIDIA takes pride in its…

