By Muhammad Zuhair
Publication Date: 2026-01-01 19:07:00
NVIDIA’s Blackwell GB200 NVL72 AI racks have been tested in a Mixture of Experts (MoE) environment, and according to a new report, they outperform AMD’s Instinct MI355X by a huge margin.
NVIDIA’s “Extreme Co-Design” Laws Give the Company an Upper Hand In MoE Architectures, Widening the Gap with AMD
AI models are shifting rapidly towards an MoE-focused landscape, mainly because the architecture allows for much more efficient use of compute resources; however, scaling MoE models up introduces a massive computing bottleneck compared to dense models. Because an MoE model routes tokens through separate sub-networks known as ‘experts’, it requires tremendous all-to-all communication and data transfer between nodes, which introduces latency issues and bandwidth pressure. Hyperscalers are seeking the best performance-per-dollar solution available, and according to an analysis by Signal65, NVIDIA’s GB200 NVL72 is the go-to option for MoE architectures.
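To make the routing described above concrete, here is a minimal sketch of a top-k MoE layer. All names, shapes, and the routing scheme are illustrative assumptions for this sketch, not details from the Signal65 report; the point is that each token is dispatched to only a few experts, and in a multi-node deployment the per-expert token batches are exactly what must be exchanged in the all-to-all step.

```python
import numpy as np

def top_k_route(logits, k=2):
    """Pick the top-k experts per token and softmax-normalize their gate weights."""
    idx = np.argsort(logits, axis=-1)[:, -k:]          # (tokens, k) chosen expert ids
    gates = np.take_along_axis(logits, idx, axis=-1)   # (tokens, k) raw router scores
    gates = np.exp(gates - gates.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)         # weights over the k experts sum to 1
    return idx, gates

def moe_layer(x, router_w, experts, k=2):
    """Route each token to k experts and sum the gate-weighted expert outputs.
    In a real cluster, each expert lives on a different GPU/node, so gathering
    'the tokens assigned to expert e' is the bandwidth-heavy all-to-all exchange."""
    idx, gates = top_k_route(x @ router_w, k)
    out = np.zeros_like(x)
    for e, expert_w in enumerate(experts):
        for slot in range(k):
            mask = idx[:, slot] == e                   # tokens whose slot-th pick is expert e
            if mask.any():
                out[mask] += gates[mask, slot:slot+1] * (x[mask] @ expert_w)
    return out

# Toy sizes (hypothetical): 16 tokens, hidden dim 8, 4 experts, top-2 routing.
rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 16
x = rng.standard_normal((tokens, d))
router_w = rng.standard_normal((d, n_experts))
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
y = moe_layer(x, router_w, experts)
print(y.shape)  # (16, 8): same shape as the input, but only 2 of 4 experts ran per token
```

Note the trade-off this sketch exposes: compute per token stays low (only k experts run), but because expert assignment changes token by token, the communication pattern is irregular, which is why interconnect bandwidth becomes the bottleneck at scale.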
Quoting benchmarks from SemiAnalysis’s InferenceMAX, the report mentions that NVIDIA’s Blackwell AI servers delivered 28 times higher throughput per GPU (75 tokens/sec) than AMD’s MI355X in a similar cluster configuration. If you are curious why the performance gap is so large, NVIDIA has answered this before: to address the bottlenecks involved in scaling MoE AI models, the company employs an ‘extreme co-design’ approach, which consists of…