xAI’s 100,000 H100 Colossus is glued together using Ethernet

Unlike most AI training clusters, xAI’s Colossus, with its 100,000 Nvidia Hopper GPUs, doesn’t use InfiniBand. Instead, the massive system, which Nvidia bills as the “world’s largest AI supercomputer,” was built using the GPU giant’s Spectrum-X…

Article Source
https://www.theregister.com/2024/10/29/xai_colossus_networking/