xAI’s 100,000 H100 Colossus is glued together using Ethernet

Unlike most AI training clusters, xAI’s Colossus, with its 100,000 Nvidia Hopper GPUs, doesn’t use InfiniBand. Instead, the massive system, which Nvidia bills as the “world’s largest AI supercomputer,” was built using the GPU giant’s Spectrum-X…

Article Source
https://www.theregister.com/2024/10/29/xai_colossus_networking/