By Amin Vahdat
Publication Date: 2026-04-22 00:00:00
Co-designed for twins, open to everyone
This eighth-generation TPU is also the latest expression of our co-design philosophy, where every specification is designed to overcome AI’s biggest hurdles.
- The Boardfly topology is designed specifically to address the communication needs of today’s most powerful reasoning models.
- The SRAM capacity in TPU 8i is sized for the KV cache footprint of production-scale reasoning models.
- The bandwidth targets of the Virgo Network Fabric were derived from the concurrency requirements of trillion-parameter training.
And for the first time, both chips run on Google’s own Axion ARM-based CPU host, allowing us to optimize the entire system, not just the chip, for performance and efficiency.
Both platforms support native JAX, MaxText, PyTorch, SGLang and vLLM – the frameworks developers already use – and provide bare metal access, giving customers direct hardware access without the overhead of virtualization. Open source contributions, including MaxText reference implementations and…