By Chris Mellor
Publication Date: 2026-05-26 14:59:00
Qumulo says its Cloud AI Accelerator offering gets data from distributed on-prem and public cloud sites to GPU accelerators without needing to copy and stage it on all-flash storage closely coupled to the GPU servers.
Qumulo tells us that, according to a recent analysis, average enterprise GPU utilization hovers around a staggering 5 percent. This means hundreds of billions of dollars’ worth of accelerated compute infrastructure sits idle roughly 95 percent of the time because data must be staged, replicated, and moved into position before a workload can even start. Improved tokenomics has to consider total creation time, not just the last mile.
Qumulo CEO Doug Gourlay said: “Every enterprise we talk to is focused on GPU availability, but availability is only half the problem. The deeper issue is utilization, and the culprit is data gravity.”
“The industry’s response has been to sell enterprises more tightly-coupled storage attached directly to GPU…