VMVirtualMachine.com

Nvidia’s closest rival once again obliterates cloud giants in AI performance; Cerebras Inference is 75x faster than AWS, 32x faster than Google on Llama 3.1 405B

  • Cerebras hits 969 tokens/second on Llama 3.1 405B, 75x faster than AWS
  • Claims industry-low 240ms latency, twice as fast as Google Vertex
  • Cerebras Inference runs on the CS-3 with the WSE-3 AI processor
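The multipliers in the bullets above can be turned into rough baseline figures. The sketch below is a back-of-the-envelope calculation, not a benchmark: it assumes the 75x and 32x multipliers refer to output throughput on the same model, and the function names and the 1,000-token response size are hypothetical choices made for illustration.

```python
# Figures quoted in the article (Cerebras' own claims).
CEREBRAS_TOKENS_PER_SEC = 969.0
SPEEDUP_VS_AWS = 75.0
SPEEDUP_VS_GOOGLE = 32.0


def implied_baseline(fast_rate: float, speedup: float) -> float:
    """Throughput of the slower service implied by a claimed speedup.

    Assumption: "Nx faster" means the ratio of tokens per second.
    """
    return fast_rate / speedup


def seconds_to_generate(tokens: int, rate: float) -> float:
    """Time to stream `tokens` output tokens at `rate` tokens per second."""
    return tokens / rate


aws_rate = implied_baseline(CEREBRAS_TOKENS_PER_SEC, SPEEDUP_VS_AWS)
google_rate = implied_baseline(CEREBRAS_TOKENS_PER_SEC, SPEEDUP_VS_GOOGLE)

if __name__ == "__main__":
    # Hypothetical response length, chosen only for illustration.
    response_tokens = 1000
    for name, rate in [
        ("Cerebras", CEREBRAS_TOKENS_PER_SEC),
        ("AWS (implied)", aws_rate),
        ("Google (implied)", google_rate),
    ]:
        elapsed = seconds_to_generate(response_tokens, rate)
        print(f"{name}: {rate:.2f} tokens/s, {elapsed:.1f} s per {response_tokens} tokens")
```

Under these assumptions the implied baselines come out at roughly 13 tokens/second for AWS and 30 tokens/second for Google. These are derived from the quoted multipliers and are not independently measured numbers.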

Cerebras Systems says it has set a new…

Article Source
https://www.techradar.com/pro/nvidias-closest-rival-once-again-obliterates-cloud-giants-in-ai-performance