Wednesday, December 4, 2024

Nvidia’s closest rival once again obliterates cloud giants in AI performance; Cerebras Inference is 75x faster than AWS, 32x faster than Google on Llama 3.1 405B

  • Cerebras hits 969 tokens/second on Llama 3.1 405B, 75x faster than AWS
  • Claims industry-low 240ms latency, twice as fast as Google Vertex
  • Cerebras Inference runs on the CS-3 with the WSE-3 AI processor

Cerebras Systems says it has set a new…

Article Source
https://www.techradar.com/pro/nvidias-closest-rival-once-again-obliterates-cloud-giants-in-ai-performance
