
Cerebras


The world’s fastest inference is coming to the world’s leading cloud. Today we’re announcing that Amazon Web Services is deploying Cerebras CS-3 systems in AWS data centers. Available via Amazon Bedrock, the new service will offer leading open-source LLMs and Amazon’s Nova models running at the industry’s highest inference speed. In addition, AWS and Cerebras are collaborating on a new disaggregated architecture that pairs AWS Trainium with the Cerebras WSE to deliver 5x more high-speed token capacity in the same hardware footprint.

The Need for Fast Inference

AI is reshaping software development. Code is increasingly written by AI agents rather than by human developers. Unlike conversational chat, agentic coding generates approximately 15x more tokens per query and demands high-speed token output to keep developers productive. The result is an urgent and growing need for fast inference across the industry.
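The back-of-the-envelope math above is easy to sketch. The 15x multiplier comes from this post; the per-query token count and the decode speeds below are purely illustrative assumptions, not measured figures:

```python
# Rough arithmetic: why agentic workloads need faster inference.
# Only the 15x multiplier is from the article; the token count and
# tokens/sec rates below are illustrative assumptions.

CHAT_TOKENS = 1_000              # assumed tokens in a typical chat reply
AGENT_TOKENS = CHAT_TOKENS * 15  # ~15x more tokens per agentic query

def seconds_per_query(tokens: int, tokens_per_sec: float) -> float:
    """Wall-clock time to stream a full response at a given decode speed."""
    return tokens / tokens_per_sec

# At an assumed 100 tokens/sec the agent's response streams for 150 s,
# forcing the developer to wait; at an assumed 2,000 tokens/sec the same
# response takes 7.5 s, keeping the edit-run-fix loop interactive.
slow = seconds_per_query(AGENT_TOKENS, 100)
fast = seconds_per_query(AGENT_TOKENS, 2_000)
print(f"slow: {slow:.1f} s, fast: {fast:.1f} s")
```

The point of the sketch is that the latency gap scales linearly with token volume, so a 15x jump in tokens per query turns a tolerable wait into an unusable one unless decode speed rises to match.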

Cerebras is the market leader in high-speed AI inference, powering…

https://www.cerebras.ai/blog/cerebras-is-coming-to-aws
