SEATTLE and SUNNYVALE, Calif., March 16, 2026 — Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company, and Cerebras Systems have announced a collaboration that the companies claim will, in the coming months, deliver the fastest AI inference solutions available for generative AI applications and LLM workloads. The solution, to be deployed on Amazon Bedrock in AWS data centers, combines AWS Trainium-powered servers, Cerebras CS-3 systems, and Elastic Fabric Adapter (EFA) networking. Later this year, AWS will also offer leading open-source LLMs and Amazon Nova running on Cerebras hardware.
“Inference is where AI delivers real value to customers, but speed remains a critical bottleneck for demanding workloads like real-time coding assistance and interactive applications,” said David Brown, Vice President, Compute & ML Services, AWS. “What we’re building with Cerebras…