Introducing bidirectional streaming for real-time inference on Amazon SageMaker AI | Amazon Web Services


In 2025, generative AI has evolved beyond text generation to multimodal use cases, from audio transcription and translation to voice agents that require real-time data streaming. Today's applications demand something more: continuous, real-time dialogue between users and models, with data flowing both ways simultaneously over a single persistent connection. Consider a speech-to-text use case, where you need to stream audio as input while receiving the transcribed text as a continuous output stream. Such use cases require bidirectional streaming.

We’re introducing bidirectional streaming for Amazon SageMaker AI Inference, which transforms inference from a transactional exchange into a continuous conversation. Real-time speech applications work best when conversations flow naturally, without interruptions. With bidirectional streaming, speech-to-text becomes immediate: the model listens and transcribes at the same time, so…
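To illustrate the pattern (not the SageMaker API itself, whose call signatures are not shown here), the following is a minimal sketch of bidirectional streaming in Python: a sender pushes audio chunks while a receiver consumes transcripts concurrently over one logical connection. The duplex channel is simulated with in-process queues, and the `transcribe` coroutine is a hypothetical stand-in for the model endpoint.

```python
import asyncio


async def stream_audio(send_q: asyncio.Queue) -> None:
    # Client side: stream audio chunks upstream as they are captured.
    for chunk in [b"chunk-1", b"chunk-2", b"chunk-3"]:
        await send_q.put(chunk)
        await asyncio.sleep(0)  # yield so the other tasks can run concurrently
    await send_q.put(None)  # end-of-stream marker


async def transcribe(send_q: asyncio.Queue, recv_q: asyncio.Queue) -> None:
    # Hypothetical model side: emits a partial transcript per audio chunk
    # while input is still arriving (listening and transcribing at once).
    while (chunk := await send_q.get()) is not None:
        await recv_q.put(f"transcript for {chunk.decode()}")
    await recv_q.put(None)


async def receive_transcripts(recv_q: asyncio.Queue, out: list) -> None:
    # Client side: consume transcripts downstream as they become available.
    while (text := await recv_q.get()) is not None:
        out.append(text)


async def main() -> list:
    # One "connection" carries both directions; all three loops run concurrently.
    send_q, recv_q = asyncio.Queue(), asyncio.Queue()
    transcripts: list = []
    await asyncio.gather(
        stream_audio(send_q),
        transcribe(send_q, recv_q),
        receive_transcripts(recv_q, transcripts),
    )
    return transcripts


print(asyncio.run(main()))
```

The key property is that neither side waits for the other to finish: input and output interleave over the same persistent channel, which is what distinguishes bidirectional streaming from request/response or one-way response streaming.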

https://aws.amazon.com/blogs/machine-learning/introducing-bidirectional-streaming-for-real-time-inference-on-amazon-sagemaker-ai/