Speeding up PyTorch inference using torch.compile on AWS Graviton processors | Amazon Web Services
PyTorch 2.0 introduced torch.compile to accelerate PyTorch code over the default eager mode, delivering up to 2 times better performance for Hugging Face model inference and up to 1.35 times better performance for TorchBench model inference across various models on AWS Graviton3. AWS optimized PyTorch's torch.compile feature for Graviton3 processors to achieve these …