By Claudio Masolo
Publication Date: 2026-03-19 10:00:00
The Azure Kubernetes Service team shared a detailed guide on how to use Dynamic Resource Allocation (DRA) with NVIDIA vGPU technology on AKS. This update improves control and efficiency for shared GPU use in AI and media tasks.
Dynamic Resource Allocation (DRA) is now the standard for GPU resource use in Kubernetes. Rather than requesting a static, counted resource such as nvidia.com/gpu, workloads obtain GPUs dynamically through DeviceClasses and ResourceClaims. This change enhances scheduling and improves integration with virtualization technologies like NVIDIA vGPU.
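To make the contrast concrete, the following is a minimal sketch rather than an excerpt from the AKS guide. It builds two Pod manifests as plain Python dictionaries: one that requests a GPU as the counted resource nvidia.com/gpu, and one that references a ResourceClaimTemplate pointing at a DeviceClass. Several details are assumptions: the resource.k8s.io/v1 API version (earlier clusters expose beta versions with a slightly different request shape), the DeviceClass name gpu.nvidia.com, and the hypothetical object names and container image.

```python
import json

# Hypothetical names chosen for illustration; none come from the AKS guide.
CLAIM_TEMPLATE = "single-vgpu"
IMAGE = "registry.example/cuda-app:latest"
# Assumption: the DeviceClass name published by the GPU DRA driver on the cluster.
DEVICE_CLASS = "gpu.nvidia.com"


def static_pod() -> dict:
    """Pod that asks for a GPU as an opaque, counted extended resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "cuda-static"},
        "spec": {
            "containers": [{
                "name": "cuda",
                "image": IMAGE,
                "resources": {"limits": {"nvidia.com/gpu": 1}},
            }],
        },
    }


def claim_template() -> dict:
    """ResourceClaimTemplate requesting one device from the DeviceClass.

    Assumption: the resource.k8s.io/v1 schema, where the request is
    nested under 'exactly'.
    """
    return {
        "apiVersion": "resource.k8s.io/v1",
        "kind": "ResourceClaimTemplate",
        "metadata": {"name": CLAIM_TEMPLATE},
        "spec": {"spec": {"devices": {"requests": [{
            "name": "gpu",
            "exactly": {"deviceClassName": DEVICE_CLASS},
        }]}}},
    }


def dra_pod() -> dict:
    """Pod that obtains its GPU through a claim instead of a resource count."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "cuda-dra"},
        "spec": {
            "containers": [{
                "name": "cuda",
                "image": IMAGE,
                # The container refers to the pod-level claim by name.
                "resources": {"claims": [{"name": "gpu"}]},
            }],
            # A per-pod claim is generated from the template.
            "resourceClaims": [{
                "name": "gpu",
                "resourceClaimTemplateName": CLAIM_TEMPLATE,
            }],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    for manifest in (static_pod(), claim_template(), dra_pod()):
        print(json.dumps(manifest, indent=2))
```

In the claim-based version the Pod no longer names a quantity of hardware. It names a claim, and the scheduler matches that claim against the devices a driver advertises, which is what would let a virtualized device such as a vGPU be selected by class rather than by count.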
The reason for combining these technologies is clear: virtual accelerators like NVIDIA vGPU often handle smaller tasks. They allow one physical GPU to be split among many users or applications. This setup is helpful for enterprise AI/ML development, fine-tuning, and audio/visual processing. vGPU offers predictable performance while still providing CUDA capabilities to containerized workloads.
On the infrastructure side, this feature…