Red Hat OpenShift 4.21 Brings Smart GPU Allocation For AI Workloads

By Berry Zwets
Publication Date: 2026-02-04 13:25:00

Red Hat has launched OpenShift 4.21 with Dynamic Resource Allocation for GPUs, which allows high-end GPUs to be prioritized for AI training. GPU resources can also be scaled down to zero when idle to save money. The release also adds autoscaling to zero for hosted control planes and cross-cluster VM migration without downtime.

OpenShift 4.21 addresses a fundamental problem that AI teams face daily: GPU allocation that does not match their actual needs. Traditionally, teams simply requested a GPU and hoped it would meet their requirements. With the new Dynamic Resource Allocation (DRA), users specify exactly what they need, for example, “a GPU with at least 40GB VRAM.” The scheduler evaluates hardware attributes directly using Common Expression Language (CEL) expressions to find the right resources.

This eliminates manual node labeling. The system reads hardware capabilities and automatically matches them to workload requirements. This feature does require a vendor-provided operator or driver with DRA support.
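
The matching idea can be illustrated with a minimal Python sketch. It is not the OpenShift or Kubernetes API: the `Device` class, the `select_device` helper, the device names, and the choice to prefer the smallest GPU that still qualifies are all assumptions made for illustration. The predicate stands in for the attribute expression a user would supply.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Device:
    """Hypothetical record of what a driver might report about a GPU."""
    name: str
    memory_gb: int


def select_device(
    devices: Iterable[Device],
    predicate: Callable[[Device], bool],
) -> Optional[Device]:
    """Return the smallest device that satisfies the predicate, or None.

    Preferring the smallest match is an assumption of this sketch; it keeps
    larger GPUs free for workloads that actually need them.
    """
    candidates = [d for d in devices if predicate(d)]
    return min(candidates, key=lambda d: d.memory_gb, default=None)


# Hypothetical inventory; no node labels are involved, only reported attributes.
inventory = [
    Device("gpu-a", 16),
    Device("gpu-b", 48),
    Device("gpu-c", 80),
]


def needs_40gb(device: Device) -> bool:
    """The requirement from the article: a GPU with at least 40GB VRAM."""
    return device.memory_gb >= 40


chosen = select_device(inventory, needs_40gb)
print(chosen.name if chosen else "no matching device")
```

The workload states a requirement and the selection logic works from reported attributes, so nobody has to label nodes by hand to steer the placement.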

Cost optimization through autoscaling

Hosted control planes get native VerticalPodAutoscaler integration. Control plane components scale automatically based on real-time memory consumption rather than static estimates. In addition, control planes can now scale to zero during inactivity. The configuration and state are preserved, and they resume automatically when needed.
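
The scale-to-zero behavior can be pictured as a small reconcile step. The sketch below is only an illustration under stated assumptions: the `ControlPlane` class, the `reconcile` function, the idle threshold, and the fallback of one replica are hypothetical and do not describe how OpenShift implements the feature.

```python
from dataclasses import dataclass


@dataclass
class ControlPlane:
    """Hypothetical control plane state used only for this sketch."""
    replicas: int
    saved_replicas: int = 0  # remembered so the previous size can be restored


def reconcile(
    cp: ControlPlane,
    idle_seconds: int,
    idle_threshold_seconds: int,
    has_pending_request: bool,
) -> ControlPlane:
    """Scale to zero after sustained inactivity and resume on demand."""
    if cp.replicas > 0 and not has_pending_request:
        if idle_seconds >= idle_threshold_seconds:
            # Preserve the previous size before scaling down.
            cp.saved_replicas = cp.replicas
            cp.replicas = 0
    elif cp.replicas == 0 and has_pending_request:
        # Resume with the preserved size; fall back to one replica.
        cp.replicas = cp.saved_replicas or 1
    return cp


cp = ControlPlane(replicas=3)
reconcile(cp, idle_seconds=3600, idle_threshold_seconds=1800, has_pending_request=False)
print(cp.replicas)  # scaled down while idle
reconcile(cp, idle_seconds=0, idle_threshold_seconds=1800, has_pending_request=True)
print(cp.replicas)  # restored when a request arrives
```

Keeping the previous replica count alongside the running state mirrors the article's point that configuration and state survive the idle period.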

NodePools follow the same pattern and scale to zero nodes in development and test…

