Google Extends Kubernetes Service to Safely Run Agentic AI Workloads

By Mike Vizard
Publication Date: 2025-11-13 12:43:00

Google this week at the KubeCon + CloudNativeCon North America 2025 conference revealed it is making available a technical preview of a sandbox capability on the Google Kubernetes Engine (GKE) service that can be used to optimally run and secure agentic artificial intelligence (AI) workloads.

Additionally, Google is now making available a GKE Inference Gateway that reduces Time-to-First-Token (TTFT) latency by 96% and token costs by as much as 25%. Google has also added a Pod Snapshots capability that makes it simpler to restore pods in the event of a node failure.

Google is also adding a GKE Buffers application programming interface (API) for near-instant capacity and an Autopilot compute class for standard clusters to streamline provisioning of infrastructure resources.

Finally, Google announced that Google Cloud is doubling the capacity of GKE, which can now support clusters of 130,000 nodes.

Dave Bartoletti, a senior product manager for Google Cloud, said…