At re:Invent 2025, we announced serverless storage for Amazon EMR Serverless, eliminating the need to provision local disk storage for Apache Spark workloads. Serverless storage for EMR Serverless reduces data processing costs by up to 20% and helps prevent job failures caused by disk capacity constraints.
In this post, we explore the cost improvements we observed when benchmarking Apache Spark jobs with serverless storage on EMR Serverless. We take a closer look at how serverless storage reduces costs for shuffle-heavy Spark workloads, and we offer practical guidance on identifying the types of queries that benefit most from enabling serverless storage in your EMR Serverless Spark jobs.
Benchmark results for EMR 7.12 with serverless storage against standard disks
We benchmarked performance and cost savings using the TPC-DS dataset at 3 TB scale, running 100+ queries that included a mix of high-shuffle and low-shuffle operations….

