Migrating enterprise ML workloads from Databricks to AWS for large scale ML | Amazon Web Services

Machine learning (ML) models operate directly in the critical path of ad delivery, influencing bidding, pricing, and campaign optimization under strict latency, reliability, and correctness requirements. These models are trained frequently on large volumes of historical auction data and produce deterministic artifacts that downstream serving systems rely on for consistent behavior in production.

Historically, Kargo, a digital advertising platform that powers real-time decisioning across billions of ad auctions every day, implemented their ML pipelines on Databricks. Spark-based notebooks handled data provisioning from Snowflake, feature aggregation, model fitting, and optimization, with intermediate datasets stored in Delta Lake tables. This approach enabled rapid iteration early on, but as pipelines expanded across multiple models and execution dimensions, it introduced increasing operational complexity.
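The stage sequence described above (data provisioning, feature aggregation, model fitting, artifact production) can be sketched in miniature. This is a hypothetical, pure-Python illustration of the pipeline's shape only: all function names, fields, and the toy scoring logic are invented for this sketch, and the real pipeline used Spark notebooks reading from Snowflake with intermediates in Delta Lake tables.

```python
# Hypothetical sketch of the pipeline stages: provision raw auction rows,
# aggregate per-campaign features, fit a simple model, and emit a
# deterministic artifact for downstream serving. All names are illustrative.
from collections import defaultdict
from statistics import mean

def provision_auction_data():
    # Stand-in for an extract of historical auction records from Snowflake.
    return [
        {"campaign": "c1", "bid": 1.2, "won": 1},
        {"campaign": "c1", "bid": 0.8, "won": 0},
        {"campaign": "c2", "bid": 2.0, "won": 1},
    ]

def aggregate_features(rows):
    # Stand-in for Spark feature aggregation into an intermediate table.
    by_campaign = defaultdict(list)
    for row in rows:
        by_campaign[row["campaign"]].append(row)
    return {
        c: {"avg_bid": mean(r["bid"] for r in rs),
            "win_rate": mean(r["won"] for r in rs)}
        for c, rs in by_campaign.items()
    }

def fit_model(features):
    # Stand-in for model fitting; sorted keys and rounding keep the
    # output artifact deterministic across runs.
    return {c: round(f["avg_bid"] * f["win_rate"], 4)
            for c, f in sorted(features.items())}

artifact = fit_model(aggregate_features(provision_auction_data()))
print(artifact)  # → {'c1': 0.5, 'c2': 2.0}
```

The point of the sketch is the deterministic artifact at the end: because serving systems depend on consistent model behavior, each stage must produce reproducible output for the same input data.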

As part of this evolution, shared modeling logic was…
