Upgrade your data monitoring with Amazon OpenSearch Service’s seamless integration with Amazon S3, eliminating the need for ETL processes | Amazon Web Services

Amazon OpenSearch Service has announced the availability of zero-ETL integration with Amazon Simple Storage Service (Amazon S3) for domains running OpenSearch 2.13 and above. The integration lets customers query operational logs in Amazon S3 from OpenSearch Service without switching between tools. Users can perform forensic analysis of operational and security events by correlating data across OpenSearch Service and S3 datasets. The integration supports AWS’s zero-ETL vision by reducing operational complexity and giving direct query access to operational data, which saves time and cost.

OpenSearch is a distributed search and analytics suite derived from Elasticsearch 7.10. Amazon OpenSearch Service has tens of thousands of active customers, with hundreds of thousands of clusters processing trillions of requests per month. Amazon S3 is an object storage service that provides scalability, data security, and performance for use cases such as data lakes, cloud applications, and mobile apps. Its cost-effective storage classes and management features help organizations optimize costs and meet specific compliance requirements.

With zero-ETL integration, users can run OpenSearch Service SQL and PPL analytics on data stored in Amazon S3. Direct queries analyze operational logs and data lakes in place, so there is no need to duplicate data or build complex ETL pipelines. Customers can access operational analytics and visualizations directly within OpenSearch Service. Popular AWS log types stored in Amazon S3, such as VPC Flow Logs and Elastic Load Balancing logs, can be queried and analyzed from OpenSearch Service with the new integration.
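
To make the direct query idea concrete, here is a minimal Python sketch that builds the request body for a SQL query against an S3-backed data source. It assumes queries are submitted to the SQL plugin's async query endpoint (`/_plugins/_async_query`) with `datasource`, `lang`, and `query` fields; check the path and field names against the documentation for your domain version. The data source, database, table, and column names are hypothetical, and the sketch only builds the payload without sending it.

```python
import json

# Assumption: direct queries go through the SQL plugin's async query endpoint.
# Verify this path against the documentation for your domain version.
ASYNC_QUERY_PATH = "/_plugins/_async_query"


def build_direct_query(datasource: str, query: str, lang: str = "sql") -> dict:
    """Build the JSON body for a direct query against an S3-backed data source."""
    if lang not in ("sql", "ppl"):
        raise ValueError(f"unsupported query language: {lang}")
    return {"datasource": datasource, "lang": lang, "query": query}


# Hypothetical data source, database, table, and column names.
sql = (
    "SELECT srcaddr, COUNT(*) AS rejected "
    "FROM my_s3_source.default.vpc_flow_logs "
    "WHERE action = 'REJECT' "
    "GROUP BY srcaddr ORDER BY rejected DESC LIMIT 10"
)
body = build_direct_query("my_s3_source", sql)

# The payload would be POSTed to ASYNC_QUERY_PATH with a signed HTTP request.
print(ASYNC_QUERY_PATH)
print(json.dumps(body, indent=2))
```

The query reads the log files where they sit in Amazon S3, so nothing is copied into an OpenSearch index before the query runs.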

A real-world example from financial services provider Arcesium shows the benefit: large volumes of log data stored in Amazon S3 can be analyzed without keeping costly OpenSearch clusters online. When setting up direct queries with Amazon S3, users can choose to ingest only metadata into OpenSearch Service, which accelerates queries while the data itself stays in S3. Three acceleration options trade query performance against cost: skipping indexes, which ingest only metadata about the S3 data; covering indexes, which ingest the data from selected columns; and materialized views, which store the results of a query.
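
The three acceleration types can be illustrated with a short Python sketch that generates the corresponding SQL statements. The statement shapes follow my understanding of the OpenSearch acceleration syntax (`CREATE SKIPPING INDEX`, `CREATE INDEX`, `CREATE MATERIALIZED VIEW`) and should be checked against the current documentation; refresh options are omitted, and every table, index, view, and column name is hypothetical.

```python
def skipping_index_ddl(table: str, columns: dict) -> str:
    """Skipping index: stores only metadata (for example partitions or value sets)."""
    cols = ", ".join(f"{name} {kind}" for name, kind in columns.items())
    return f"CREATE SKIPPING INDEX ON {table} ({cols})"


def covering_index_ddl(index_name: str, table: str, columns: list) -> str:
    """Covering index: ingests the data of the listed columns."""
    return f"CREATE INDEX {index_name} ON {table} ({', '.join(columns)})"


def materialized_view_ddl(view_name: str, select_query: str) -> str:
    """Materialized view: stores the result of a query."""
    return f"CREATE MATERIALIZED VIEW {view_name} AS {select_query}"


# Hypothetical table in an S3-backed data source.
TABLE = "my_s3_source.default.vpc_flow_logs"

statements = [
    skipping_index_ddl(TABLE, {"year": "PARTITION", "action": "VALUE_SET"}),
    covering_index_ddl("flow_summary", TABLE, ["srcaddr", "dstaddr", "action"]),
    materialized_view_ddl(
        "my_s3_source.default.rejects_by_source",
        f"SELECT srcaddr, COUNT(*) AS rejected FROM {TABLE} "
        "WHERE action = 'REJECT' GROUP BY srcaddr",
    ),
]
for statement in statements:
    print(statement)
```

A skipping index is the cheapest of the three because it stores the least data in OpenSearch Service, while covering indexes and materialized views ingest more in exchange for faster queries and dashboards.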

Direct queries are billed in OpenSearch Compute Units (OCUs) according to the resources that workloads consume, which makes the feature well suited to infrequently queried data. Users can create budgets and set alerts on OCU usage to manage costs. The integration also provides prebuilt dashboards, visualizations, and mapping templates that simplify data exploration and analysis. Overall, the zero-ETL integration with Amazon S3 gives users a powerful tool for event analysis, combining real-time analysis in OpenSearch Service with cost-effective storage in Amazon S3 for post-event analysis and correlation.
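
The budgeting idea can be sketched as simple arithmetic in Python. This is only an illustration: the price per OCU-hour below is a placeholder rather than a real AWS rate, the 80% alert threshold is an arbitrary choice, and in practice budgets and alerts would be configured in AWS instead of computed locally.

```python
def estimate_cost(ocu_hours: float, price_per_ocu_hour: float) -> float:
    """Estimate the direct query cost for a period from the OCU-hours consumed."""
    if ocu_hours < 0 or price_per_ocu_hour < 0:
        raise ValueError("usage and price must be non-negative")
    return ocu_hours * price_per_ocu_hour


def budget_status(cost: float, budget: float, alert_fraction: float = 0.8) -> str:
    """Classify spend against a budget as 'ok', 'alert', or 'over_budget'."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    if cost >= budget:
        return "over_budget"
    if cost >= alert_fraction * budget:
        return "alert"
    return "ok"


# Placeholder numbers: 40 OCU-hours at a hypothetical 0.25 per OCU-hour.
cost = estimate_cost(ocu_hours=40, price_per_ocu_hour=0.25)
print(cost, budget_status(cost, budget=12.0))
```

Because cost scales with the OCU-hours actually consumed, data that is queried rarely incurs little compute cost between queries.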

Article Source
https://aws.amazon.com/blogs/big-data/modernize-your-data-observability-with-amazon-opensearch-service-zero-etl-integration-with-amazon-s3/