Dynamic DAG generation with YAML and DAG Factory in Amazon MWAA
AWS Big Data Blog
by Jayesh Shinde
10h ago
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed service that allows you to use a familiar Apache Airflow environment with improved scalability, availability, and security to enhance and scale your business workflows without the operational burden of managing the underlying infrastructure. In Airflow, Directed Acyclic Graphs (DAGs) are defined as Python code. Dynamic DAGs refer to the ability to generate DAGs on the fly during runtime, typically based on some external conditions, configurations, or parameters. Dynamic DAGs help you to create, schedule, and run tasks withi ..read more
How Salesforce optimized their detection and response platform using AWS managed services
AWS Big Data Blog
by Atul Khare
5d ago
This is a guest blog post co-authored with Atul Khare and Bhupender Panwar from Salesforce. Headquartered in San Francisco, Salesforce, Inc. is a cloud-based customer relationship management (CRM) software company building artificial intelligence (AI)-powered business applications that allow businesses to connect with their customers in new and personalized ways. The Salesforce Trust Intelligence Platform (TIP) log platform team is responsible for data pipeline and data lake infrastructure, providing log ingestion, normalization, persistence, search, and detection capability to ensure Salesfor ..read more
Amazon OpenSearch Service Under the Hood: OpenSearch Optimized Instances (OR1)
AWS Big Data Blog
by Bukhtawar Khan
6d ago
Amazon OpenSearch Service recently introduced the OpenSearch Optimized Instance family (OR1), which delivers up to 30% price-performance improvement over existing memory optimized instances in internal benchmarks, and uses Amazon Simple Storage Service (Amazon S3) to provide 11 9s of durability. With this new instance family, OpenSearch Service uses OpenSearch innovation and AWS technologies to reimagine how data is indexed and stored in the cloud. Today, customers widely use OpenSearch Service for operational analytics because of its ability to ingest high volumes of data while also providing ..read more
Power analytics as a service capabilities using Amazon Redshift
AWS Big Data Blog
by Sandipan Bhaumik
6d ago
Analytics as a service (AaaS) is a business model that uses the cloud to deliver analytic capabilities on a subscription basis. This model provides organizations with a cost-effective, scalable, and flexible solution for building analytics. The AaaS model accelerates data-driven decision-making through advanced analytics, enabling organizations to swiftly adapt to changing market trends and make informed strategic choices. Amazon Redshift is a cloud data warehouse service that offers real-time insights and predictive analytics capabilities for analyzing data from terabytes to petabytes. It off ..read more
Introducing Amazon MWAA larger environment sizes
AWS Big Data Blog
by Hernan Garcia
1w ago
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed service for Apache Airflow that streamlines the setup and operation of the infrastructure to orchestrate data pipelines in the cloud. Customers use Amazon MWAA to manage the scalability, availability, and security of their Apache Airflow environments. As they design more intensive, complex, and ever-growing data processing pipelines, customers have asked us for additional underlying resources to provide greater concurrency and capacity for their tasks and workflows. To address this, today, we are announcing the availability ..read more
Uplevel your data architecture with real-time streaming using Amazon Data Firehose and Snowflake
AWS Big Data Blog
by Swapna Bandla
1w ago
Today’s fast-paced world demands timely insights and decisions, which is driving the importance of streaming data. Streaming data refers to data that is continuously generated from a variety of sources. The sources of this data, such as clickstream events, change data capture (CDC), application and service logs, and Internet of Things (IoT) data streams are proliferating. Snowflake offers two options to bring streaming data into its platform: Snowpipe and Snowflake Snowpipe Streaming. Snowpipe is suitable for file ingestion (batching) use cases, such as loading large files from Amazon Simple S ..read more
Achieve near real time operational analytics using Amazon Aurora PostgreSQL zero-ETL integration with Amazon Redshift
AWS Big Data Blog
by Raks Khare
1w ago
“Data is at the center of every application, process, and business decision. When data is used to improve customer experiences and drive innovation, it can lead to business growth,” – Swami Sivasubramanian, VP of Database, Analytics, and Machine Learning at AWS, in “With a zero-ETL approach, AWS is helping builders realize near-real-time analytics.” Customers across industries are becoming more data driven and looking to increase revenue, reduce cost, and optimize their business operations by implementing near-real-time analytics on transactional data, thereby enhancing agility. Based on custom ..read more
Amazon DataZone announces integration with AWS Lake Formation hybrid access mode for the AWS Glue Data Catalog
AWS Big Data Blog
by Utkarsh Mittal
2w ago
Last week, we announced the general availability of the integration between Amazon DataZone and AWS Lake Formation hybrid access mode. In this post, we share how this new feature helps you simplify the way you use Amazon DataZone to enable secure and governed sharing of your data in the AWS Glue Data Catalog. We also delve into how data producers can share their AWS Glue tables through Amazon DataZone without needing to register them in Lake Formation first. Overview of the Amazon DataZone integration with Lake Formation hybrid access mode Amazon DataZone is a fully managed data management ser ..read more
How Aura from Unity revolutionized their big data pipeline with Amazon Redshift Serverless
AWS Big Data Blog
by Yonatan Dolan
2w ago
This post is co-written with Amir Souchami and Fabian Szenkier from Unity. Aura from Unity (formerly known as ironSource) is the market standard for creating rich device experiences that engage and retain customers. With a powerful set of solutions, Aura enables complete digital transformation, letting operators promote key services outside the store, directly on-device. Amazon Redshift is a recommended service for online analytical processing (OLAP) workloads such as cloud data warehouses, data marts, and other analytical data stores. You can use simple SQL to analyze structured a ..read more
Automate large-scale data validation using Amazon EMR and Apache Griffin
AWS Big Data Blog
by Dipal Mahajan
2w ago
Many enterprises are migrating their on-premises data stores to the AWS Cloud. During data migration, a key requirement is to validate all the data that has been moved from source to target. This data validation is a critical step, and if not done correctly, may result in the failure of the entire project. However, developing custom solutions to determine migration accuracy by comparing the data between the source and target can often be time-consuming. In this post, we walk through a step-by-step process to validate large datasets after migration using a configuration-based tool using Amazon ..read more
