Zephyrnet Logo

Tag: delta lake

Guide to Migrating from Databricks Delta Lake to Apache Iceberg

Introduction In the fast changing world of big data processing and analytics, the potential management of extensive datasets serves as a foundational pillar for companies...

Top News

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation | Amazon Web Services

With Amazon EMR 6.15, we launched AWS Lake Formation based fine-grained access controls (FGAC) on Open Table Formats (OTFs), including Apache Hudi, Apache Iceberg,...

Introducing Apache Hudi support with AWS Glue crawlers | Amazon Web Services

Apache Hudi is an open table format that brings database and data warehouse capabilities to data lakes. Apache Hudi helps data engineers manage complex challenges, such as...

Spark on AWS Lambda: An Apache Spark runtime for AWS Lambda | Amazon Web Services

Spark on AWS Lambda (SoAL) is a framework that runs Apache Spark workloads on AWS Lambda. It’s designed for both batch and event-based workloads,...

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi | Amazon Web Services

The Analytics specialty practice of AWS Professional Services (AWS ProServe) helps customers across the globe with modern data architecture implementations on the AWS Cloud....

Modern Data Engineering with MAGE: Empowering Efficient Data Processing

Introduction In today’s data-driven world, organizations across industries are dealing with massive volumes of data, complex pipelines, and the need for efficient data processing. Traditional...

Testing and Monitoring Data Pipelines: Part One – DATAVERSITY

Suppose you’re in charge of maintaining a large set of data pipelines from cloud storage or streaming data into a data warehouse. How can...

Data Warehouse vs. Data Lakehouse – DATAVERSITY

The phrase “data warehouse vs. data lakehouse” offers an exciting topic for ongoing debate in the global Data Management world. While businesses have relied on traditional data warehouses...

Implement slowly changing dimensions in a data lake using AWS Glue and Delta

In a data warehouse, a dimension is a structure that categorizes facts and measures in order to enable users to answer business questions. To...

Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 2: AWS Glue Studio Visual Editor

In the first post of this series, we described how AWS Glue for Apache Spark works with Apache Hudi, Linux Foundation Delta Lake, and...

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

Building data lakes from continuously changing transactional data of databases and keeping data lakes up to date is a complex task and can be...

AWS Lake Formation 2022 year in review

Data governance is the collection of policies, processes, and systems that organizations use to ensure the quality and appropriate handling of their data throughout...

Handle UPSERT data operations using open-source Delta Lake and AWS Glue

Many customers need an ACID transaction (atomic, consistent, isolated, durable) data lake that can log change data capture (CDC) from operational data sources. There...

Latest Intelligence

spot_img
spot_img