Tag: apache hive

How Amazon optimized its high-volume financial reconciliation process with Amazon EMR for higher scalability and performance | Amazon Web Services

Account reconciliation is an important step to ensure the completeness and accuracy of financial statements. Specifically, companies must reconcile balance sheet accounts that could...

Top News

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation | Amazon Web Services

With Amazon EMR 6.15, we launched AWS Lake Formation based fine-grained access controls (FGAC) on Open Table Formats (OTFs), including Apache Hudi, Apache Iceberg,...

GoDaddy benchmarking results in up to 24% better price-performance for their Spark workloads with AWS Graviton2 on Amazon EMR Serverless | Amazon Web Services

This is a guest post co-written with Mukul Sharma, Software Development Engineer, and Ozcan Ilikhan, Director of Engineering from GoDaddy. GoDaddy empowers everyday entrepreneurs...

Orchestrate Amazon EMR Serverless jobs with AWS Step Functions | Amazon Web Services

Amazon EMR Serverless provides a serverless runtime environment that simplifies the operation of analytics applications that use the latest open source frameworks, such as Apache Spark...

Capacity Management and Amazon EMR Managed Scaling improvements for Amazon EMR on EC2 clusters | Amazon Web Services

In 2022, we told you about the new enhancements we made in Amazon EMR Managed Scaling, which helped improve cluster utilization as well as...

How Ontraport reduced data processing cost by 80% with AWS Glue | Amazon Web Services

This post is written in collaboration with Elijah Ball from Ontraport. Customers are implementing data and analytics workloads in the AWS Cloud to optimize...

Query your Apache Hive metastore with AWS Lake Formation permissions | Amazon Web Services

Apache Hive is a SQL-based data warehouse system for processing highly distributed datasets on the Apache Hadoop platform. There are two key components to...

How Zoom implemented streaming log ingestion and efficient GDPR deletes using Apache Hudi on Amazon EMR | Amazon Web Services

In today’s digital age, logging is a critical aspect of application development and management, but efficiently managing logs while complying with data protection regulations...

Interact with Apache Iceberg tables using Amazon Athena and cross account fine-grained permissions using AWS Lake Formation

We recently announced support for AWS Lake Formation fine-grained access control policies in Amazon Athena queries for data stored in any supported file format...

Accelerate time to insight with Amazon SageMaker Data Wrangler and the power of Apache Hive

Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes in Amazon...

A Dive into Apache Flume: Installation, Setup, and Configuration

Apache Flume is a tool/service/data ingestion mechanism for gathering, aggregating, and delivering huge amounts of streaming data from diverse sources, such as log files,...

Top 20 Big Data Tools Used By Professionals in 2023

Big Data is a large and complex dataset that is generated by various sources and grows exponentially. It is so extensive and diverse that traditional data...

Achieve up to 27% better price-performance for Spark workloads with AWS Graviton2 on Amazon EMR Serverless

Amazon EMR Serverless is a serverless option in Amazon EMR that makes it simple to run applications using open-source analytics frameworks such as Apache...
