Tag: spark sql

Use your corporate identities for analytics with Amazon EMR and AWS IAM Identity Center | Amazon Web Services

Big Data April 26, 2024

To enable your workforce users for analytics with fine-grained data access controls and audit data access, you might have to create multiple AWS Identity...

Optimize data layout by bucketing with Amazon Athena and AWS Glue to accelerate downstream queries | Amazon Web Services

Big Data April 25, 2024

Run interactive workloads on Amazon EMR Serverless from Amazon EMR Studio | Amazon Web Services

Big Data April 24, 2024

7 Steps to Mastering Data Engineering – KDnuggets

Big Data April 12, 2024

Working with Window Functions in PySpark

Big Data March 27, 2024

Learning about Window Functions in PySpark can be challenging but is worth the effort. Window Functions are a powerful tool for analyzing data and can...

5 Free Courses to Master SQL for Data Science – KDnuggets

Big Data March 6, 2024

SQL is a must-have skill for all data professionals. But achieving mastery in SQL is a continuous journey. Here we’ve compiled a...

Mastering market dynamics: Transforming transaction cost analytics with ultra-precise Tick History – PCAP and Amazon Athena for Apache Spark | Amazon Web Services

Big Data January 31, 2024

This post is cowritten with Pramod Nayak, LakshmiKanth Mannem and Vivek Aggarwal from the Low Latency Group of LSEG. ...

The Only Free Course You Need To Become a Professional Data Engineer – KDnuggets

Big Data January 26, 2024

There are many courses and resources available on machine learning and data science, but very few on data engineering. This raises...

Use Amazon Athena with Spark SQL for your open-source transactional table formats | Amazon Web Services

Big Data January 24, 2024

AWS-powered data lakes, supported by the unmatched availability of Amazon Simple Storage Service (Amazon S3), can handle the scale, agility, and flexibility required to...

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation | Amazon Web Services

Big Data January 17, 2024

With Amazon EMR 6.15, we launched AWS Lake Formation based fine-grained access controls (FGAC) on Open Table Formats (OTFs), including Apache Hudi, Apache Iceberg,...

Run Spark SQL on Amazon Athena Spark | Amazon Web Services

Big Data October 23, 2023

At AWS re:Invent 2022, Amazon Athena launched support for Apache Spark. With this launch, Amazon Athena supports two open-source query engines: Apache Spark and...

Migrate your existing SQL-based ETL workload to an AWS serverless ETL infrastructure using AWS Glue | Amazon Web Services

Big Data July 31, 2023

Data has become an integral part of most companies, and the complexity of data processing is increasing rapidly with the exponential growth in the...

Latest Intelligence

10 Best Data Analytics Projects

Big Data May 21, 2023

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

Big Data April 24, 2023

Amazon EMR on EKS widens the performance gap: Run Apache Spark workloads 5.37 times faster and at 4.3 times lower cost

Big Data April 12, 2023

Implement column-level encryption to protect sensitive data in Amazon Redshift with AWS Glue and AWS Lambda user-defined functions

Big Data April 5, 2023
