5 Python Data Processing Tips & Code Snippets

This is a small collection of Python code snippets that a beginner might find useful for data processing.


Figure: Photo by Hitesh Choudhary on Unsplash

This article contains 5 useful Python code snippets that a beginner might find helpful for data processing.

Python is a flexible, general-purpose programming language that offers many ways to accomplish the same task. Each snippet below shows one such approach for a given situation; you might find it useful as is, or you may already know another approach that makes more sense to you.

1. Concatenating Multiple Text Files

 
Let’s start with concatenating multiple text files. Should you have a number of text files in a single directory you need concatenated into a single file, this Python code will do so.

First we get a list of all the txt files in the path; then we read in each file and write out its contents to the new output file; finally, we read the new file back in and print its contents to screen to verify.

import glob

# Load all txt files in path (sorted, since glob returns files in arbitrary order)
files = sorted(glob.glob('/path/to/files/*.txt'))

# Concatenate files to new file
with open('2020_output.txt', 'w') as out_file:
    for file_name in files:
        with open(file_name) as in_file:
            out_file.write(in_file.read())

# Read file and print
with open('2020_output.txt', 'r') as new_file:
    lines = [line.strip() for line in new_file]
for line in lines:
    print(line)


file 1 line 1
file 1 line 2
file 1 line 3
file 2 line 1
file 2 line 2
file 2 line 3
file 3 line 1
file 3 line 2
file 3 line 3
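For larger files, reading each file fully into memory may not be desirable. The following is a minimal sketch of a streaming variant of the same idea, using `shutil.copyfileobj` from the standard library. The function name `concatenate_text_files`, the sample file names, and the `combined.out` output name are hypothetical, chosen only so the example runs anywhere inside a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def concatenate_text_files(source_dir, output_path):
    """Stream every .txt file in source_dir (sorted by name) into output_path."""
    source_files = sorted(Path(source_dir).glob('*.txt'))
    with open(output_path, 'w') as out_file:
        for source_file in source_files:
            with open(source_file) as in_file:
                # Copy in chunks rather than loading the whole file at once
                shutil.copyfileobj(in_file, out_file)
    return source_files


# Demonstration with throwaway files in a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    for i in (1, 2):
        Path(tmp_dir, f'file_{i}.txt').write_text(
            f'file {i} line 1\nfile {i} line 2\n')
    # The output name does not match *.txt, so it is never picked up as input
    output_path = Path(tmp_dir, 'combined.out')
    concatenate_text_files(tmp_dir, output_path)
    combined_text = output_path.read_text()

print(combined_text)
```

Note that if the output file lived in the same directory and matched the `*.txt` pattern, a second run would also concatenate the previous output, which is why the sketch gives it a different extension.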


2. Concatenating Multiple CSV Files Into a DataFrame

 
Staying with the theme of file concatenation, this time let’s tackle concatenating a number of comma-separated value (CSV) files into a single Pandas dataframe.

We first get a list of the CSV files in our path; then, for each file in the path, we read the contents into its own dataframe; afterwards, we combine all dataframes into a single frame; finally, we print out the results to inspect.

import glob

import pandas as pd

# Load all csv files in path (sorted, since glob returns files in arbitrary order)
files = sorted(glob.glob('/path/to/files/*.csv'))

# Create a list of dataframes, one dataframe per CSV
fruit_list = []
for file_name in files:
    df = pd.read_csv(file_name, index_col=None, header=None)
    fruit_list.append(df)

# Create combined frame out of list of individual frames
fruit_frame = pd.concat(fruit_list, axis=0, ignore_index=True)
print(fruit_frame)


            0   1    2
0      grapes   3  5.5
1      banana   7  6.8
2       apple   2  2.3
3      orange   9  7.2
4  blackberry  12  4.3
5   starfruit  13  8.9
6  strawberry   9  8.3
7        kiwi   7  2.7
8   blueberry   2  7.6
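Once the frames are combined, it is no longer obvious which file each row came from. As a sketch of one way to keep that information, the variant below names the columns and tags each row with its source file before concatenating. The column names (`fruit`, `count`, `price`, `source_file`) and the sample files `a.csv` and `b.csv` are assumptions made for illustration; they do not come from the data above.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Two small headerless CSV files standing in for real data
    with open(os.path.join(tmp_dir, 'a.csv'), 'w') as f:
        f.write('grapes,3,5.5\nbanana,7,6.8\n')
    with open(os.path.join(tmp_dir, 'b.csv'), 'w') as f:
        f.write('apple,2,2.3\n')

    frames = []
    for file_name in sorted(glob.glob(os.path.join(tmp_dir, '*.csv'))):
        # header=None plus names= assigns column labels to headerless files
        df = pd.read_csv(file_name, header=None,
                         names=['fruit', 'count', 'price'])
        # Record the originating file on every row
        df['source_file'] = os.path.basename(file_name)
        frames.append(df)

labeled_frame = pd.concat(frames, axis=0, ignore_index=True)
print(labeled_frame)
```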


3. Zip & Unzip Files to Pandas

 
Let’s say you are working with a Pandas dataframe, such as the resulting frame in the above snippet, and want to compress the frame directly to file for storage. This snippet will do so.

First we will create a dataframe to use with our example; then we will compress and save the dataframe directly to file; finally, we will read the compressed file back into a new frame and print it out for verification.

import pandas as pd

# Create a dataframe to use
df = pd.DataFrame({'col_A': ['kiwi', 'banana', 'apple'],
                   'col_B': ['pineapple', 'grapes', 'grapefruit'],
                   'col_C': ['blueberry', 'grapefruit', 'orange']})

# Compress and save dataframe to file
df.to_csv('sample_dataframe.csv.zip', index=False, compression='zip')
print('Dataframe compressed and saved to file')

# Read compressed zip file into dataframe
df = pd.read_csv('sample_dataframe.csv.zip')
print(df)


Dataframe compressed and saved to file
    col_A       col_B       col_C
0    kiwi   pineapple   blueberry
1  banana      grapes  grapefruit
2   apple  grapefruit      orange
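The same round trip also works with other compression formats. As a small sketch, the example below relies on the default `compression='infer'` behavior of `to_csv` and `read_csv`, which picks the format from the file extension (here `.gz` for gzip). The file name and the use of a temporary directory are assumptions made so the example cleans up after itself.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'col_A': ['kiwi', 'banana', 'apple'],
                   'col_B': ['pineapple', 'grapes', 'grapefruit']})

with tempfile.TemporaryDirectory() as tmp_dir:
    gz_path = os.path.join(tmp_dir, 'sample_dataframe.csv.gz')
    # No compression argument: the .gz extension selects gzip on write...
    df.to_csv(gz_path, index=False)
    # ...and on read
    restored = pd.read_csv(gz_path)

print(restored)
```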


4. Flatten Lists

 
Perhaps you have a situation where you are working with a list of lists, that is, a list in which all of its elements are also lists. This snippet will take this list of embedded lists and flatten it out to one linear list.

First we will create a list of lists to use in our example; then we will use list comprehensions to flatten the list in a Pythonic manner; finally, we print the resulting list to screen for verification.

# Create a list of lists (a list where all of its elements are lists)
list_of_lists = [['apple', 'pear', 'banana', 'grapes'],
                 ['zebra', 'donkey', 'elephant', 'cow'],
                 ['vanilla', 'chocolate'],
                 ['princess', 'prince']]

# Flatten the list of lists into a single list
flat_list = [element for sub_list in list_of_lists for element in sub_list]

# Print both to compare
print(f'List of lists:\n{list_of_lists}')
print(f'Flattened list:\n{flat_list}')


List of lists:
[['apple', 'pear', 'banana', 'grapes'], ['zebra', 'donkey', 'elephant', 'cow'], ['vanilla', 'chocolate'], ['princess', 'prince']]
Flattened list:
['apple', 'pear', 'banana', 'grapes', 'zebra', 'donkey', 'elephant', 'cow', 'vanilla', 'chocolate', 'princess', 'prince']
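As an alternative to the list comprehension, the standard library's `itertools.chain.from_iterable` performs the same one-level flattening. The sketch below uses shortened sample data (an assumption for brevity) and also shows that, like the comprehension, it flattens only one level of nesting.

```python
from itertools import chain

list_of_lists = [['apple', 'pear'], ['zebra', 'donkey'], ['vanilla']]

# chain.from_iterable yields the elements of each sub-list in turn
flat_list = list(chain.from_iterable(list_of_lists))
print(flat_list)

# Deeper nesting is left intact: only the outermost level is removed
nested = [[1, [2, 3]], [4]]
partially_flat = list(chain.from_iterable(nested))
print(partially_flat)
```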


5. Sort List of Tuples

 
This snippet sorts a list of tuples by a specified element. Tuples are an often overlooked Python data structure, and are a great way to store related pieces of data without using a more complex structure type.

In this example, we will first create a list of tuples of size 2, and fill them with numeric data; next we will sort the pairs, separately by both first and second elements, printing the results of both sorting processes to inspect the results; finally, we will extend this sorting to mixed alphanumeric data elements.

# Some paired data
pairs = [(1, 10.5), (5, 7.), (2, 12.7), (3, 9.2), (7, 11.6)]

# Sort pairs by first entry
sorted_pairs = sorted(pairs, key=lambda x: x[0])
print(f'Sorted by element 0 (first element):\n{sorted_pairs}')

# Sort pairs by second entry
sorted_pairs = sorted(pairs, key=lambda x: x[1])
print(f'Sorted by element 1 (second element):\n{sorted_pairs}')

# Extend this to tuples of size n and non-numeric entries
pairs = [('banana', 3), ('apple', 11), ('pear', 1), ('watermelon', 4), ('strawberry', 2), ('kiwi', 12)]
sorted_pairs = sorted(pairs, key=lambda x: x[0])
print(f'Alphanumeric pairs sorted by element 0 (first element):\n{sorted_pairs}')


Sorted by element 0 (first element):
[(1, 10.5), (2, 12.7), (3, 9.2), (5, 7.0), (7, 11.6)]
Sorted by element 1 (second element):
[(5, 7.0), (3, 9.2), (1, 10.5), (7, 11.6), (2, 12.7)]
Alphanumeric pairs sorted by element 0 (first element):
[('apple', 11), ('banana', 3), ('kiwi', 12), ('pear', 1), ('strawberry', 2), ('watermelon', 4)]
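The same `sorted` call extends to descending order and to tie-breaking on a second element. The sketch below uses `operator.itemgetter` in place of a lambda and a tuple-valued key for the tie-break; the sample data, including the duplicated count of 3, is made up to show how ties behave.

```python
from operator import itemgetter

pairs = [('banana', 3), ('apple', 11), ('pear', 1), ('kiwi', 3)]

# Largest count first; itemgetter(1) is equivalent to lambda x: x[1].
# sorted() is stable, so 'banana' stays ahead of 'kiwi' (both have count 3).
by_count_desc = sorted(pairs, key=itemgetter(1), reverse=True)
print(by_count_desc)

# Sort by count, breaking ties alphabetically by name
by_count_then_name = sorted(pairs, key=lambda pair: (pair[1], pair[0]))
print(by_count_then_name)
```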


And there you have 5 Python snippets which may be helpful to beginners for a few different data processing tasks.


Source: https://www.kdnuggets.com/2021/07/python-tips-snippets-data-processing.html

U.K.’s Wise to join the New Payments Platform in Australia

SYDNEY (Reuters) – Digital money transfer group Wise Plc will join an Australian payments network which should allow transfers to be settled in the country faster and at lower cost, Chief Executive Officer Kristo Kaarmann said on Friday.

The company will become a direct participant and shareholder in Australia’s New Payments Platform (NPP), Kaarmann said.

Wise said that joining the NPP will allow it to lower its average price of money transfers in or out of Australia by bypassing middlemen to clear and settle real-time payments instantly.

Kaarmann did not say how much lower its rates would be after joining the NPP. It charges about 0.56% on its Australian transfers currently, the company said.

That compares with the average 5% to 6% the country’s major banks charge, according to Wise’s calculations.

The 10-year-old financial technology company is regulated in Britain, the United States, Singapore and, among others, Australia, where it also holds a banking licence. But in many of the more than 80 countries where it offers remittances, Wise partners with banks to hold deposits, which increases its costs and prices.

“Our average cost … is already many multiples cheaper than the banks,” London-based Kaarmann said. “We want to get as close to zero as possible, in terms of cost.”

The firm, whose market debut in July became the London stock exchange’s largest ever tech listing, estimates it handles about 1% to 2% of transfers by consumers and small and medium-sized businesses globally.

(Reporting by Paulina Duran in Sydney; Editing by Christian Schmollinger)

Image Credit: Reuters


Source: https://datafloq.com/read/uks-wise-join-new-payments-platform-australia/18166

Google to slash amount it keeps from sales on its cloud marketplace: CNBC

(Reuters) - Alphabet Inc’s Google will take a smaller cut when customers buy software from other vendors on its cloud marketplace, CNBC reported on Sunday.

The Google Cloud Platform is cutting its percentage revenue share to 3% from 20%, CNBC said, citing a person familiar with the matter. https://cnb.cx/2XZp7ep

“Our goal is to provide partners with the best platform and most competitive incentives in the industry. We can confirm that a change to our Marketplace fee structure is in the works and we’ll have more to share on this soon,” a Google Cloud spokesperson said in a statement to Reuters.

Earlier this year, Google cut the service fee it charges developers on its app store by half on the first $1 million they earn in revenue in a year.

(Reporting by Juby Babu in Bengaluru; Editing by Daniel Wallis)

Image Credit: Reuters


Source: https://datafloq.com/read/google-slash-amount-keeps-sales-cloud-marketplace-cnbc/18165

Canada foreign minister says eyes wide open when it comes to normalizing China ties

TORONTO (Reuters) – Canada’s “eyes are wide open” when it comes to normalizing its relationship with China, Foreign Minister Marc Garneau said on Sunday, after three years of rocky ties with Beijing since the arrest and Friday’s release of a Huawei Technologies executive.

Garneau told CBC News the government is now following a four-fold approach to China: “coexist,” “compete,” “co-operate,” and “challenge.”

Huawei CFO Meng Wanzhou, the daughter of Huawei founder Ren Zhengfei, flew back to China after reaching an agreement with U.S. prosecutors to end a bank fraud case against her. That resulted in the scrapping of her nearly three-year extradition battle in a Canadian court.

Soon after Meng flew to China, Michael Kovrig and Michael Spavor – the two Canadians detained by Chinese authorities just days after Meng’s arrest in December 2018 – were released by Beijing.

“There was no path to a relationship with China as long as the two Michaels were being detained,” Garneau said.

Prime Minister Justin Trudeau and Garneau received the two Canadians on Saturday when they arrived in the western Canadian city of Calgary after spending more than 1,000 days in solitary confinement.

Trudeau, who won a third term in the Sept. 20 election after a tight race, had vowed to improve ties with China since he first became prime minister in 2015, building on his father’s success in establishing diplomatic ties with China in 1970.

But even before Meng’s arrest, Canada’s repeated questioning of China’s human rights issues has irked Beijing, and the two countries have failed to come closer.

China has always denied any link between Meng’s extradition case and the detention of the two Canadians, but Garneau said, “the immediate return of the two Michaels linked” it to Meng’s case in a “very direct manner.”

Garneau also said he didn’t think the timing of the men’s return had anything to do with the timing of the federal election.

“I think it just worked out that way.”

(Reporting by Denny Thomas; editing by Grant McCool)

Image Credit: Reuters


Source: https://datafloq.com/read/canada-foreign-minister-says-eyes-wide-open-comes-normalizing-china-ties/18164

EU says U.S. trade, tech council to boost its clout, set rules for 21st century

By Foo Yun Chee

BRUSSELS (Reuters) – The U.S.-EU Trade and Technology Council (TTC) will give Europe more clout and set standards and rules for the 21st century, the EU’s trade and digital chiefs said, underscoring global concerns about China’s growing power.

The comments by Valdis Dombrovskis and Margrethe Vestager came ahead of the first TTC meeting in Pittsburgh on Wednesday and as the United States and Europe face off with China in areas ranging from trade to defence to technology and human rights.

“There is real strategic and geopolitical importance to this new platform as a way in setting standards and rules for the 21st century. So we need this Council to amplify our status,” Dombrovskis told reporters.

Dombrovskis insisted, however, that the platform was not targeted at any particular country.

“TTC is not about any specific third country, it is about cooperation and coordination on a number of policy areas between the United States and the EU,” he said.

The Council’s 10 working groups will focus on technology standards, green technology, supply-chain security, data governance, export controls, investment screening and global trade issues, among others.

All these areas are key for the EU, Vestager said.

“What we have achieved is a package that covers, I think, both offensive and defensive interests,” she told reporters.

Dombrovskis said French fury at Australia’s decision to scrap a $40 billion submarine deal for one with the United States and Britain should not deflect the EU from its long-term interests.

“We are allies, partners and friends, and yes friends can easily from time to time make mistakes and we have seen this in recent weeks, but we knew this issue should not cloud our judgment on our strategic alliances,” he said.

U.S. Secretary of State Antony Blinken, Commerce Secretary Gina Raimondo and Trade Representative Katherine Tai will be the co-chairs of the meeting with Dombrovskis and Vestager.

The EU hopes to hold a second meeting next spring in Belgium.

(Reporting by Foo Yun Chee; Editing by Catherine Evans)

Image Credit: Reuters


Source: https://datafloq.com/read/eu-says-us-trade-tech-council-boost-clout-set-rules-21st-century/18163
