Different Type of Correlation Metrics Used by Data Scientists

This article was published as a part of the Data Science Blogathon

Introduction

Before explaining the correlation and correlation metrics, I would like you to answer a simple question.

Let’s suppose you are the owner of a company that makes soft drinks. You have collected the past year's records of the product's price and sales.

Now the question is: can you infer from the data whether sales and price are related? In other words, does a lower price help increase sales, or does it not affect sales at all?

So we need some statistical tools that measure the relationship between variables.

Now you have an idea of why we need correlation.

Table of Contents

  • What is Covariance
  • What are Correlation Metrics
  • Types of Correlation Metrics
    1. Pearson Correlation
    2. Spearman’s Rank Correlation
    3. Kendall Rank Correlation
    4. Point Biserial Correlation

What is Covariance?

Covariance is a statistical tool that quantifies how two random variables vary together around their expected values (means). In simple words, it is a measure of the linear relationship between two random variables. It can take any positive or negative value.

  • Positive Covariance: It indicates that the two variables tend to move in the same direction, which means that when the value of one variable increases, the value of the other tends to increase as well.
  • Zero Covariance: It indicates that there is no linear relationship between them.
  • Negative Covariance: It indicates that the two variables tend to move in opposite directions, which means that when the value of one variable increases, the value of the other tends to decrease, and vice versa.

Covariance between two variables X and Y can be calculated using the following formula:

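Population covariance:

$$\mathrm{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

Sample covariance:

$$\mathrm{Cov}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

x_i = ith data point of x

x̅ = mean of x

y_i = ith data point of y

y̅ = mean of y

n = total number of data points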

NOTE: When calculating population covariance, we use n in the denominator; when calculating sample covariance, we use n - 1.
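As a minimal sketch of the note above, the snippet below computes both versions by hand and compares them with np.cov. The four data points are made-up illustrative values, not data from this article.

```python
import numpy as np

# Small illustrative dataset (hypothetical values, chosen for easy arithmetic).
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])

n = len(x)
# Sum of the products of deviations from the means.
cross = np.sum((x - x.mean()) * (y - y.mean()))

pop_cov = cross / n            # population covariance: divide by n
sample_cov = cross / (n - 1)   # sample covariance: divide by n - 1

# np.cov returns a 2x2 matrix; the off-diagonal entry is Cov(x, y).
# It divides by n - 1 by default; ddof=0 switches to n.
print(pop_cov, np.cov(x, y, ddof=0)[0, 1])
print(sample_cov, np.cov(x, y)[0, 1])
```

Note that np.cov defaults to the sample version, so the value it reports is slightly larger in magnitude than the population covariance for the same data.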

Limitations of Covariance

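  1. The magnitude of covariance depends on the units of the variables, so it does not by itself signify the strength of their relationship. Only the sign, positive or negative, is directly interpretable.
  2. Covariance is not scale-invariant. If we convert or rescale the measurements of X and Y to X’ and Y’ (for example, by changing units), then in general Cov(X’, Y’) ≠ Cov(X, Y), even though the underlying relationship has not changed.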
  3. Covariance does not capture the non-linear relationship between two variables.
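The scaling limitation can be illustrated with a short sketch. The data here is synthetic, and the factor of 100 (think of converting metres to centimetres) is an arbitrary assumption for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(size=100)

# Rescale both variables, e.g. a change of units.
x_scaled = 100.0 * x
y_scaled = 100.0 * y

cov_original = np.cov(x, y)[0, 1]
cov_scaled = np.cov(x_scaled, y_scaled)[0, 1]

corr_original = np.corrcoef(x, y)[0, 1]
corr_scaled = np.corrcoef(x_scaled, y_scaled)[0, 1]

print(cov_scaled / cov_original)   # covariance grows by a factor of 100 * 100
print(corr_original, corr_scaled)  # correlation is unchanged by the rescaling
```

The relationship between the variables is identical in both cases, yet the covariance differs by four orders of magnitude, which is why its magnitude cannot be read as strength.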

Now let’s calculate the covariance between two variables using the NumPy library in Python.

Importing the necessary modules

import numpy as np

Generating a random dataset which is uniformly distributed on [0, 1)

a = np.random.rand(10)

b = np.random.rand(10)


Calculating Covariance between two variables

np.cov(a,b)

Output

covariance output

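Here the covariance between a and b (the off-diagonal entry of the 2×2 matrix returned by np.cov) is -0.001, which is close to zero, so we can say there is no evident linear relationship between them.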

In order to quantify the strength of their relationship or how strongly they affect each other, we use Correlation.

What are Correlation Metrics?

Correlation also measures the relationship between two variables, and in addition its magnitude indicates the strength of that relationship. It ranges from -1 to 1 and is usually denoted by r.

  • Perfectly Positive Correlation: When correlation value is exactly 1.
  • Positive Correlation: When correlation value falls between 0 and 1.
  • No Correlation: When correlation value is 0.
  • Negative Correlation: When correlation value falls between -1 and 0.
  • Perfectly Negative Correlation: When correlation value is exactly -1.

The following figure illustrates the linear relationship graphically

linear relationships

Types of Correlation Metrics

  • Pearson Correlation
  • Spearman’s Rank Correlation
  • Kendall Rank Correlation
  • Point Biserial Correlation

Pearson Correlation

Pearson correlation, also known as the Pearson product-moment correlation coefficient, is a normalized measurement of the covariance. It measures the linear relationship between two variables and fails to capture non-linear relationships. Significance tests on the Pearson coefficient typically assume that both variables are normally distributed. It is used for continuous (interval or ratio) variables, not for nominal ones.

Pearson correlation coefficient between two variables X and Y can be calculated by the following formula:

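$$r = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$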
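To make the normalization concrete, here is a small sketch that computes the coefficient by hand as the sum of products of deviations divided by the square root of the product of the summed squared deviations, then checks it against scipy.stats.pearsonr. The five data points are made-up illustrative values.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical data, chosen so the arithmetic is easy to follow.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

dx = x - x.mean()  # deviations of x from its mean
dy = y - y.mean()  # deviations of y from its mean

# Covariance term divided by the product of the two spreads.
r_manual = np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

# pearsonr returns the coefficient and a two-sided p-value.
r_scipy, p_value = pearsonr(x, y)
print(r_manual, r_scipy)
```

Because the same factor appears in the numerator and denominator, it makes no difference whether n or n - 1 is used for the covariance and standard deviations, as long as the choice is consistent.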

Limitation of Pearson Correlation

  • It fails to capture the non-linear relationship between two variables.
  • Usually, we do not use the Pearson correlation coefficient for ordinal variables(where sequence matters).
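The first limitation can be seen in a short sketch with a synthetic example: y is completely determined by x, but the relationship is a parabola rather than a line, so the Pearson coefficient comes out at essentially zero.

```python
import numpy as np
from scipy.stats import pearsonr

x = np.linspace(-3, 3, 61)  # values symmetric around zero
y = x**2                    # exact, but non-linear, dependence on x

r, _ = pearsonr(x, y)
print(r)  # approximately 0 despite the perfect functional relationship
```

A coefficient near zero therefore rules out only a linear relationship, not a relationship in general.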

Now let us calculate the Pearson correlation coefficient between two variables using the SciPy library in Python.

Importing the necessary modules

from scipy.stats import pearsonr
import numpy as np

Generating random dataset which is normally distributed

a = np.random.normal(size=10)
b = np.random.normal(size=10)

Calculating Pearson Correlation Coefficient between two variables

pearsonr(a,b)

Output

output

Here the Pearson correlation is -0.05, which is close to zero, so we can say there is no linear relationship between them.

Spearman’s Rank Correlation

It is a nonparametric (no prior assumptions about the distribution) measure of correlation that is used for ordinal or continuous variables. Spearman’s rank correlation captures any monotonic relationship, whether linear or non-linear.

Spearman’s rank correlation coefficient between two variables X and Y can be calculated using the following formula:

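$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

d_i = difference between the ranks of x_i and y_i

n = total number of data points

This form applies when there are no tied ranks.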
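The sketch below applies the common no-ties rank-difference form, 1 - 6 Σd² / (n(n² - 1)), by hand and compares it with scipy.stats.spearmanr. The data is a made-up monotonic but non-linear example (y = exp(x)), chosen to contrast the result with the Pearson coefficient.

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.exp(x)  # monotonic but strongly non-linear

# Rank each variable, then apply the no-ties formula to the rank differences.
d = rankdata(x) - rankdata(y)
n = len(x)
rho_manual = 1 - 6 * np.sum(d**2) / (n * (n**2 - 1))

rho_scipy, _ = spearmanr(x, y)
r_pearson, _ = pearsonr(x, y)

print(rho_manual, rho_scipy)  # both equal 1 (up to rounding): the ranks agree exactly
print(r_pearson)              # below 1: the relationship is not a straight line
```

Because only the ranks enter the calculation, any strictly increasing transformation of either variable leaves the Spearman coefficient unchanged.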

Now let us calculate Spearman’s rank correlation coefficient between two variables using the SciPy library in Python.

Importing the necessary modules

import numpy as np
from scipy.stats import spearmanr

Generating a random dataset which is uniformly distributed on [0, 1)

a = np.random.rand(10)
b = np.random.rand(10)

Calculating Spearman’s Rank Correlation Coefficient between two variables

spearmanr(a,b)

Output

Here Spearman’s correlation is 0.15, which indicates a weak positive correlation between them.

Kendall Rank Correlation

Kendall rank correlation, sometimes called the Kendall tau coefficient, is a nonparametric measure of the rank correlation of ordinal variables. It can also capture monotonic relationships between two variables, whether linear or non-linear. There are three different flavours of Kendall tau, namely tau-a, tau-b, and tau-c.

Generalized Kendall rank correlation coefficient between two variables X and Y can be calculated using the following formula:

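$$\tau = \frac{n_c - n_d}{n(n - 1)/2}$$

n_c = number of concordant pairs

n_d = number of discordant pairs

n = total number of data points

Concordant Pair: A pair of observations is concordant if the observation that ranks higher on one variable also ranks higher on the other variable.

Discordant Pair: A pair of observations is discordant if the observation that ranks higher on one variable ranks lower on the other variable.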

concordant-discordant pair | Correlation metrics
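To illustrate the two definitions, the following sketch counts concordant and discordant pairs by brute force and forms the tau-a ratio (concordant minus discordant, divided by the total number of pairs). The data is made up for illustration. scipy.stats.kendalltau computes tau-b by default, which coincides with tau-a when there are no ties.

```python
from itertools import combinations

import numpy as np
from scipy.stats import kendalltau

# Hypothetical rankings without ties.
x = np.array([1, 2, 3, 4, 5])
y = np.array([3, 1, 2, 5, 4])

concordant = 0
discordant = 0
for i, j in combinations(range(len(x)), 2):
    # Same sign of the two differences: concordant; opposite sign: discordant.
    s = np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    if s > 0:
        concordant += 1
    elif s < 0:
        discordant += 1

n = len(x)
tau_a = (concordant - discordant) / (n * (n - 1) / 2)

tau_scipy, _ = kendalltau(x, y)  # tau-b by default; equals tau-a without ties
print(concordant, discordant, tau_a, tau_scipy)
```

The brute-force loop visits every pair, so it is only suitable for small samples. The library routine is the practical choice for real data.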

Now let us calculate the Kendall tau correlation coefficient between two variables using the SciPy library in Python.

Importing the necessary modules

import numpy as np
from scipy.stats import kendalltau

Generating a random dataset which is uniformly distributed on [0, 1)

a = np.random.rand(10)
b = np.random.rand(10)

Calculating Kendall Tau Correlation Coefficient between two variables

kendalltau(a,b)

Output

output

Here the Kendall correlation is -0.19, which indicates a weak negative correlation between them.

Point Biserial Correlation

Point Biserial Correlation is used when one variable is dichotomous (binary) and the other variable is continuous. It is mathematically equivalent to the Pearson correlation with the dichotomous variable coded as 0 and 1, so like Pearson it measures only the linear relationship between the two variables. It is denoted by rpb.

Dichotomous Variable: If a variable can take only two values, like head or tail, or male or female, then such a variable is called a dichotomous variable.

Point Biserial correlation coefficient between two variables X and Y can be calculated using the following formula:

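$$r_{pb} = \frac{M_1 - M_0}{s_n} \sqrt{\frac{n_1 n_0}{n^2}}$$

M_1 = mean of the continuous variable for the group coded 1

M_0 = mean of the continuous variable for the group coded 0

n_1, n_0 = number of data points in each group

n = total number of data points

s_n = standard deviation of the continuous variable, computed with n in the denominator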
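As a hedged sketch, the snippet below uses one common textbook form of the coefficient, (M1 - M0) / s_n × sqrt(n1 n0 / n²), where M1 and M0 are the group means of the continuous variable, n1 and n0 the group sizes, and s_n the standard deviation computed with n in the denominator. It then compares the result with scipy.stats.pointbiserialr and with a plain Pearson correlation on the 0/1 coding. The variable names and data are made up for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, pointbiserialr

# Hypothetical data: a binary group label and a continuous score.
group = np.array([0, 0, 0, 1, 1, 1, 1, 0])
score = np.array([2.0, 3.0, 1.0, 6.0, 5.0, 7.0, 4.0, 4.0])

m1 = score[group == 1].mean()  # mean score in the group coded 1
m0 = score[group == 0].mean()  # mean score in the group coded 0
n1 = np.sum(group == 1)
n0 = np.sum(group == 0)
n = len(score)
s_n = score.std()              # population standard deviation (divides by n)

r_manual = (m1 - m0) / s_n * np.sqrt(n1 * n0 / n**2)

r_pb, _ = pointbiserialr(group, score)
r_pearson, _ = pearsonr(group, score)
print(r_manual, r_pb, r_pearson)  # all three values agree
```

The agreement with pearsonr reflects the fact that the point biserial coefficient is a special case of the Pearson coefficient for a 0/1-coded variable.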

Now let us calculate the Point Biserial correlation coefficient between two variables using the SciPy library in Python.

Importing the necessary modules

import numpy as np
from scipy.stats import pointbiserialr

Generating a random dataset in which a is a dichotomous (0 or 1) variable and b is a continuous variable

a = np.random.randint(0, 2, size=10)
b = np.random.rand(10)

Calculating Point Biserial Correlation Coefficient between two variables

pointbiserialr(a,b)

Output

output

Here the Point Biserial correlation is 0.305, which indicates a moderate positive correlation between them.

End Notes

I hope you enjoyed reading the article. If you found it useful, please share it among your friends and on social media. For any queries, suggestions, or any other discussion, please ping me here in the comments or contact me via Email or LinkedIn.

Contact me on LinkedIn – www.linkedin.com/in/ashray-saini-2313b2162

Contact me on Email – [email protected]

The media shown in this article on correlation metrics are not owned by Analytics Vidhya and are used at the Author’s discretion.


Source: https://www.analyticsvidhya.com/blog/2021/09/different-type-of-correlation-metrics-used-by-data-scientist/
