Correlation Coefficient (r)

A correlation coefficient is a statistical measure of the degree to which changes to the value of one variable predict change to the value of another.

Correlation coefficients are expressed as values between -1 and +1, inclusive.

[Figure: scatter plots showing positive, negative, and no correlation]

Positively correlated variables – the values increase or decrease in tandem. A coefficient of +1 indicates a perfect positive correlation.

The cloud of points around the SD line slopes up.

  • The more tightly clustered the points are along a line, the stronger the relationship between the variables, and the closer r is to 1.0.
  • When the correlation is near 1.0, knowing a point’s X value allows you to predict its Y value with very little error.
  • But that doesn’t mean that the Y value is the same or nearly the same as the X value, since the Y variable may be expressed in completely different units.
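The point about units can be checked with a small sketch. The temperature readings below are illustrative values chosen for this example, not data from the post, and `pearson_r` is a hypothetical helper name. It assumes the population SD (dividing by n) when converting to standard units.

```python
from statistics import mean, pstdev

def pearson_r(xs, ys):
    """Average of the products of x and y, each in standard units."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)  # population SD: an assumption of this sketch
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return mean(products)

# Illustrative values: the same temperatures in two different units.
celsius = [0, 10, 20, 30, 40]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

r = pearson_r(celsius, fahrenheit)
print(round(r, 6))  # 1.0, although no Fahrenheit value equals its Celsius value
```

Knowing X here determines Y exactly, yet the two columns of numbers never match, which is why r says nothing about whether X and Y are numerically close.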

Examples:

  • Hours spent studying and grade point averages.
  • Education and income levels.
  • Poverty and crime levels.

Negatively correlated variables – the value of one increases as the value of the other decreases. A coefficient of -1 indicates a perfect negative correlation.

The cloud of points around the SD line slopes down, as seen in the figure above.

Examples:

  • Commodity supply and demand.
  • Pages printed and printer ink supply.
  • Education and religiosity.

No correlation: A coefficient of zero indicates there is no discernible linear relationship between fluctuations of the variables.


Computing r: How-to

r = average( (x in standard units) * (y in standard units) )

See post on Standard Units for more info.
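As a rough sketch of the formula above: convert each variable to standard units, multiply the pairs, and average the products. The function names and the sample numbers are made up for illustration. The sketch assumes the SD is the population SD (dividing by n), which is what makes the average of products come out between -1 and +1.

```python
from statistics import mean, pstdev

def standard_units(values):
    """Convert values to standard units: (value - average) / SD."""
    avg = mean(values)
    sd = pstdev(values)  # population SD (divide by n): an assumption of this sketch
    return [(v - avg) / sd for v in values]

def correlation(xs, ys):
    """r = average of (x in standard units * y in standard units)."""
    zx = standard_units(xs)
    zy = standard_units(ys)
    return mean([a * b for a, b in zip(zx, zy)])

# Made-up illustrative data, not taken from the post.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print(round(r, 2))  # 0.77
```

Using the sample SD (dividing by n - 1) in `standard_units` would require dividing the sum of products by n - 1 as well, otherwise the result would be slightly too small.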


The connection between r and the typical distance above or below the SD line is given by:

 √(2(1 − |r|)) × vertical SD

Example: if r = 0.95, then

√(2(1 − 0.95)) = √0.1 ≈ 0.32, or roughly 0.3.

So the spread around the SD line is about 30% of a vertical SD, when r = 0.95.
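The arithmetic in this example can be reproduced with a few lines of code. This is only a sketch of the formula as stated above, and `sd_line_spread` is a hypothetical name. The vertical SD defaults to 1, so the result reads as a fraction of one vertical SD.

```python
import math

def sd_line_spread(r, vertical_sd=1.0):
    """Typical vertical distance from the SD line: sqrt(2 * (1 - |r|)) * vertical SD."""
    return math.sqrt(2 * (1 - abs(r))) * vertical_sd

spread = sd_line_spread(0.95)
print(round(spread, 2))  # 0.32, i.e. about 30% of a vertical SD

# The spread shrinks to 0 as |r| approaches 1 and depends only on |r|, not its sign.
print(sd_line_spread(1.0))  # 0.0
```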