15 nov 2018

In statistics, the Pearson correlation coefficient, also referred to as Pearson's r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables X and Y. It has a value between +1 and −1, where 1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation. It is widely used in the sciences. It was developed by Karl Pearson from a related idea introduced by Francis Galton in the 1880s.

Source: Wikipedia

\[r_{xy} = \frac{{COV}_{xy}}{S_xS_y}\]

where \(S\) is the standard deviation and \(COV\) is the covariance.

\[{COV}_{xy} = \frac{\sum_{i=1}^N (x_i - \bar{x})(y_i - \bar{y})}{N-1}\]
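As a minimal sketch of the two formulas above, the following R snippet computes the covariance and Pearson's r by hand and checks them against R's built-in `cov()` and `cor()`. The vectors `x` and `y` are small made-up values chosen for easy arithmetic; they are an assumption for illustration and do not come from the data used elsewhere in these notes.

```r
# Hypothetical example data (not from the slides)
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)

N <- length(x)

# Covariance: sum of the cross-products of deviations, divided by N - 1
cov.xy <- sum((x - mean(x)) * (y - mean(y))) / (N - 1)

# Pearson's r: covariance divided by the product of the standard deviations
r.xy <- cov.xy / (sd(x) * sd(y))

# Compare the hand calculation with the built-in functions
all.equal(cov.xy, cov(x, y))
all.equal(r.xy, cor(x, y))
```

For these values the cross-products sum to 16, so the covariance is 16 / 4 = 4, and dividing by the product of the standard deviations (5) gives r = 0.8. Both `sd()` and `cov()` divide by N - 1, which is why the hand calculation matches.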

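# Simulate two unrelated variables and plot them
set.seed(565433)
x = rnorm(10, 5)
y = rnorm(10, 5)
plot(x, y, las = 1)
m.x = mean(x)
m.y = mean(y)
# Green quadrants: both deviations have the same sign (positive product)
polygon(c(m.x,8,8,m.x),c(m.y,m.y,8,8), col = rgb(0,1,0,.5))
polygon(c(m.x,0,0,m.x),c(m.y,m.y,0,0), col = rgb(0,1,0,.5))
# Red quadrants: deviations have opposite signs (negative product)
polygon(c(m.x,0,0,m.x),c(m.y,m.y,8,8), col = rgb(1,0,0,.5))
polygon(c(m.x,8,8,m.x),c(m.y,m.y,0,0), col = rgb(1,0,0,.5))
points(x,y)
# Mean lines for y (horizontal) and x (vertical)
abline(h = m.y, lwd = 3)
abline(v = m.x, lwd = 3)
# Deviations from the mean of y (orange) and from the mean of x (dark green)
segments(x, m.y, x, y, col = "orange", lwd = 2)
segments(x, y, m.x, y, col = "darkgreen", lwd = 2)
# Sign of the x deviation times the sign of the y deviation in each quadrant
text(m.x+.7, m.y+.7, "+ x +", cex = 2)
text(m.x-.7, m.y-.7, "- x -", cex = 2)
text(m.x+.7, m.y-.7, "+ x -", cex = 2)
text(m.x-.7, m.y+.7, "- x +", cex = 2)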

\[(x_i - \bar{x})(y_i - \bar{y})\]
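To make the sign of this product concrete, here is a small R sketch that computes the deviations and their cross-product for each point. The data are hypothetical values picked so that the means are whole numbers; the names `dev.x`, `dev.y` and `cross` are introduced here for illustration only.

```r
# Hypothetical example data (not from the slides); both means equal 4
x <- c(1, 2, 4, 5, 8)
y <- c(3, 1, 6, 2, 8)

# Deviations from the means
dev.x <- x - mean(x)
dev.y <- y - mean(y)

# Cross-product of the deviations for each point
cross <- dev.x * dev.y

# One row per point: the product is positive when both deviations share a sign
data.frame(x, y, dev.x, dev.y, cross)

# Count points with a positive and with a negative cross-product
n.positive <- sum(cross > 0)
n.negative <- sum(cross < 0)

# Summing the cross-products and dividing by N - 1 gives the covariance
sum(cross) / (length(x) - 1)
```

Points above both means or below both means contribute positive products, while points above one mean and below the other contribute negative ones. A point lying exactly on a mean line contributes zero. Here three products are positive, one is negative and one is zero, so the sum (23) and therefore the covariance (5.75) are positive.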

n = 50
grade = rnorm(n, 6, 1.6)
b.0 = 100
b.1 = .3
error = rnorm(n, 0, 0.7)
IQ = b.0 + b.1 * grade + error
#IQ = group(IQ)
error = rnorm(n, 0, 0.7)
motivation = 3.2 + .2 * IQ + error

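# data is assumed to be a data frame with columns grade and IQ
grade = data$grade
IQ = data$IQ
mean.grade = mean(grade)
mean.IQ = mean(IQ)
N = length(grade)
# Plot both variables against their index on a shared y-axis
plot(data$grade, ylim=summary(c(data$grade, data$IQ))[c('Min.','Max.')], col='orange')
points(data$IQ, col='blue')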