Math Details

This page gives the exact formulas Quantum XL uses in the Correlation and Covariance analysis. For each formula, it notes what is computed and where the result appears in the output.

Notation

| Term | Description |
| --- | --- |
| \(x_i, y_i\) | the \(i\)-th paired values of the two variables |
| \(n\) | number of paired observations |
| \(\bar{x}, \bar{y}\) | arithmetic means of \(x\) and \(y\) |
| \(R(x_i)\) | the rank of \(x_i\) among the \(x\) values (ascending) |

Pearson correlation coefficient

\[ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}} \]

Used by: each off-diagonal cell of the Pearson correlation matrix (each diagonal cell is \(1\)). The coefficient is returned as undefined if either variable has zero variance, because the denominator is then zero.
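The formula above can be sketched in a few lines of Python. This is an illustrative sketch, not Quantum XL's code: the function name `pearson_r` is hypothetical, and returning `None` for the undefined case is an assumption about how "undefined" might be represented.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (illustrative sketch)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator terms: sums of squared deviations for each variable.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # assumption: None stands in for "undefined" (zero variance)
    return sxy / math.sqrt(sxx * syy)
```

For example, `pearson_r([1, 2, 3, 4], [2, 4, 6, 8])` gives `1.0`, while a constant column such as `[5, 5, 5]` yields `None`.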

Spearman rank correlation coefficient

Spearman's \(\rho\) is the Pearson correlation applied to the ranks of the data:

\[ \rho = \frac{\sum_{i=1}^{n} \left(R(x_i) - \overline{R_x}\right)\left(R(y_i) - \overline{R_y}\right)}{\sqrt{\sum_{i=1}^{n} \left(R(x_i) - \overline{R_x}\right)^2 \; \sum_{i=1}^{n} \left(R(y_i) - \overline{R_y}\right)^2}} \]

| Term | Description |
| --- | --- |
| \(R(x_i)\) | ascending rank of \(x_i\); tied values receive the average of the ranks they span |
| \(\overline{R_x}, \overline{R_y}\) | mean ranks |
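The rank-then-correlate procedure, including average ranks for ties, might look like the following in Python. It is a sketch under stated assumptions: the names `average_ranks`, `_pearson`, and `spearman_rho` are hypothetical, and `None` is used here to stand in for an undefined result.

```python
import math

def average_ranks(values):
    """Ascending 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # sorted positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def _pearson(x, y):
    """Pearson correlation; returns None when either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # assumption: None stands in for "undefined"
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the average ranks."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and the same length")
    return _pearson(average_ranks(x), average_ranks(y))
```

With the tied input `[10, 20, 20, 30]`, `average_ranks` returns `[1.0, 2.5, 2.5, 4.0]`: the two tied values span ranks 2 and 3 and each receives 2.5.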

Sample covariance

\[ \operatorname{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1} \]

Uses the \(n-1\) (Bessel) denominator. On the diagonal of the covariance matrix this reduces to the sample variance of each variable.
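As a rough illustration of the \(n-1\) denominator, here is a minimal Python sketch. The function name `sample_covariance` is hypothetical, and returning `None` when fewer than two pairs are supplied is an assumption, since this page does not say how that case is reported.

```python
def sample_covariance(x, y):
    """Sample covariance with the n - 1 (Bessel) denominator (illustrative sketch)."""
    if len(x) != len(y):
        raise ValueError("x and y must be the same length")
    n = len(x)
    if n < 2:
        return None  # assumption: n - 1 would be zero, so treat as undefined
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, divided by n - 1.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / (n - 1)
```

Calling `sample_covariance(x, x)` gives the sample variance of `x`, which is the value that appears on the diagonal of the covariance matrix.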

Significance test (p-value)

For both Pearson and Spearman coefficients, the two-tailed p-value comes from the \(t\) statistic (for Spearman, substitute \(\rho\) for \(r\)):

\[ t = r\sqrt{\frac{n - 2}{1 - r^2}}, \qquad \text{p-value} = 2\left[\,1 - T_{\,n-2}\!\left(\lvert t \rvert\right)\right] \]

where \(T_{n-2}\) is the Student's \(t\) CDF with \(n-2\) degrees of freedom. Edge cases: p-value \(= 0\) when \(\lvert r \rvert \ge 1\), p-value \(= 1\) when \(r = 0\), and undefined when \(n \le 2\). (Covariance is not accompanied by a p-value.)
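The p-value calculation and its edge cases can be sketched in Python using only the standard library. This is not Quantum XL's implementation: the name `correlation_p_value` is hypothetical, `None` stands in for "undefined", and the Student's \(t\) CDF is evaluated here by Simpson's rule after the substitution \(u = \sqrt{\nu}\tan\theta\) (with \(\nu = n - 2\)), which is a choice made for this sketch only.

```python
import math

def correlation_p_value(r, n, steps=2000):
    """Two-tailed p-value for a correlation r from n pairs (illustrative sketch)."""
    if n <= 2:
        return None  # assumption: None stands in for "undefined"
    if abs(r) >= 1.0:
        return 0.0
    if r == 0.0:
        return 1.0
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    # With u = sqrt(df) * tan(theta), the integral of the t density over [0, |t|]
    # becomes const * integral of cos(theta)**(df - 1) over [0, theta_max].
    theta_max = math.atan(abs(t) / math.sqrt(df))
    h = theta_max / steps  # steps must be even for Simpson's rule
    total = 1.0 + math.cos(theta_max) ** (df - 1)  # endpoint terms
    for k in range(1, steps):
        weight = 4 if k % 2 else 2
        total += weight * math.cos(k * h) ** (df - 1)
    integral = total * h / 3.0
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    # 2 * [1 - T(|t|)] equals 1 - 2 * P(0 <= T <= |t|) by symmetry of the density.
    return max(0.0, 1.0 - 2.0 * const * integral)
```

As a check against a closed form: with \(n = 3\) the reference distribution has one degree of freedom (the Cauchy distribution), so \(r = 1/\sqrt{2}\) gives \(t = 1\) and a two-tailed p-value of \(0.5\).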
