CHAPTER ONE
BASIC PARAMETRIC CORRELATION
A single correlation coefficient represents a correlation matrix with a dimensionality of one squared. Before we consider more complex correlation matrices, it is fitting to begin with an understanding of a correlation matrix of size one. Such a correlation matrix has 2 dimensions and 0 (X - 2) degrees of freedom.
Any two sets of data, as long as they are evenly aligned within a data table, may be correlated with one another regardless of differences in the magnitude or scale of the data. In short, practically anything can be correlated with anything else, as long as both can be numerically represented. The advantage of the correlation coefficient is that it allows us to readily compare differently scaled sets of data; the disadvantage is that correlations may appear to measure relationships which do not in fact exist. Thus, strong correlations may exist between completely unrelated sets of data, though this is probably unusual, and weak correlations may exist between two sets of data whose relationship is strongly determined.
Basic correlation may be said to measure the degree of similarity between two sets of data, or rather the relative degree to which the two parallel and aligned sets of data points "move together" when plotted on a scattergram. A correlation ranges between the values of -1 and +1, with 0 indicating a complete lack of correlation. Negative values indicate an inverse relationship between the data sets, such that when the values of one set tend to be high, the values of the other set tend to be low, and vice versa. A positive correlation indicates that when one set of values is high, the corresponding values of the other set also tend to be relatively high. The higher the absolute value of the correlation, the more "similar" the two sets of data, and the more the data points move together in what might be described as a linear relationship.
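As a rough illustration of the range described above, the following minimal sketch computes the coefficient for two made-up sets of points. The function name `pearson_r` and the data are assumptions introduced for this example only. One set of y values is one tenth the size of x, the other is its mirror image, and the results should lie at the two ends of the range regardless of the difference in scale.

```python
import math


def pearson_r(xs, ys):
    """Pearson product moment correlation of two aligned, equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y is x scaled by 0.1 (positive) or by -0.1 (negative).
x = [float(i) for i in range(-10, 11)]
positive = [0.1 * v for v in x]
negative = [-0.1 * v for v in x]

r_pos = pearson_r(x, positive)
r_neg = pearson_r(x, negative)
print(r_pos, r_neg)
```

Because each deviation is divided, in effect, by the spread of its own variable, the factor of 0.1 cancels out of the result.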

The Pearson product moment correlation coefficient is a general estimator of the strength of a linear relationship between two sets of data. It is a "dimensionless" index which reflects the extent of a linear relationship between two sets of points.
The graph above demonstrates the linear relationships that are involved when there is a perfect positive correlation, a perfect negative correlation, a weak negative correlation, and no correlation at all. The perfect positive correlation can be seen to be a diagonal line of points passing through the origin. The correlation is perfect despite the fact that the magnitudes of the y values differ from those of the x values by a factor of 0.1. The perfect negative correlation is the same, but with the line sloping in the opposite direction. The weak negative correlation shows a dispersion about the general line of the perfect negative correlation, with a few points drifting off in the opposite direction in the center. The set of points showing a lack of any linear relation forms no line at all.

The graph above shows the lines drawn through the respective points. The correlation matrix below shows the correlations derived from this distribution and graph.
| | X values | Positive | Negative | None | Weak Negative |
|---|---|---|---|---|---|
| X values | 1 | | | | |
| Positive | 1 | 1 | | | |
| Negative | -1 | -1 | 1 | | |
| None | 0.074 | 0.074 | -0.074 | 1 | |
| Weak Negative | -0.8 | -0.8 | 0.8028 | 0.033 | 1 |
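A matrix of this kind can be built by correlating every named column of a data table with every other. The sketch below is only an illustration under stated assumptions: the helper names `pearson_r` and `correlation_matrix` are hypothetical, and the "Scattered" column is made-up data, not the data behind the table above.

```python
import math


def pearson_r(xs, ys):
    """Pearson product moment correlation of two aligned lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict {row: {column: r}} over all pairs of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical data table; every column must have the same number of rows.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
data = {
    "X values": x,
    "Positive": [0.1 * v for v in x],
    "Negative": [-0.1 * v for v in x],
    "Scattered": [2.0, 5.0, 1.0, 6.0, 3.0, 4.0],  # made-up values
}

matrix = correlation_matrix(data)
for row in data:
    print(row, [round(matrix[row][col], 3) for col in data])
```

The full matrix is symmetric with ones on the diagonal, which is why the table above needs only its lower triangle.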
The Pearson coefficient is closely related to least squares linear regression analysis (model 1 regression); in fact, it is equal to the slope of the regression line of the sample multiplied by the ratio of the sample standard deviations of X and Y. The linear regression equations for the data in these graphs were computed using the following formula:

$$\hat{y} = a + bx$$

where

$$b = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

and where

$$a = \bar{y} - b\bar{x}$$
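The usual least squares estimates of a regression line can be sketched in a few lines. This is a minimal illustration that assumes the standard textbook estimates of slope and intercept; the function name `fit_line` and the data are invented for the example.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for the line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-products over the sum of squares of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: x is fixed by the experimenter, y is measured.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
print(a, b)
```

Unlike the correlation coefficient, the slope carries the units of y per unit of x, so rescaling either variable changes b.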
Least squares regression rests on different assumptions than the Pearson product moment correlation coefficient: the predictor variable X is measured without error, the samples are homoscedastic (having the same variance), and there must be some linear relationship between X and Y. While regression coefficients measure the form of the linear relationship between two sets of points, correlation coefficients measure the degree of dispersion about the regression axis. The population correlation coefficient always presumes a bivariate normal distribution of X and Y, which is never the case in regression analysis, where the values of X are predetermined.

The difference between the correlation coefficient and linear regression is that in linear regression the x-coordinates are known and fixed, while in correlation both the x and y coordinate values are unknown and variable. Whenever the predictor or independent variable (X) is known or fixed by the experimenter, linear regression analysis can be used to describe the relationship between the abscissa and the ordinate. In this case the predictor variable has no real "distribution" and no assumptions about the variance of X need be made: it is selected rather than sampled. While we may predictively plot points of x and y according to the regression equation, we cannot do the same on the basis of correlation coefficients alone; in fact, we cannot predict either x or y. If we know one coordinate we may be able to determine the other, but we do not have this information to begin with when using correlation.
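The population coefficient ($\rho$) is defined as the square root of the Coefficient of Determination ($\rho^2$). The Coefficient of Determination is the same as the coefficient of Dispersion ($r^2$) in linear regression, defined as:

$$r^2 = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$$

where $\hat{y}_i$ is the value of y predicted by the regression line and $\bar{y}$ is the mean of y.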
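The coefficient of Dispersion is interpreted as the proportion of the variance in y that is attributable to the variance in x. The coefficient of Determination is therefore simply the complement of the Coefficient of Nondetermination, which represents the proportion of variability in the Y variable that remains unaccounted for by the relationship between X and Y:

$$\rho^2 = 1 - \frac{\sigma^2_{Y \cdot X}}{\sigma^2_Y}$$

where $\sigma^2_{Y \cdot X}$ is the variance of Y left unexplained by X and $\sigma^2_Y$ is the total variance of Y. In other words, the population correlation coefficient (or "rho") is expressed as:

$$\rho = \sqrt{1 - \frac{\sigma^2_{Y \cdot X}}{\sigma^2_Y}}$$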
The relation of the population correlation coefficient ρ to the sample correlation coefficient (r) is that r is used to estimate ρ. The sum of the products of the standardized deviations of the points is an indicator of the strength of a linear relationship. Dividing this sum by n expresses the average deviation of the points about the means of X and Y, while dividing each individual deviation by the sample standard deviation eliminates the units of analysis, and hence the magnitude, of the scale of measurement:

$$r = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{S_x}\right)\left(\frac{y_i - \bar{y}}{S_y}\right)$$

or, substituting for the sample standard deviation (S) its formula (computed with the same divisor n):

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \, \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

This sample correlation coefficient (r) is an alternative algebraic form for estimating ρ.
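That the two algebraic forms of r agree can be checked numerically. The sketch below assumes the standard deviation is computed with divisor n, matching the averaging over n; the function names and data are hypothetical.

```python
import math


def r_standardized(xs, ys):
    """Average product of standardized deviations (divisor n throughout)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
               for x, y in zip(xs, ys)) / n


def r_expanded(xs, ys):
    """Same coefficient with the standard deviations written out in full."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up paired observations.
xs = [2.0, 4.0, 5.0, 7.0, 9.0]
ys = [1.0, 3.0, 2.0, 6.0, 8.0]

print(r_standardized(xs, ys), r_expanded(xs, ys))
```

The divisor cancels in the expanded form, so the same value results whether n or n - 1 is used, provided it is used consistently.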
The sample correlation coefficient is related to the sample regression constants: it is the slope constant b multiplied by the ratio of the sample standard deviations of X and Y, which converts the slope into a dimensionless statement of correlation:

$$r = b \, \frac{S_x}{S_y}$$
Because the values of X are distributed and undetermined in the sample, one cannot infer ρ directly from r, though the sample coefficient of determination ($r^2$) is the square of the Pearson product moment correlation coefficient.
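Both relationships, between r and the regression slope and between r squared and the unexplained variation, can be verified on a small sample. The following is a sketch on invented data, with variable names chosen for this example only.

```python
import math

# Made-up paired observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.5, 3.1, 2.8, 5.2, 5.9, 6.4]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Regression constants and the correlation coefficient.
b = sxy / sxx
a = mean_y - b * mean_x
r = sxy / math.sqrt(sxx * syy)

# r recovered from the slope and the ratio of sample standard deviations.
s_x = math.sqrt(sxx / (n - 1))
s_y = math.sqrt(syy / (n - 1))
r_from_slope = b * s_x / s_y

# Coefficient of determination as the complement of the unexplained share.
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1.0 - ss_residual / syy

print(r, r_from_slope, r_squared)
```

Note that squaring discards the sign of r, so the direction of the relationship must be read from the slope.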
The first issues to be addressed in understanding and analyzing correlational matrices are the questions of significance, both theoretical and statistical. It is in attempting to answer these questions systematically that we find the real strength and value of this form of analysis in hypothetically uniting qualitative and quantitative dimensions in the representation of complex realities.
The first question of significance of the correlation coefficient has to do with the significance of the individual scores, as determined by a t-test which fits the correlation, given the degrees of freedom, to the t distribution:

$$t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

in which ρ is assumed to be equal to 0 and the degrees of freedom (df) are equal to (n - 2), or the number of dimensions of the matrix (X) minus 2. Either a one-tailed or a two-tailed test can be used; one-tailed tests do not ignore the sign of the correlation coefficient, and so are used when the direction of the relationship is at issue. A table of values assuming a two-tailed test is given below:
| df | 0.5 | 0.2 | 0.1 | 0.05 | 0.01 | 0.001 | df |
|---|---|---|---|---|---|---|---|
| 1 | 1.000 | 3.078 | 6.314 | 12.706 | 63.657 | 636.619 | 1 |
| 2 | 0.816 | 1.886 | 2.920 | 4.303 | 9.925 | 31.598 | 2 |
| 3 | 0.765 | 1.638 | 2.353 | 3.182 | 5.841 | 12.924 | 3 |
| 4 | 0.741 | 1.533 | 2.132 | 2.776 | 4.604 | 8.610 | 4 |
| 5 | 0.727 | 1.476 | 2.015 | 2.571 | 4.032 | 6.869 | 5 |
| 6 | 0.718 | 1.440 | 1.943 | 2.447 | 3.707 | 5.959 | 6 |
| 7 | 0.711 | 1.415 | 1.895 | 2.365 | 3.499 | 5.408 | 7 |
| 8 | 0.706 | 1.397 | 1.860 | 2.306 | 3.355 | 5.041 | 8 |
| 9 | 0.703 | 1.383 | 1.833 | 2.262 | 3.250 | 4.781 | 9 |
| 10 | 0.700 | 1.372 | 1.812 | 2.228 | 3.169 | 4.587 | 10 |
| 11 | 0.697 | 1.363 | 1.796 | 2.201 | 3.106 | 4.437 | 11 |
| 12 | 0.695 | 1.356 | 1.782 | 2.179 | 3.055 | 4.318 | 12 |
| 13 | 0.694 | 1.350 | 1.771 | 2.160 | 3.012 | 4.221 | 13 |
| 14 | 0.692 | 1.345 | 1.761 | 2.145 | 2.977 | 4.140 | 14 |
| 15 | 0.691 | 1.341 | 1.753 | 2.131 | 2.947 | 4.073 | 15 |
| 16 | 0.690 | 1.337 | 1.746 | 2.120 | 2.921 | 4.015 | 16 |
| 17 | 0.689 | 1.333 | 1.740 | 2.110 | 2.898 | 3.965 | 17 |
| 18 | 0.688 | 1.330 | 1.734 | 2.101 | 2.878 | 3.922 | 18 |
| 19 | 0.688 | 1.328 | 1.729 | 2.093 | 2.861 | 3.883 | 19 |
| 20 | 0.687 | 1.325 | 1.725 | 2.086 | 2.845 | 3.850 | 20 |
| 21 | 0.686 | 1.323 | 1.721 | 2.080 | 2.831 | 3.819 | 21 |
| 22 | 0.686 | 1.321 | 1.717 | 2.074 | 2.819 | 3.792 | 22 |
| 23 | 0.685 | 1.319 | 1.714 | 2.069 | 2.807 | 3.767 | 23 |
| 24 | 0.685 | 1.318 | 1.711 | 2.064 | 2.797 | 3.745 | 24 |
| 25 | 0.684 | 1.316 | 1.708 | 2.060 | 2.787 | 3.725 | 25 |
| 30 | 0.683 | 1.310 | 1.697 | 2.042 | 2.750 | 3.646 | 30 |
| 40 | 0.681 | 1.303 | 1.684 | 2.021 | 2.704 | 3.551 | 40 |
| 60 | 0.679 | 1.296 | 1.671 | 2.000 | 2.660 | 3.460 | 60 |
| 120 | 0.677 | 1.289 | 1.658 | 1.980 | 2.617 | 3.373 | 120 |
| inf. | 0.674 | 1.282 | 1.645 | 1.960 | 2.576 | 3.291 | inf. |
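The test against a population value of zero might be sketched as follows. The function name `t_for_r` and the sample values (r = 0.8 from n = 12 pairs) are hypothetical; the critical value is read from the row for 10 degrees of freedom and the 0.05 column of the table above.

```python
import math


def t_for_r(r, n):
    """t statistic for the hypothesis that the population correlation is 0.

    r is the sample correlation and n the number of paired observations;
    the statistic has n - 2 degrees of freedom.
    """
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical sample: r = 0.8 from n = 12 pairs, so df = 10.
t = t_for_r(0.8, 12)

# Two-tailed critical value at the 0.05 level for df = 10, from the table.
critical = 2.228
significant = abs(t) > critical

print(round(t, 3), significant)
```

Taking the absolute value of t corresponds to the two-tailed test; a one-tailed test would compare t itself against the critical value for the chosen direction.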
If one wishes to test whether the population correlation coefficient ρ is equal to some value other than zero, then the correlation coefficient r must be converted to Z using Fisher's conversion formula:

$$Z = \frac{1}{2}\ln\left(\frac{1 + r}{1 - r}\right)$$
Determining the confidence limits with Fisher's conversion for values of ρ other than 0 is more useful with natural data sets than the standard t-test assuming ρ = 0. Hypotheses which lie outside the confidence interval will be rejected, but rejecting the correlation coefficient on the basis of the hypothesis that ρ is equal to 0 may be misleading about the actual significance of the correlation coefficient. The conversion of standard correlation coefficients is given by the following table:
| r | Z | r | Z | r | Z | r | Z |
|---|---|---|---|---|---|---|---|
| 0.00 | 0.000 | 0.25 | 0.255 | 0.50 | 0.549 | 0.75 | 0.973 |
| 0.01 | 0.010 | 0.26 | 0.266 | 0.51 | 0.563 | 0.76 | 0.996 |
| 0.02 | 0.020 | 0.27 | 0.277 | 0.52 | 0.576 | 0.77 | 1.020 |
| 0.03 | 0.030 | 0.28 | 0.288 | 0.53 | 0.590 | 0.78 | 1.045 |
| 0.04 | 0.040 | 0.29 | 0.299 | 0.54 | 0.604 | 0.79 | 1.071 |
| 0.05 | 0.050 | 0.30 | 0.310 | 0.55 | 0.618 | 0.80 | 1.099 |
| 0.06 | 0.060 | 0.31 | 0.321 | 0.56 | 0.633 | 0.81 | 1.127 |
| 0.07 | 0.070 | 0.32 | 0.332 | 0.57 | 0.648 | 0.82 | 1.157 |
| 0.08 | 0.080 | 0.33 | 0.343 | 0.58 | 0.662 | 0.83 | 1.188 |
| 0.09 | 0.090 | 0.34 | 0.354 | 0.59 | 0.678 | 0.84 | 1.221 |
| 0.10 | 0.100 | 0.35 | 0.365 | 0.60 | 0.693 | 0.85 | 1.256 |
| 0.11 | 0.110 | 0.36 | 0.377 | 0.61 | 0.709 | 0.86 | 1.293 |
| 0.12 | 0.121 | 0.37 | 0.388 | 0.62 | 0.725 | 0.87 | 1.333 |
| 0.13 | 0.131 | 0.38 | 0.400 | 0.63 | 0.741 | 0.88 | 1.376 |
| 0.14 | 0.141 | 0.39 | 0.412 | 0.64 | 0.758 | 0.89 | 1.422 |
| 0.15 | 0.151 | 0.40 | 0.424 | 0.65 | 0.775 | 0.90 | 1.472 |
| 0.16 | 0.161 | 0.41 | 0.436 | 0.66 | 0.793 | 0.91 | 1.528 |
| 0.17 | 0.172 | 0.42 | 0.448 | 0.67 | 0.811 | 0.92 | 1.589 |
| 0.18 | 0.182 | 0.43 | 0.460 | 0.68 | 0.829 | 0.93 | 1.658 |
| 0.19 | 0.192 | 0.44 | 0.472 | 0.69 | 0.848 | 0.94 | 1.738 |
| 0.20 | 0.203 | 0.45 | 0.485 | 0.70 | 0.867 | 0.95 | 1.832 |
| 0.21 | 0.213 | 0.46 | 0.497 | 0.71 | 0.887 | 0.96 | 1.946 |
| 0.22 | 0.224 | 0.47 | 0.510 | 0.72 | 0.908 | 0.97 | 2.092 |
| 0.23 | 0.234 | 0.48 | 0.523 | 0.73 | 0.929 | 0.98 | 2.298 |
| 0.24 | 0.245 | 0.49 | 0.536 | 0.74 | 0.950 | 0.99 | 2.647 |
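Entries of this kind can be recomputed directly rather than looked up. The sketch below assumes the usual form of Fisher's conversion, half the natural logarithm of (1 + r) / (1 - r), which is the same function as the inverse hyperbolic tangent; the function name `fisher_z` is introduced for this example.

```python
import math


def fisher_z(r):
    """Fisher's conversion of a correlation coefficient r (with -1 < r < 1)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


# Recompute a few table entries to three decimal places.
for r in (0.25, 0.50, 0.75, 0.90):
    print(r, round(fisher_z(r), 3))

# The conversion is the inverse hyperbolic tangent under another name.
difference = abs(fisher_z(0.3) - math.atanh(0.3))
```

For small r the converted value is nearly equal to r itself, while the values stretch out rapidly as r approaches 1.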
The sampling distribution of Z is approximately normal, with a standard error estimated by the formula:

$$\sigma_Z = \frac{1}{\sqrt{n - 3}}$$
The confidence limits are used when estimating the probable range of ρ, and are found by multiplying the z value for an arbitrary level of confidence (p) by the standard error:

Confidence interval = $\pm \, z \, \sigma_Z$, where $\sigma_Z$ is the standard error of Z.

Here the z value is found on the z table below: subtract the chosen confidence value from 1, divide the difference in half to obtain the tail area, find the z value corresponding to that area, and then multiply this z value by the standard error.
| p < confidence value | Larger Portion | Smaller Portion | y ordinate at z | z |
|---|---|---|---|---|
| 0.5 | 0.5000 | 0.5000 | 0.3989 | 0.00 |
| 0.2 | 0.8023 | 0.1977 | 0.2780 | 0.85 |
| 0.1 | 0.9015 | 0.0985 | 0.1736 | 1.29 |
| 0.095 | 0.9066 | 0.0934 | 0.1669 | 1.32 |
| 0.05 | 0.9505 | 0.0495 | 0.1023 | 1.65 |
| 0.02 | 0.9803 | 0.0197 | 0.0478 | 2.06 |
| 0.01 | 0.9901 | 0.0099 | 0.0264 | 2.33 |
| 0.005 | 0.9951 | 0.0049 | 0.0143 | 2.58 |
| 0.002 | 0.9980 | 0.0020 | 0.0063 | 2.88 |
| 0.001 | 0.9990 | 0.0010 | 0.0035-0.0033 | 3.08-3.10 |
| 0.0005 | 0.9995 | 0.0005 | 0.0017 | 3.30 |
| 0.0001 | 0.9999 | 0.0001 | 0.0004 | 3.70 |
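The columns of such a z table can be regenerated from the standard normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper names `z_row` and `z_for_tail` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def z_row(z):
    """Larger portion, smaller portion, and ordinate of the curve at z."""
    larger = std_normal.cdf(z)
    return larger, 1.0 - larger, std_normal.pdf(z)


def z_for_tail(tail_area):
    """z value that cuts off the given area in the upper tail."""
    return std_normal.inv_cdf(1.0 - tail_area)


larger, smaller, ordinate = z_row(1.65)
print(round(larger, 4), round(smaller, 4), round(ordinate, 4))

# For a 95% two-sided interval: (1 - 0.95) / 2 = 0.025 in each tail.
z_95 = z_for_tail((1.0 - 0.95) / 2.0)
print(round(z_95, 2))
```

Computing z this way avoids interpolating between rows when the required tail area falls between tabled values.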
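The confidence interval about Z is therefore given as:

$$Z \pm z \, \sigma_Z$$

where z is the tabled value for the chosen level of confidence and $\sigma_Z$ is the standard error of Z.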
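Putting the pieces together, a confidence interval for the population correlation might be sketched as follows. Two assumptions go beyond the text: the limits are converted back from Z to the scale of r with the hyperbolic tangent (the inverse of Fisher's conversion), and z is computed with `statistics.NormalDist` rather than read from a table. The function name and the sample values are hypothetical.

```python
import math
from statistics import NormalDist


def rho_confidence_interval(r, n, confidence=0.95):
    """Approximate confidence limits for the population correlation.

    r is the sample correlation, n the number of pairs (n > 3).
    """
    z_of_r = math.atanh(r)                    # Fisher's conversion of r
    std_error = 1.0 / math.sqrt(n - 3)        # standard error of Z
    tail = (1.0 - confidence) / 2.0           # area in each tail
    z_crit = NormalDist().inv_cdf(1.0 - tail)
    lower_z = z_of_r - z_crit * std_error
    upper_z = z_of_r + z_crit * std_error
    # Convert the limits back to the correlation scale.
    return math.tanh(lower_z), math.tanh(upper_z)


# Hypothetical sample: r = 0.8 from n = 28 pairs.
low, high = rho_confidence_interval(0.8, 28)
print(round(low, 3), round(high, 3))
```

The interval is symmetric about Z but not about r: after conversion back, the limit nearer to 1 lies closer to the sample value than the other.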
Fisher's transformation (denoted as Z) of the correlation coefficient should not be confused with the z-test (denoted as z), which locates a value on the standard normal distribution. The two values are distinct and should not be interchanged.
Fisher's transformation of the correlation coefficient must be used when comparing two independent correlation coefficients drawn from different samples. Both correlation coefficients must be converted using the formula above, and the standard error of the difference between these converted scores is given by the formula:

$$\sigma_{Z_1 - Z_2} = \sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}$$
The standardized normal deviate for the difference between two Z scores is given as:

$$z = \frac{Z_1 - Z_2}{\sigma_{Z_1 - Z_2}}$$
The hypothesis to be tested is whether there is a critical difference between the correlation coefficients, determined by comparing the resulting z value with the z score of the arbitrary confidence value p.
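The comparison of two independent correlations can be sketched in the same way. The function name `compare_correlations` and the sample values are invented for illustration; the critical value 1.96 is the familiar two-tailed normal value at the 0.05 level.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Standard normal deviate for the difference between two independent r's."""
    z1 = math.atanh(r1)  # Fisher's conversion of each coefficient
    z2 = math.atanh(r2)
    se_difference = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / se_difference


# Hypothetical samples: r = 0.8 and r = 0.5, each from 53 pairs.
z = compare_correlations(0.8, 53, 0.5, 53)
differs = abs(z) > 1.96  # two-tailed test at the 0.05 level

print(round(z, 3), differs)
```

With small samples the standard error of the difference grows quickly, so even large apparent differences between coefficients may fail the test.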
A separate but related question has to do with the relative variation of scores within the table: are individual scores significantly different from one another? The case may be imagined where two or more correlations are not individually significant in the external sense, but are significantly different from one another. A correlational table may lack internally significant differences of pattern, and yet be externally coherent.
The possibility of cross-correlational analysis rests upon the following premises:
1. In general, all else being equal, a higher correlation is probably more significant than a lower correlation.
2. A negative and positive correlation of the same absolute value are of equal strength but possibly of very different theoretical significance.
3. A high correlation that is significant, especially of a large data set, suggests an underlying functional patterning.
4. A strong correlation may represent a functional relationship, but a weak, insignificant correlation represents a probable lack of a direct functional relationship.
5. A stronger correlation is more likely to represent such a relationship than a weaker one.
6. Correlational sets of a matrix may be grouped and described statistically in a meaningful way--they form a distributed curve the characteristics of which (mode, median, mean, etc.) may be evaluated.
7. A correlational matrix can be reorganized in different ways, resulting in different descriptions of the resulting tables--these values can be compared (i.e. rows versus columns).
8. The rows and columns of a correlational matrix represent hypothetical dimensions of analysis which can be topically characterized.
9. In a complex and large correlational matrix, even low correlational values may be significant indicators of a functional relationship.
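As a small illustration of premise 6, the off-diagonal values of a correlation matrix can be pooled and described like any other distribution. The sketch below uses the lower triangle of the example matrix given earlier in this chapter; the variable names, and the choice of mean, median, and mean absolute value as summaries, are assumptions made for the example.

```python
from statistics import mean, median

labels = ["X values", "Positive", "Negative", "None", "Weak Negative"]

# Lower triangle of the example correlation matrix, one row per label.
lower_triangle = [
    [1],
    [1, 1],
    [-1, -1, 1],
    [0.074, 0.074, -0.074, 1],
    [-0.8, -0.8, 0.8028, 0.033, 1],
]

# Drop the diagonal (each variable correlated with itself).
off_diagonal = [value for row in lower_triangle for value in row[:-1]]

summary = {
    "count": len(off_diagonal),
    "mean": mean(off_diagonal),
    "median": median(off_diagonal),
    "mean absolute value": mean(abs(v) for v in off_diagonal),
}
print(summary)
```

The signed mean and the mean absolute value answer different questions: the first reflects the balance of positive and negative relationships, the second their overall strength.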
Blanket Copyright, Hugh M. Lewis, © 2005. Use of this text governed by fair use policy--permission to make copies of this text is granted for purposes of research and non-profit instruction only.
Last Updated: 04/19/05