Pearson's Product Moment Correlation Coefficient
Correlation coefficients estimate the strength and direction of association between two interval/ratio level variables. The Pearson Correlation Coefficient presented here is a summary measure of the covariation between two such variables and can range from -1.00 to 1.00. A positive coefficient indicates that the values of variable A vary in the same direction as those of variable B. A negative coefficient indicates that the values of variable A and variable B vary in opposite directions.
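For reference, the coefficient is conventionally defined from deviations about the two means. The formula below is the standard textbook definition rather than something stated in this text; $\bar{X}$ and $\bar{Y}$ denote the sample means of the two variables.

```latex
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}
         {\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}
```

The numerator carries the sign: it is positive when above-average values of one variable tend to pair with above-average values of the other, and negative when they tend to pair with below-average values.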
The following data were collected to estimate the correlation between years of formal education and income at age 35.
|                   | Susan | Bill | Bob | Tracy | Joan |
| Education (years) | 12    | 14   | 16  | 18    | 12   |
| Income ($1000)    | 25    | 27   | 32  | 44    | 26   |
Verify Conditions for using Pearson r
Interval/ratio data must be from paired observations.
A linear relationship should exist between the variables -- verified by plotting the data on a scattergram.
No extreme values in the data.
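The pairing and extreme-value conditions can be screened with a few lines of Python. This is only a rough sketch: the text does not say how "extreme" should be judged, so the 1.5 x IQR fences (Tukey's rule) and the helper name `flag_extreme_values` are assumptions made for illustration, and with only five observations any such rule is crude.

```python
import statistics

def flag_extreme_values(values, k=1.5):
    """Return the values lying outside the k * IQR fences (Tukey's rule).

    The choice of rule and of k = 1.5 is an assumption, not from the text.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Data from the example: education in years and income in $1000.
education = [12, 14, 16, 18, 12]
income = [25, 27, 32, 44, 26]

# Paired observations: every person needs both measurements.
assert len(education) == len(income)

print(flag_extreme_values(education))
print(flag_extreme_values(income))
```

Neither variable has a value outside the fences under this rule, so both calls print an empty list. The linearity condition is still best judged by eye from the scattergram.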
[Scattergram: Income in $1000 (Y axis, 25.0 to 44.0) plotted against Education in years (X axis, 12.0 to 18.0)]
Compute Pearson's r
| Name  | Education (Years) X | Income ($1000) Y | XY   | X²   | Y²   |
| Susan | 12                  | 25               | 300  | 144  | 625  |
| Bill  | 14                  | 27               | 378  | 196  | 729  |
| Bob   | 16                  | 32               | 512  | 256  | 1024 |
| Tracy | 18                  | 44               | 792  | 324  | 1936 |
| Joan  | 12                  | 26               | 312  | 144  | 676  |
| Σ =   | 72                  | 154              | 2294 | 1064 | 4990 |
| n =   | 5                   |                  |      |      |      |
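The column totals feed the usual computational (raw-score) formula, r = (nΣXY - ΣXΣY) / sqrt([nΣX² - (ΣX)²][nΣY² - (ΣY)²]). The formula is not shown in the text, so the following is a minimal Python sketch under the assumption that this is the form intended; the variable names are illustrative.

```python
import math

# Data from the table: education in years (X) and income in $1000 (Y).
education = [12, 14, 16, 18, 12]
income = [25, 27, 32, 44, 26]

n = len(education)
sum_x = sum(education)                                   # ΣX
sum_y = sum(income)                                      # ΣY
sum_xy = sum(x * y for x, y in zip(education, income))   # ΣXY
sum_x2 = sum(x * x for x in education)                   # ΣX²
sum_y2 = sum(y * y for y in income)                      # ΣY²

# Computational formula for Pearson's r.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator

print(f"r = {r:.3f}")
```

The sums reproduce the totals in the table (72, 154, 2294, 1064, 4990), and r comes out at about 0.93.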



Interpret
A positive coefficient indicates that the values of variable A vary in the same direction as those of variable B. A negative coefficient indicates that the values of variable A and variable B vary in opposite directions.
Characterizations of Pearson r
.9 to 1 very high correlation
.7 to .9 high correlation
.5 to .7 moderate correlation
.3 to .5 low correlation
0 to .3 little if any correlation
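The characterizations above can be expressed as a small lookup. This is a sketch, not a standard: the function name `characterize_r` is hypothetical, and two choices are assumptions because the list does not settle them. Negative coefficients are labelled by their magnitude, and a value that falls exactly on a boundary goes to the higher category.

```python
def characterize_r(r):
    """Map a Pearson r to the verbal labels listed above, using its magnitude."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson r must lie between -1 and 1")
    size = abs(r)  # assumption: the sign gives direction, the magnitude gives strength
    # Assumption: a value exactly on a boundary goes to the higher category.
    if size >= 0.9:
        return "very high correlation"
    if size >= 0.7:
        return "high correlation"
    if size >= 0.5:
        return "moderate correlation"
    if size >= 0.3:
        return "low correlation"
    return "little if any correlation"

print(characterize_r(0.933))   # very high correlation
print(characterize_r(-0.45))   # low correlation
```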
In this example, there is a very high positive correlation between the variation of education and the variation of income. Individuals with higher levels of education tend to earn more than those with comparatively lower levels of education.
Determine Coefficient of Determination

Eighty-seven percent of the variance displayed in the income variable is associated with the variance displayed in the education variable.
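The coefficient of determination is the square of Pearson's r. As a worked line, assuming the rounded value r = .933 used elsewhere in this example:

```latex
r^2 = (0.933)^2 \approx 0.870
```

Expressed as a percentage, this is the eighty-seven percent quoted above.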
Hypothesis Testing for Pearson r
Assumptions
Data originated from a random sample
Data are interval/ratio
Both variables are distributed normally
Linear relationship and homoscedasticity
Determine statistical significance based on a Pearson r of .933 for annual income and education obtained from a national random sample of 20 employed adults.
State the Hypothesis
Ho: There is no association between annual income and education for employed adults.
Ha: There is an association between annual income and education for employed adults.
Set the Rejection Criteria
Determine the degrees of freedom (df): df = n - 2, or 20 - 2 = 18
Determine the significance level, alpha (1-tailed or 2-tailed)
Use the critical values from the t distribution at df = 18
tcv @ .05 alpha (2-tailed) = 2.101
Compute Test Statistic
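The test statistic for a Pearson r is conventionally the t ratio below, with n - 2 degrees of freedom. This is the standard textbook form, assumed here because the formula is not shown in the text. With r = .933 and n = 20 it evaluates to roughly 11, close to the 11.022 reported in this example; the small gap is presumably due to rounding of intermediate values.

```latex
t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}
  = \frac{0.933\sqrt{18}}{\sqrt{1 - 0.933^2}}
  \approx 11.0
```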

Decide Results
Since the test statistic 11.022 exceeds the critical value 2.101, the null hypothesis is rejected: there is a statistically significant association in the national population between an employed adult's education and their annual income.
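The steps of the test can be strung together in a short Python sketch. It assumes the conventional t ratio for a correlation, t = r * sqrt(n - 2) / sqrt(1 - r²), and takes the critical value 2.101 from the text rather than computing it; the function name `t_statistic` is illustrative.

```python
import math

def t_statistic(r, n):
    """t statistic for testing Ho: no association, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.933            # sample correlation from the example
n = 20               # sample size
critical_t = 2.101   # two-tailed, alpha = .05, df = 18 (value quoted in the text)

df = n - 2
t = t_statistic(r, n)

# Two-tailed test: reject Ho when |t| exceeds the critical value.
reject_null = abs(t) > critical_t

print(f"df = {df}, t = {t:.2f}, reject Ho: {reject_null}")
```

With these inputs t is about 11, far beyond 2.101, so the decision to reject Ho does not depend on the exact rounding of the test statistic.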
Software Output Example
|