OVERVIEW: The Advanced Placement syllabus lists inference for the slope of the least squares regression line as a testing topic. When one calculates a least squares regression line y(hat) = b_{0} + b_{1}x, the slope b_{1} is an unbiased estimate of the slope β of the true regression line. It is possible to calculate a confidence interval for β and to test hypotheses about the value of β.
What follows is an attempt to illustrate the concepts in this section using the data presented in exercise 14.1 on page 760. It is important to check the text reading for the assumptions for regression inference and other related items.
The table shows data and related calculations from exercise 14.1.
FEMUR (x) | HUMERUS (y) | Predicted y from LSRL | Residual | (Residual)^{2} | (x - x(bar))^{2} |
| | | -.8226 | .6767 | 408.04 |
| | | -.3668 | .1346 | 4.84 |
| | | 3.0425 | 9.2567 | .64 |
| | | -.9420 | .8874 | 33.64 |
| | | -.9110 | .8300 | 249.64 |
Sum | | | | 11.7853 | 696.80 |
The true regression line has the form μ_{y} = α + βx.
Using the TI-83 and the table above, we obtain the following results:
Mean of x values = x(bar) = 58.2.
Mean of y values = y(bar) = 66.
LSRL: y(hat) = -3.660 + 1.1969x.
Review thought: The point (58.2, 66) is a point on the LSRL. Check it out!
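The check suggested above can be sketched in a few lines of Python. This is only an illustration: the function name `predict` is hypothetical, and the intercept and slope are the rounded values quoted for the LSRL, so the result agrees with y(bar) = 66 only up to rounding.

```python
def predict(x, intercept=-3.660, slope=1.1969):
    """Predicted y on the LSRL y(hat) = intercept + slope * x (rounded coefficients)."""
    return intercept + slope * x

# Evaluate the line at x(bar) = 58.2; the result should be very close to y(bar) = 66.
y_at_mean = predict(58.2)
print(round(y_at_mean, 2))
```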
r = 0.99415.
-3.660 is an estimate for the population parameter α.
1.1969 is an estimate for the population parameter β.
Sum of residuals = 0 (as expected).
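For readers without a TI-83, the results above can be reproduced with a short Python sketch. The femur and humerus values below are an assumption: they do not appear in the table as reproduced here and were reconstructed from the listed squared x deviations, the residuals, and the two means. The function name `least_squares` is also just illustrative.

```python
import math

# ASSUMED data (cm): reconstructed from the table's squared x deviations,
# residuals, and means; check against exercise 14.1 before relying on them.
femur = [38, 56, 59, 64, 74]      # x
humerus = [41, 63, 70, 72, 84]    # y

def least_squares(x, y):
    """Return (intercept, slope, r) for the least squares regression line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

b0, b1, r = least_squares(femur, humerus)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(femur, humerus)]

print(round(b0, 3), round(b1, 4), round(r, 5))
print(round(sum(residuals), 6))   # residuals of an LSRL sum to zero
```

With these assumed data the fit agrees with the slope, intercept, and r quoted above to the number of places shown.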
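The population parameter σ, which is the standard deviation of the y values about the true regression line, is unknown. It is estimated by s, which is the standard error about the line. The formula for s is
s = sqrt [sum (y - y(hat))^{2} / (n - 2)]
In this example, s = sqrt(11.7853/3) = 1.9820.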
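As a cross-check, the standard error about the line can be computed directly from the residuals listed in the table, dividing the sum of squared residuals by n - 2. The sketch below uses the rounded residuals from the table, so the last digit may differ slightly; the function name is hypothetical.

```python
import math

# Residuals copied from the table (already rounded to four places).
residuals = [-0.8226, -0.3668, 3.0425, -0.9420, -0.9110]

def standard_error_about_line(residuals):
    """s = sqrt(sum of squared residuals / (n - 2))."""
    n = len(residuals)
    sse = sum(e ** 2 for e in residuals)
    return math.sqrt(sse / (n - 2))

s = standard_error_about_line(residuals)
print(round(s, 3))
```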
One can calculate a standard error for the slope b:
SE_{b} = s / sqrt [sum (x - x(bar))^{2}]
For the example above, SE_{b} = 1.9820 / sqrt(696.80) = 0.07508.
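The arithmetic behind the standard error of the slope is short enough to verify in code. This sketch takes s = 1.9820 and the sum of squared x deviations, 696.80, from the table; it yields the value 0.07508 that the confidence interval and t statistic rely on. The function name is an illustrative choice.

```python
import math

s = 1.9820              # standard error about the line
sum_sq_x_dev = 696.80   # sum of (x - x(bar))^2, from the table

def slope_standard_error(s, sum_sq_x_dev):
    """SE_b = s / sqrt(sum of squared x deviations)."""
    return s / math.sqrt(sum_sq_x_dev)

se_b = slope_standard_error(s, sum_sq_x_dev)
print(round(se_b, 5))
```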
The confidence interval formula has the form
b plus/minus t*(SE_{b}),
where t* is the appropriate value from the t-distribution table.
For the example above, the 95% confidence interval for β, using 5 - 2 = 3 degrees of freedom, is
1.1969 plus/minus 3.182(.07508) = 1.1969 plus/minus 0.2389 = [0.958, 1.4358].
Interpretation: We are 95% confident that, on average, each one-centimeter increase in femur length is associated with an increase in humerus length of between 0.958 cm and 1.4358 cm.
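The interval can be rebuilt from its three ingredients. The sketch below plugs in the slope, its standard error, and the table value t* = 3.182 for 3 degrees of freedom exactly as quoted above; `slope_confidence_interval` is a hypothetical helper name.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Return (lower, upper) for b plus/minus t* times SE_b."""
    margin = t_star * se_b
    return b - margin, b + margin

low, high = slope_confidence_interval(b=1.1969, se_b=0.07508, t_star=3.182)
print(round(low, 4), round(high, 4))
```

Because the whole interval lies above 0, it is consistent with rejecting a null hypothesis of zero slope at the 5% level in a two-sided test.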
It is also possible to test hypotheses about the value of the slope β. The usual null hypothesis is
H_{0}: β = 0.
Basically, this says that there is no linear relationship, and hence no correlation, between the variables x and y. (The text provides a good explanation of this null hypothesis on pages 763 and 764.)
To test the stated H_{0}, we calculate the t-statistic
t = b/SE_{b} with n - 2 degrees of freedom.
In the above example, the calculated t is
t = 1.1969/.07508 = 15.94 with 5 - 2 = 3 degrees of freedom.
Using the TI-83 to get a one-sided P-value, we have tcdf(
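The test statistic and its one-sided P-value can also be checked away from the calculator. The sketch below is not the TI-83 computation itself: it uses the closed-form cumulative distribution of the t distribution, which in this form is valid only for 3 degrees of freedom, and the helper name `upper_tail_t3` is hypothetical.

```python
import math

def upper_tail_t3(t):
    """P(T > t) for a t distribution with exactly 3 degrees of freedom.

    Uses the closed form F(t) = 1/2 + (1/pi) * (atan(u) + u / (1 + u^2)),
    where u = t / sqrt(3). Not valid for other degrees of freedom.
    """
    u = t / math.sqrt(3)
    return 0.5 - (math.atan(u) + u / (1 + u * u)) / math.pi

b = 1.1969       # slope of the LSRL
se_b = 0.07508   # standard error of the slope

t_stat = b / se_b                     # test statistic for H_0: slope = 0
p_one_sided = upper_tail_t3(t_stat)   # area to the right of t_stat

print(round(t_stat, 2))
print(p_one_sided)
```

The resulting tail area is far smaller than 0.01, so H_{0} would be rejected at the usual significance levels.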
Note: Problem #6 in Section II of the 2001 Advanced Placement Statistics Examination has a solution in which inference on the slope of a regression line could be utilized. You can see a detailed solution to this actual AP problem by going to the home page of Herkimer's Hideaway, taking the link to WRITINGS AND REFLECTIONS, and then the link to item #68. Please note that this link does not contain a statement of the problem, just a detailed solution. The location does provide a link to the College Board, where you can get the problem statement if you don't have it available. (The College Board does not allow for the copying of an actual problem statement on a site such as this.)