Sanderson M. Smith

 This document was prepared by Diann Resnick, Bellaire High School, Bellaire, Texas. It is presented here in Herkimer's Hideaway with Diann's permission.

Common Student Mistakes on 2001 AP Statistics Exam

Students made some common mistakes on this exam that were not unique to one specific question or subsection.

Many students:

• Did not answer the questions in the context of the problem. This is always necessary for a complete answer.
• Did not seem to know when a test of significance was required for a complete answer and when a generalized discussion of the data or graph was all that was necessary.
• Seemed caught up in writing about things that had no bearing on the answer to the question and that were not statistical in nature.

In general, whenever students conduct any test of inference, there are specific steps to follow. A complete response includes all four of the following:

1. State the hypotheses in the context of the problem.

2. Name the test used and why it was used. Check - not just name - the conditions or assumptions for the test used.

3. Carry out the mechanics of the test and give the test statistic, P-value, and degrees of freedom, if applicable.

4. Write a conclusion. A student MUST link the test statistic to the conclusion. For example: since the P-value is so small (less than alpha = .05), I reject the null hypothesis and conclude that there is a significant relationship between GPA and mean number of credit hours per semester for students who successfully completed the Ph.D. program.

==========

Question 1:

In general, many students are still having difficulty communicating what they are thinking. They often give an incorrect statement that contradicts a correct statement.

Some students in part (a) and part (b):

1. Were unable to calculate outliers. These students understood the meaning of an outlier but did not know an algorithm for computing one.

2. Mixed the mean/standard deviation measures with the quartile/IQR measures, i.e., used the mean with the IQR or the median with the standard deviation.

3. Gave incorrect values to use as multipliers for either standard deviation or the IQR.

4. Made statements that exactly one outlier existed or that more than one outlier existed without being able to support this statement.

5. Failed to recognize that the rule for identifying outliers is two-sided.

6. Commented on the idea of skewness, but confused skewness with outliers.

7. Assumed the data were normal and then, based on the empirical rule, gave percentages for the amount of data within two or three standard deviations of the mean.

8. Computed boundary values but never determined whether outliers occurred outside these boundary values.

9. Described extreme and moderate/mild outliers. If the student did this, full credit was awarded for this part of the problem.

10. Gave implausible descriptions of outliers, such as outliers within the middle 50% of the data, or outliers needing to be located outside the maximum and minimum values.

11. Did not seem to understand what Q1, Q3, or IQR represent. They seemed to think that Q1, Q3, and IQR represented a range of values or data, not a single number.

12. Tried to write everything they knew whether it had relevance or not. They assumed that the data were a simple random sample from a population that was normally distributed and independent.

13. Drew a boxplot and said that looking at the boxplot, it appeared as if there were (or were not) outliers.
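The fence computation that many students missed can be sketched as follows. The rainfall values below are hypothetical stand-ins for the exam data, and textbooks vary slightly in their quartile conventions; this sketch uses the median-of-halves method.

```python
# Hypothetical rainfall data (inches), not the actual exam values.
data = sorted([2.1, 3.4, 3.8, 4.5, 5.0, 5.6, 6.2, 7.0, 8.1, 19.3])

def quartiles(xs):
    # Q1 and Q3 as the medians of the lower and upper halves of the data.
    n = len(xs)
    half = n // 2
    lower, upper = xs[:half], xs[n - half:]
    med = lambda v: v[len(v)//2] if len(v) % 2 else (v[len(v)//2 - 1] + v[len(v)//2]) / 2
    return med(lower), med(upper)

q1, q3 = quartiles(data)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # the rule is two-sided
outliers = [x for x in data if x < low_fence or x > high_fence]
```

Note that both fences are checked, addressing the common one-sided mistake, and that Q1, Q3, and the IQR are each a single number, not a range.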

Some students in part (c):

1. Mentioned that 10 inches of rain was within one standard deviation of the mean but did not address whether 10 inches of rain was common or uncommon.

2. Talked about the fact that 10 inches was not an outlier, but did not address where 10 inches was relative to the entire data set.

3. Described 10 inches as unusual / usual without support.

==========

Question 2:

Some students:

1. Did not recognize this as an expected value problem and used probability to try to calculate a break-even point.

2. Tried to use combinatorics to calculate the break-even point, but failed to state that the probability of each repair had to be independent from year to year.

3. Used a min/max argument (they received partial credit for this approach).

4. Used a simulation to calculate the number of repairs that would be expected in 3 years and then calculated the expected costs.

5. Did not recognize that the expected cost was a fixed value, not a random variable.

6. Calculated the expected cost for the machines for one year, not three, or just calculated the repair costs - not total costs - for their answer.

7. Did not assume that the repairs from year to year were independent.
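The expected-value reasoning the question called for can be sketched as below. Every probability and dollar amount here is a hypothetical illustration, not the exam's values, and the sketch assumes (as the question required students to state) that repairs are independent from year to year.

```python
# Hypothetical inputs, not the actual exam values.
p_repair = 0.3          # assumed probability of needing a repair in a given year
cost_per_repair = 50.0  # assumed cost of one repair
purchase_price = 120.0  # assumed purchase price of the machine

# By linearity of expectation, the expected number of repairs over 3 years
# is 3 * p_repair; the expected total cost is then a fixed number, not a
# random variable.
expected_repairs = 3 * p_repair
expected_total_cost = purchase_price + expected_repairs * cost_per_repair
```

Comparing expected total costs for each machine over the full three years, purchase price included, is what the break-even question asked for.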

==========

Question 3:

Some students:

1. Incorrectly tried to use a single digit assignment to work this problem.

2. Gave the simulated number of each prizewinner, not how many winners there were in each trial.

3. Had difficulty in communicating their thoughts in concise, coherent sentences.

4. Did not appear to know how to use a random number table. They tried to use their calculator to work this problem, and for the most part, were unsuccessful using this approach.

5. Did not use digits 01 - 50 to represent the winning tickets, but instead used the digits 00 - 99. They then failed to show which two digits represented the winning ticket. They also did not address the issue that repetitions were not allowed.

6. Did not address the issue that repetitions were not allowed in the problem. This needed to be clearly stated in the student's commentary or illustrated in his or her work.

7. Did not know what to do with numbers that did not fall within the digits 01 - 50.

8. Did not clearly state how prizes were assigned to the tickets drawn.

9. Failed to give a stopping rule; that is, did not state that when $300 or more was awarded, the game was over.
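One complete trial of the kind of simulation the scorers looked for can be sketched as follows. The prize amounts and the run of two-digit groups below are made-up illustrations, not the exam's values; the sketch shows the three elements students often omitted: skipping out-of-range groups, skipping repeats, and stopping once $300 or more has been awarded.

```python
def one_trial(digit_stream, prize_for_ticket):
    """Run one simulated drawing from a stream of two-digit groups (0-99)."""
    drawn, total, draws = set(), 0, 0
    for pair in digit_stream:
        # Groups outside 01-50 do not represent tickets; repeats are not
        # allowed, since a ticket cannot be drawn twice.
        if not (1 <= pair <= 50) or pair in drawn:
            continue
        drawn.add(pair)
        total += prize_for_ticket.get(pair, 0)
        draws += 1
        if total >= 300:          # stopping rule: $300 or more awarded
            break
    return draws, total

# Hypothetical prize assignment: tickets 01-05 each win $100, others win nothing.
prizes = {t: 100 for t in range(1, 6)}
# A made-up run of two-digit groups, as if read from a random number table.
stream = [7, 83, 12, 3, 3, 51, 1, 99, 4]
draws, total = one_trial(stream, prizes)
```

Recording, for each trial, how many valid draws were needed (not the simulated ticket numbers themselves) is what item 2 above says students got backwards.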

==========

Question 4:

Some students in part (a):

1. Incorrectly chose blocking scheme B. In most of these cases it was clear that the students thought the shaded/non-shaded regions represented the different varieties of trees (treatment groups) instead of different blocks. Consequently, many argued that scheme A was a bad choice since that would put all of one variety next to the forest and the other variety away from the forest. Furthermore, since the shaded/non-shaded regions in scheme B had equal exposures to the forest and away from the forest, it was a better choice to make the conditions for each variety as equal as possible.

2. Incorrectly stated that the purpose of blocking is to test the effect of the forest (or to compare the trees next to the forest with the trees away from the forest). Rather, the purpose is to reduce the variability of the response by making sure that each treatment group had equal exposure to the forest. In a completely randomized experiment, three or four trees from one variety could end up next to the forest, but blocking correctly eliminates this possibility.

3. Referred to the forest as a "confounding variable" although they were aware of the variable and used it as a blocking variable. They did not seem to understand that random assignment reduces bias by equalizing conditions caused by a confounding variable and that by blocking one is controlling a variable that can cause bias.

Some students in part (b):

1. Were too generic in their answers. Students made statements such as "the purpose of randomization is to reduce bias" or "to eliminate confounding variables" or "to give each tree an equal chance to be on each plot". These statements earned no credit since they failed to explain WHY randomization would be effective. Again, as in all problems, answers must be written within the context of the problem. It seemed as if many students were using rote phrases from textbooks or teachers. The student did not demonstrate an understanding of the concept of randomization as it pertained to the assignment of trees to plots within each block.

2. Claimed that randomization would "eliminate (or minimize) the effects of extraneous variables (such as soil and water)" when in fact, randomization only evens out (or equalizes) the effects between the two varieties. The effects are still there, but after randomization, they should affect both varieties of trees with equal probability.

3. Described possible confounding variables (soil, water, etc.) and stated that without randomization, one tree might be in more favorable conditions than another. However, this is only a problem if one variety of tree is systematically put in conditions that are more favorable.

4. Were concerned with characteristics of the trees (height, health, etc.) instead of characteristics of the plots (soil, water sources).

5. Continued to focus on the effects of the forest, failing to realize that their blocking scheme should have already accounted for this problem.

6. Confused randomization with choosing a random sample of trees. The students did not seem to know the difference between a random sample of trees and a random assignment of trees to plots.

7. Claimed that randomization reduces variability or increases statistical significance.

8. Explained HOW to randomize instead of WHY randomization is necessary.

9. Who chose blocking scheme B gave inconsistent answers in part (b). For example, in part (a) they thought of the shaded region as one variety of tree and then in part (b) randomized within that region instead of randomizing which variety went in the shaded region.

10. Seemed to understand the concept of blocking, but not what a block was.

11. Thought that randomization would not put 2 of the same types of trees together.

==========

Question 5:

To receive full credit for this question, the student needed to demonstrate all four parts of a test of significance - Hypotheses, Identification of correct test and conditions, Mechanics, and Conclusion. Some students just drew graphs or calculated means of the data and talked about these results without doing a formal inference test.

Some students in Part 1, when attempting to identify the hypotheses:

1. Used a one-sided alternative rather than a two-sided alternative.

2. Did not mention means or use the symbol for means in their hypotheses.

3. Did not define the variables they used and/or used non-standard notation.

4. Used the sample mean notation instead of μ in stating hypotheses.

5. Stated only one hypothesis.

6. Used p rather than μ as the notation for the mean.

7. Reversed the null and alternative hypotheses.

Some students in Part 2, when attempting to identify the correct test and conditions:

1. Named or used a two-sample t-test for independent samples rather than a matched-pairs t-test.

2. Used a chi-square test or a test for proportions (the incorrect tests received no credit).

3. Used a Sign Test (a correct test, but not the preferred test when a normality assumption is reasonable).

4. Did not clearly name the test they used.

5. Ignored the conditions necessary to run a t-test.

6. Listed the conditions necessary for a t-test, but gave no evidence of checking these conditions by drawing appropriate graphical displays. Checkmarks alone were often used, but were not sufficient for receiving credit for checking conditions.

7. Listed the conditions for the wrong test. For example, listed and checked the conditions for a two-sample t-test when using a matched-pairs t-test, or listed and checked the conditions for a matched-pairs t-test when using a two-sample t-test.

8. Did not make linkage statements between plots and the assumptions they used.

Some students in Part 3, in relation to mechanics:

1. Reported only calculator output (although reluctantly accepted, this is not the preferred work).

2. Wrote directions about how to do something on the calculator. This was unnecessary and often made the solution difficult to read.

3. When reporting the P-value from a calculator, forgot to copy the scientific notation and did not recognize that a P-value over 1 is not reasonable.

4. Reported a P-value without reporting a test statistic.

5. Drew incorrect curves to illustrate P-value or rejection regions next to correct calculator output.

6. Used the population standard deviation (given on the calculator) instead of the sample standard deviation.

7. Forgot to double the P-value for a 2-sided hypothesis test.

8. Wrote the complete output of their calculator screen without seeming to understand what the numbers meant.

Some students in Part 4, relating to the conclusion:

1. Failed to write appropriate linkage between the P-value/critical value and the conclusion.

2. Failed to use the term "mean difference." The most common statement was "difference in amount," not the mean difference in amount.

3. Failed to write their conclusion in context of the problem.

4. Used the expression "accept H0" or "accept Ha" rather than "reject H0" or "fail to reject H0."

5. Reversed their conclusions relative to their P-value (rejecting when the P-value was large, or failing to reject when it was small).

6. Did not have their conclusion agree with their stated hypotheses.
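The mechanics these items describe can be sketched for the matched-pairs case as follows. The data below are hypothetical, not the exam's; the point is that the analysis works with the differences, reports the test statistic and degrees of freedom, and uses the sample (not population) standard deviation.

```python
import math
from statistics import mean, stdev  # stdev is the sample standard deviation

# Hypothetical paired measurements, not the actual exam data.
before = [12.1, 11.8, 13.0, 12.5, 11.9, 12.7, 12.3, 12.8]
after_ = [11.6, 11.5, 12.2, 12.0, 11.7, 12.1, 11.9, 12.4]

# A matched-pairs test analyzes the differences, not two independent samples.
diffs = [b - a for b, a in zip(before, after_)]
n = len(diffs)

# H0: mean difference = 0, versus the two-sided alternative.
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
df = n - 1
```

The conclusion must then be linked to the result, e.g.: since t is far beyond the two-sided 5% critical value of 2.365 for 7 degrees of freedom, reject H0 and conclude that the mean difference is significantly different from zero, stated in the context of the problem.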

==========

Question 6:

In general this question provided a lot of information (computer output that included regression coefficients needed to write the equation of the LSRL, output for a t-test for slope, etc.) that students failed to take advantage of in creating their solution. They wasted time recalculating given information.

Some students in Part (a):

1. Failed to realize that the question asked them to compare only GPAs. Although data for credit hours were provided, they were not used for the univariate comparison.

2. Provided a scatterplot, which is not a univariate display for the data.

3. Said the small data sets were normal instead of describing their shape.

4. Talked about their graphs, but did not draw the graphical display from their calculator.

Some students in Part (b):

1. Failed to realize that when the question asked about "a significant relationship," an inference procedure was required for the answer. Many students said that a high correlation (with r = -0.8718 or r^2 = 0.76) was sufficient to indicate a significant relationship.

2. Did not comment that the problem stated that a linear fit was a good model for the data and that the assumptions needed for inference were reasonable. Students did not use this statement to support their contention that a strong correlation was significant.

3. Incorrectly used a two-sample t-test procedure comparing mean GPA to mean credit hours. It was surmised that many students did not study or get to inference on the slope of a line before the exam.

4. Failed to realize that the computations for the inference for slope had already been done and were given in the computer output provided, or they did not know how to use the computer output. (They did not lose credit if they redid the computations, but they wasted time.)
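Reading the slope inference off the printout can be sketched as follows. The slope estimate, standard error, and sample size below are hypothetical stand-ins for the values a computer output would supply, not the exam's numbers.

```python
# Hypothetical values, as if read from regression computer output.
b1 = -0.50      # assumed slope estimate from the printout
se_b1 = 0.12    # assumed standard error of the slope
n = 10          # assumed number of students in the sample

# H0: beta1 = 0 (no linear relationship between the variables).
t_stat = b1 / se_b1
df = n - 2      # inference for slope uses n - 2 degrees of freedom
```

A good printout typically reports this t statistic and its P-value directly, which is why recomputing them, while not penalized, wasted time.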

Some students in Part (c):

1. Failed to recognize the bivariate nature of the problem and did their work as a univariate problem. For instance, they compared the applicant's values to the mean GPA and mean credit hours separately rather than in combination.

2. Failed to compare the new applicant to both groups before making a decision on his or her likely success.