Using Advanced Statistics
The following material isn’t for everyone. But if you have taken a high school algebra course and would like a chance to win thousands of dollars at the International Science and Engineering Fair, then this may be the page for you.
Accompanying every sample mean or proportion is a measure of its spread called the standard error. If the sample mean or proportion deviates by more than two standard errors from the value it is being compared with, then the researcher may have found something. Here are three examples:
- How to analyze a survey: John believes that farmers in his community have an unusually high rate of skin cancer. He bases his hypothesis on the assumptions that ultraviolet light from the sun causes skin cancer and that farmers are out in the sun far more than most other workers. He creates a survey, has it approved by the Institutional Review Board at Dakota Wesleyan, and submits it to a random sample of 200 area farmers. He finds that 10 of them have been afflicted with the disease. Statistically, John will:
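- Find the proportion of surveyed farmers who have skin cancer and report it: $p = 10/200 = 0.05$
- Find the standard error of the proportion just above and report it: $SE = \sqrt{p(1-p)/n} = \sqrt{(0.05)(0.95)/200} \approx 0.0154$
- About 95% of the time, John can expect the true proportion to lie within these limits: $p \pm 2\,SE = 0.05 \pm 0.031$, or about 0.019 to 0.081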
If the limits trap the national average, then John has nothing new to report. Library research reveals that the national skin cancer rate is about 0.3 percent, or about 0.003. That’s way below John’s lower limit of about 0.019. John can state with some confidence that it’s time for farmers to start wearing hats when they work outside.
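John's three steps can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above; the function name `proportion_limits` and its parameter `k` (the number of standard errors, 2 here) are made up for this sketch and are not part of the original page.

```python
import math

def proportion_limits(cases, n, k=2.0):
    """Return (proportion, standard error, lower limit, upper limit).

    Hypothetical helper: k is the number of standard errors used for
    the limits (2 gives the rough 95% limits used in the text).
    """
    p = cases / n                      # sample proportion
    se = math.sqrt(p * (1 - p) / n)    # standard error of a proportion
    return p, se, p - k * se, p + k * se

# John's survey: 10 of 200 farmers report skin cancer.
p, se, lower, upper = proportion_limits(10, 200)
national_rate = 0.003  # national rate quoted in the text

print(f"proportion     = {p:.3f}")
print(f"standard error = {se:.4f}")
print(f"limits         = {lower:.3f} to {upper:.3f}")
print("national rate inside limits:", lower <= national_rate <= upper)
```

If the last line prints `False`, the national rate falls outside John's limits, which is the situation described above.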
- How to report a mean: Jane wishes to determine if playing music helps corn grow. Jane plays a radio night and day to 20 growing corn plants in a corner of her father’s field. The mean height of plants in that field is 58 inches. Are Jane’s corn plants taller than her father’s corn plants? To find out, Jane will:
- Report the heights of the corn plants (in inches):
64 | 48 | 55 | 68 | 72 | 59 | 57 | 61 | 63 | 60 | 60 | 43 | 67 | 70 | 65 | 55 | 56 | 64 | 61 | 60
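- Find the average height by adding all the heights and dividing by 20: $1208 / 20 = 60.4$ inches
- Square each height and add the squared values: $73894$
- Square the sum of all the heights (the total from the first step, before dividing) and divide it by 20: $1208^2 / 20 = 72963.2$
- Subtract the second value from the first, divide by $20 - 1 = 19$, and take the square root: $\sqrt{(73894 - 72963.2)/19} \approx 7.0$ inches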
This number, about 7.0 inches, is the standard deviation of Jane’s corn plants. Now Jane computes the standard error of the mean by dividing this number by the square root of 20: $7.0 / \sqrt{20} \approx 1.57$ inches
About 95% of the time, Jane can expect the true mean to lie within these limits: $60.4 \pm 2(1.57)$, or about 57.3 to 63.5 inches
The mean height of the corn in her father’s field is 58 inches. Since that’s within the confidence limits, Jane has no evidence that music helps plants grow.
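Jane's calculation can be checked with a short Python sketch that follows the same steps: sum, sum of squares, standard deviation, standard error, and limits. The function name `mean_limits` and the parameter `k` are illustrative choices for this sketch, not something taken from the original page.

```python
import math

# Heights of Jane's 20 corn plants, in inches, as listed above.
heights = [64, 48, 55, 68, 72, 59, 57, 61, 63, 60,
           60, 43, 67, 70, 65, 55, 56, 64, 61, 60]

def mean_limits(values, k=2.0):
    """Return (mean, standard deviation, standard error, lower, upper).

    Hypothetical helper: k is the number of standard errors used for
    the limits (2 gives the rough 95% limits used in the text).
    """
    n = len(values)
    total = sum(values)                    # sum of the heights
    mean = total / n                       # average height
    sum_sq = sum(v * v for v in values)    # sum of the squared heights
    # (sum of squares - (sum)^2 / n) / (n - 1), then the square root
    sd = math.sqrt((sum_sq - total ** 2 / n) / (n - 1))
    se = sd / math.sqrt(n)                 # standard error of the mean
    return mean, sd, se, mean - k * se, mean + k * se

mean, sd, se, lower, upper = mean_limits(heights)
field_mean = 58  # mean height in her father's field

print(f"mean = {mean:.1f}, sd = {sd:.2f}, se = {se:.2f}")
print(f"limits = {lower:.1f} to {upper:.1f}")
print("field mean inside limits:", lower <= field_mean <= upper)
```

The same function can be reused for any list of measurements, since the sample size is taken from the length of the list rather than fixed at 20.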
- Are boys denser than girls? Sue and Ann weigh 5 senior high school boys and 5 senior high school girls. They then find the volumes of these students by dunking them in water and measuring the overflow. Their statistics look like this:
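| Student | Weight (lbs) | Volume (cu ft) | Density (lbs per cu ft) |
| --- | --- | --- | --- |
| B1 | 145 | 2.37 | 61.2 |
| B2 | 205 | 3.33 | 61.6 |
| B3 | 183 | 3.05 | 60.0 |
| B4 | 192 | 3.11 | 61.7 |
| B5 | 127 | 2.17 | 58.5 |
| G1 | 145 | 2.54 | 57.1 |
| G2 | 123 | 2.17 | 56.7 |
| G3 | 173 | 3.10 | 55.8 |
| G4 | 171 | 2.88 | 59.4 |
| G5 | 120 | 2.00 | 55.0 |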
The means are 60.60 pounds per cubic foot for the males and 56.79 pounds per cubic foot for the females, with variances (squared standard deviations) of 1.80 and 2.73. These statistics are computed as in the example just above. In this case we need to determine if one mean is greater than the other. For that we use what is called a Student’s “t” statistic. Let M1 and M2 be the two means just above and S1 and S2 be the standard deviations, so that S1 squared is 1.80 and S2 squared is 2.73. Let the number of students in the first group be N1 and the number of students in the second group be N2. Then the “t” statistic is defined to be
$$t = \frac{M_1 - M_2}{\sqrt{S_1^2/N_1 + S_2^2/N_2}} = \frac{60.60 - 56.79}{\sqrt{1.80/5 + 2.73/5}} \approx 4.0$$
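In order to determine if this value is significant in a one-tailed test, you need to know the degrees of freedom. For small data sets, the formula is formidable:
$$df = \frac{\left(S_1^2/N_1 + S_2^2/N_2\right)^2}{\dfrac{\left(S_1^2/N_1\right)^2}{N_1 - 1} + \dfrac{\left(S_2^2/N_2\right)^2}{N_2 - 1}}$$
For the density data this comes to about 7.7, which is rounded down to 7.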
A “t” table tells us that a t-statistic of 4 or larger with 7 degrees of freedom will happen by chance no more than 5 times out of 1000. Thus, it’s highly unlikely that these results came about by chance.
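The “t” statistic and its degrees of freedom can be sketched in Python as below. This assumes the unequal-variance (Welch) form of the test, which is the form that reproduces the values of 4 and 7 quoted above; the function name `welch_t` is made up for this sketch. Because the densities are taken from the rounded column of the table, the results may differ slightly from the figures in the text.

```python
import math
from statistics import mean, variance

# Densities (lbs per cu ft) from the table above, as printed.
boys = [61.2, 61.6, 60.0, 61.7, 58.5]
girls = [57.1, 56.7, 55.8, 59.4, 55.0]

def welch_t(a, b):
    """Return (t, degrees of freedom) for two independent samples.

    Hypothetical helper assuming the unequal-variance (Welch) test.
    """
    n1, n2 = len(a), len(b)
    v1 = variance(a) / n1   # S1^2 / N1, using the sample variance
    v2 = variance(b) / n2   # S2^2 / N2
    t = (mean(a) - mean(b)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

t, df = welch_t(boys, girls)
print(f"t = {t:.2f}")
print(f"degrees of freedom = {df:.2f} (use {math.floor(df)} in a t table)")
```

Rounding the degrees of freedom down is the conservative choice when looking the value up in a printed table, since fewer degrees of freedom require a larger t for the same significance level.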