## Statistics Appendix

By Michelle Harris, Rick Nordheim, and Janet Batzli. Biocore 304, Spring 2007.

When analyzing data, the first step should always be to plot the data.
After plotting your data, it is helpful to provide some summary (descriptive) statistics for your data. These provide a concise numerical description of your sample. A common measure of data dispersion, or spread, is the standard deviation.
Let us think of the sample mean, x̄, as estimating the population mean μ (the Greek letter mu). We would like to know how close x̄ is to μ. This measure is known as the standard error (SE), computed as SE = s/√n, where s is the sample standard deviation and n is the sample size. Suppose we have taken a random sample of 16 six-week-old lizards and computed a sample mean x̄ (in cm) and a standard deviation s = 2.0 cm. Then SE = 2.0/√16 = 0.5 cm. The standard deviation measures the typical deviation of the length of an individual lizard, whereas the standard error measures the uncertainty in how well x̄ estimates μ. There is a true standard deviation for lizard length which is an inherent characteristic of the lizard. From our sample we estimate the standard deviation by 2.0; in different samples it may be somewhat larger or smaller but will not be grossly different. The standard error depends on both the standard deviation and the sample size.

It is often useful to use x̄ and SE to compute a range of plausible values for μ. The most common method for this is to compute a confidence interval: x̄ ± 1.96 × SE. However, the construction of this, or any other, confidence interval depends on some underlying assumptions. If your sample is reasonably symmetric and bell-shaped and your sample size is at least 25, then this interval can be thought of as a 95% confidence interval for the true population mean. This says that, on average for intervals computed in this manner, the true population mean will fall within the interval about 95% of the time. If the data are asymmetric, then such an interval may not be reliable. If the sample size is smaller than 25, then the "1.96" needs to be replaced by a larger number, with the magnitude depending on the specific sample size. (For sample sizes in the range of 5 to 10, an approximate value to use is 2.5. The "correct" value to use is based on the t-distribution described in a later section.)
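As a quick check of this arithmetic, the lizard example can be reproduced in a few lines. The sample size and standard deviation come from the example; the sample mean below is a hypothetical value chosen only for illustration:

```python
import math

# Lizard example: n = 16 lizards, standard deviation s = 2.0 cm.
n = 16
s = 2.0
se = s / math.sqrt(n)            # SE = s / sqrt(n) = 2.0 / 4 = 0.5 cm

# x_bar is a hypothetical sample mean (for illustration only); with it,
# the large-sample 95% confidence interval is x_bar +/- 1.96 * SE.
x_bar = 14.3
ci = (x_bar - 1.96 * se, x_bar + 1.96 * se)
print(se, ci)                    # SE = 0.5; CI = (13.32, 15.28)
```

Note how the standard error shrinks as n grows: quadrupling the sample size halves the SE.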
Often scientists wish to compare two groups to see if the means for each group are the same --- or not. For example, perhaps you wish to test whether the mean value of one group (μ₁) differs from the mean value of the other group (μ₂). The key is to determine the design corresponding to your data: are the two samples independent, or are they paired?
The more common situation is the independent (two-sample) design. The t-test for this design relies on the following assumptions:

- The individuals/observations within each sample were chosen randomly from a larger population. Therefore:
- The individuals/observations within one sample are independent of each other.
- The individuals/observations in one sample are independent of the individuals/observations in the other sample.
(Researchers must design their experimental data collection protocol carefully to satisfy these assumptions of independence.)

- The population variances for each group are equal. (Population variances can be estimated from sample variances. Formal tests are possible to determine whether variances are the same or not. However, a general rule of thumb is that, for equal sample sizes, the t-test can still be used so long as the sample variances do not differ by more than a factor of 4 or 5.)
- The distributions of data for each sample should be approximately normally distributed. A normal distribution is bell-shaped, with a roughly equal number of scores evenly dispersed on either side of the population mean. This assumption can be checked by creating a display (perhaps a histogram) of observations for each sample. The t-test is moderately robust to departures from normality. Thus, it is generally valid, even for data that may not be entirely normal, so long as neither sample is greatly skewed. (A test that is robust to a departure from an assumption is one that performs fairly well despite the stated departure.)
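The variance rule of thumb above can be checked mechanically. This sketch compares two sample variances against a factor-of-4 threshold; the helper name and the default threshold are ours, chosen to match the rule of thumb, and this is not a formal test of equal variances:

```python
import statistics

def variances_roughly_equal(sample1, sample2, max_ratio=4.0):
    # Rule of thumb from the text: for equal sample sizes, the t-test is
    # still usable if the sample variances differ by no more than a
    # factor of about 4 or 5 (threshold here is illustrative).
    v1 = statistics.variance(sample1)   # sample variance, s^2
    v2 = statistics.variance(sample2)
    return max(v1, v2) / min(v1, v2) <= max_ratio

print(variances_roughly_equal([1, 2, 3, 4], [2, 4, 6, 8]))      # ratio = 4
print(variances_roughly_equal([1, 2, 3, 4], [10, 20, 30, 40]))  # ratio = 100
```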
It is useful to formally state the underlying hypotheses for your test. Using the notation from the previous section, with μ representing a population mean, there are now population means for each of the two groups: μ₁ and μ₂. The null hypothesis is H₀: μ₁ = μ₂. The standard alternative hypothesis is Hₐ: μ₁ ≠ μ₂. This states that the two means differ in one direction or the other. Most of the experimental hypotheses that scientists pose are alternative hypotheses. Sometimes the alternative can be "one-sided", for example Hₐ: μ₁ > μ₂, which indicates that the null is rejected only if the mean of the first group is larger than the mean of the second. In scientific practice, it is best to use the two-sided alternative in most cases. Only if there is a strong and compelling argument for using a one-sided alternative, presented before the data are examined, should one be used.
1. Make sure that the underlying assumptions for the test are met.
2. Compute the mean (x̄) and variance (s²) for each of the two samples.
3. Compute a pooled variance as follows:

   s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²] / (n₁ + n₂ − 2)

   where n₁ and n₂ are the sample sizes for the two groups and s₁² and s₂² are the variances for the two groups. (Note that we pool variances and not standard deviations!!)
4. Compute the t-statistic for the null hypothesis using the equation:

   t = (x̄₁ − x̄₂) / √[s_p²(1/n₁ + 1/n₂)]

   Every t-test has associated with it a value of degrees of freedom (df). (This is related to the discussion of the appropriate multiplier to use in the computation of the confidence interval as discussed above.) For this t-test, df = n₁ + n₂ − 2.
5. After you have determined the t-statistic and the degrees of freedom, look up the associated probability of a Type I error (the p-value) in the t-table of this appendix.
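The computational steps above can be sketched in a few lines of Python. The helper name is ours, and the p-value lookup is still left to a t-table, exactly as the text describes:

```python
import math
import statistics

def pooled_t(sample1, sample2):
    # Steps 2-4 from the text: means, variances, pooled variance, t-statistic.
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.mean(sample1), statistics.mean(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    # Pool the variances, not the standard deviations!
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    df = n1 + n2 - 2
    return t, df

# Tiny hypothetical samples, just to exercise the formula:
t, df = pooled_t([1, 2, 3], [2, 3, 4])
print(t, df)   # df = 3 + 3 - 2 = 4; look t up in the t-table for the p-value
```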
By convention in many biological science disciplines, a p-value below 0.05 is taken to indicate a statistically significant difference. If your data indicate that there is a statistically significant difference, you reject the null hypothesis. Why is the 0.05 level so important here? You can think of this as a threshold in the following way. If your null hypothesis (that the treatment means are the same) is true, you are willing to accept that you will mistakenly reject the null hypothesis --- a Type I error --- about 5% of the time. You will often see the threshold value of 0.05 referred to as the α-value in the scientific literature.
When reporting t-test results, provide your reader with the sample means for each group and a measure of variation for each, the t-statistic, degrees of freedom, p-value, and whether the p-value (and hence the alternative hypothesis) was one- or two-tailed. Here is an example of how you could concisely report the results of an independent t-test comparing mean heart rate in a sample of men and women:
The number 18 in parentheses after the t represents the degrees of freedom.
As explained above, the first step in performing a two-sample comparison is to determine whether the design is independent or paired. Paired designs can be more general than "before" and "after". For example, suppose you only had one week to test the effect of depleted nutrient medium on the change in leaf size of fast plant leaves. You notice that there is a great deal of variation in the size of the seedlings you can use in your experiment, however, so you are concerned that the variation within samples would obscure any differences in the mean increase in leaf area over just one week. You could control variation between seedlings, however, by carefully matching individual plants on such characteristics as age, shoot length, etc. You would then subject one member of the pair to normal nutrient medium and the other to a depleted medium over the same 7-day period. Leaf area is measured after 7 days for several of these pairs, and you would then test whether the mean leaf area difference between "with" and "without" normal nutrient medium pairs is significantly different from zero.

The appropriate analysis of a paired design again leads to a t-test. However, the t-test will have a different (and actually simpler) form. The key is "reducing" the data for each pair to a single value --- the difference. Thus, if y_b and y_a represent the weights before and after medication (or the leaf areas of a pair of similar plants), we define the difference d by d = y_a − y_b. Our test will only depend on the values of d. Again, there are assumptions that underlie this test.

- The pairs are independent of each other,
*i.e.*, they are chosen randomly from a large population of possible pairs.
(We need make no assumptions directly about the "y" values. The key to a useful paired design is removing the variability --- in this case the differences among individuals subject to the medication --- by looking at the difference between the "after" and "before" measurements.)

- The distribution of the differences, the "d"s, should be approximately normal. The same comments, including that on robustness, from the independent case are valid here.
Again it is useful to formally state the underlying hypotheses for your test. Using similar notation to before, the null hypothesis is H₀: μ_d = 0 and the alternative is Hₐ: μ_d ≠ 0, where μ_d is the mean of the population of difference (d) values. The same choices regarding one-sided and two-sided alternatives are possible here.
1. Make sure that the underlying assumptions for the test are met.
2. Compute the mean (d̄) and standard deviation (s_d) of the d-values.
3. Compute the t-statistic using the equation:

   t = d̄ / (s_d / √n)

   where n = the number of d-values (pairs) in the sample. The df for this test is n − 1.
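The paired procedure reduces to very little code, since each pair collapses to one difference. The helper name and the before/after measurements below are hypothetical, for illustration only:

```python
import math
import statistics

def paired_t(before, after):
    # Reduce each pair to a difference d = y_a - y_b, then compute
    # t = d_bar / (s_d / sqrt(n)) with df = n - 1.
    d = [a - b for b, a in zip(before, after)]
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements on four paired individuals:
t, df = paired_t([10, 12, 11, 13], [12, 13, 13, 14])
print(t, df)   # df = 4 - 1 = 3; look t up in the t-table for the p-value
```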
This semester you will subject your data to various statistical tests and generate graphs to represent the results of these analyses. Before running any statistical tests, graph your raw data first using "exploratory" graphs to get a feel for the normality of your sample and to understand what the results show. Raw data are typically not used for graphs presented in a talk or paper. Instead, use summary graphs whenever possible to show the main results of the experiment. Use a table only if you cannot think of an appropriate graph. The purpose of this handout is to introduce you to some Excel summary statistics and graphing functions. Before you begin, tell your Excel program to give you the "Data analysis" option under the Tools menu. To do this, choose Tools-->Add-ins, then check the "Analysis tool pack" box.
To sort data from least to greatest, choose (Data-->Sort). Data can also be arranged according to your unique variable categories.
To have Excel compute some summary statistics, choose (Tools-->Data Analysis-->Descriptive Stats). Check the "labels in first row" box to keep your column headings. Also check "summary statistics" and “confidence levels for mean, 95%.” To find out the mean, median, mode, variance, range, standard deviation, and standard error for a particular variable, click on the Input Range icon, select the appropriate column, and then "OK." Your output will be placed on a new sheet. Is the mean reported a sample mean or a population mean?
To calculate the standard deviation of a data set, first click on an empty cell in which you want the standard deviation to be displayed. Choose the f_x (function) icon and select "STDEV." You'll have to click on the Input Range icon again to select the data, without the label in the first row. Because there is no built-in function to calculate standard error, you'll have to choose another empty cell and type in the formula yourself as:

=STDEV(data range)/SQRT(COUNT(data range))
You can check your calculated SD and SE values against those generated in a summary statistics output file.
For sample sizes greater than about 30, the true population mean will fall, with roughly 95% confidence, within the range described by the sample mean +/- 2SE. We often don't have the time or resources in Biocore to achieve samples of this size. For smaller sample sizes, we can instead compute the actual 95% confidence interval (CI) using a t-statistic from the t-table presented earlier in this appendix. The 95% CI is simply calculated as the sample mean +/- SE × t, where t is the table value for your degrees of freedom.
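For a small sample, the calculation looks like this. The data below are hypothetical; the multiplier 2.262 is the standard two-tailed 95% t value for df = 9, as found in a t-table:

```python
import math
import statistics

# Hypothetical sample of n = 10 measurements (cm):
data = [4.1, 3.8, 4.5, 4.0, 3.9, 4.2, 4.4, 3.7, 4.3, 4.1]
n = len(data)
mean = statistics.mean(data)
se = statistics.stdev(data) / math.sqrt(n)   # SE = s / sqrt(n)
t_crit = 2.262                               # t-table value for df = n - 1 = 9
ci = (mean - t_crit * se, mean + t_crit * se)
print(mean, ci)
```

Note that 2.262 is noticeably larger than 1.96, so the small-sample interval is wider, reflecting the extra uncertainty.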
The number of significant digits in data cells should be no greater than the number of significant figures in the data point(s) with the least number of significant figures. For example, the mean of 1.47, 1.33 and 1.4 should be reported as 1.4. The same convention is used when reporting the standard deviation and standard error of a sample. You can use the "decrease decimal" icon in the upper right of the tool bar to do this.
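The example above can be verified directly. Rounding the mean of 1.47, 1.33, and 1.4 to one decimal place matches the least precise data point, which has two significant figures:

```python
import statistics

values = [1.47, 1.33, 1.4]        # least precise value (1.4) has 2 sig figs
mean = statistics.mean(values)     # 1.4000..., more digits than warranted
reported = round(mean, 1)          # report as 1.4
print(reported)
```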
You'll be using graphs often this semester. Graphs will show the main points of your experiment; use a table only if you cannot think of an appropriate graph. Before subjecting your data to any statistical tests, ALWAYS GRAPH YOUR RAW DATA FIRST!!! Recall that one of the assumptions of t-tests is that your data are more or less normally distributed; a histogram of your raw data lets you check this assumption.
To generate histograms chose (Tools-->Data Analysis-->Histogram). Select chart output to display your histogram. You can set the numerical data ranges or “Bins” yourself or let Excel generate them automatically. For example, a Bin column containing the numbers 3, 6, and 9 will generate a histogram showing the frequency of data points falling within the 0-3, 4-6, and 7-9 ranges. A general rule of thumb for the number of bins in your histogram is to set the #bins = square root of your sample size. To set your own Bins, create a new column in your spreadsheet that defines the maximum values for each data range. After selecting the histogram option, choose “Bin range” and then highlight the numbers in the Bin column. After the histogram appears, make it bigger to examine your data more closely. Do your data appear to be approximately normally distributed? What numerical ranges do most data points fall into? How does the histogram look when you change the Bin values? (To modify a graph once it's been generated, select the graph, then double click on the area you'd like to edit.)
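Excel's Bin behavior can be imitated outside Excel. Here is a sketch using the same upper edges (3, 6, 9) on a hypothetical nine-point sample, so the square-root rule of thumb also gives three bins; the helper name is ours:

```python
import math
from collections import Counter

data = [1, 2, 5, 5, 6, 7, 8, 8, 9]        # hypothetical sample
n_bins = round(math.sqrt(len(data)))       # rule of thumb: sqrt(9) = 3 bins
bin_edges = [3, 6, 9]                      # upper edge of each bin

def which_bin(x):
    # Like Excel, assign a value to the first bin whose upper edge
    # it does not exceed.
    for upper in bin_edges:
        if x <= upper:
            return upper

counts = Counter(which_bin(x) for x in data)
print(n_bins, dict(counts))   # frequencies for the 0-3, 4-6, and 7-9 ranges
```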
See the Writing Manual, "Producing Figures Using Microsoft Excel" for directions on generating bar graphs, line graphs, and creating error bars around mean values. Try displaying your bar columns and lines with different colors and textures. Label axes appropriately. Make the background on each graph white instead of the default gray (it consumes too much ink when printed).
At some point you may want to look at how change in an independent continuous variable affects a dependent variable; an XY (scatter) plot is the appropriate graph for this situation.
Often this semester you will wish to find out whether there is a statistically significant difference between the means of two or more independent samples. If your data have satisfied the Student's t-test assumptions, use this test to determine whether there is a significant difference between two independent sample means. To have Excel compute the t-statistic, select Tools--->Data Analysis--->t-test (Two sample assuming equal variances).
A paired t-test is appropriate when we are focusing on the changes in a particular variable measured on the same (or carefully matched) individuals. To have Excel compute this test, select Tools--->Data Analysis--->t-Test: Paired Two Sample for Means.