12.4 Testing a Null Hypothesis with a Theoretical Probability Distribution

The preceding sections taught us how to conduct a significance test. Formulate a null hypothesis that equates a population characteristic (parameter) to a particular value, which is a boundary value in the case of a one-sided test. Then construct a sampling distribution with the hypothesized (boundary) value as centre and use it to calculate a p value for a sample. If the p value is below the significance level (\(\alpha\)), the test is statistically significant, so we reject the null hypothesis.

We have not yet discussed how to construct the sampling distribution. Chapter 2 presented three ways: bootstrapping, an exact approach, and approximation of the sampling distribution with a theoretical probability distribution. The last option is the most popular, so let us discuss it first. Exact approaches and bootstrapping are discussed in the next section.

A theoretical probability distribution links sample outcomes such as a sample mean to probabilities by means of a test statistic. A test statistic is named after the theoretical probability distribution to which it belongs: z for the standard-normal or z distribution, t for the t distribution, F for the F distribution and, surprise, surprise, chi-squared for the chi-squared distribution.

Figure 12.1: Sample size and critical values in a one-sample t test.

Figure 12.1 uses the t distribution to approximate the sampling distribution of average media literacy in a random sample of children.

A test statistic is calculated from the sample statistic that we want to test, for instance, the sample proportion, mean, variance, or association, but it uses the null hypothesis as well. A test statistic more or less standardizes the difference between the sample statistic and the population value that we expect under the null hypothesis.
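As an illustration of this standardization, the test statistic of a one-sample t test is commonly written as follows. The symbols are introduced here for illustration only: \(\bar{x}\) is the sample mean, \(\mu_0\) the population mean according to the null hypothesis, \(s\) the sample standard deviation, and \(n\) the sample size.

\[
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
\]

The numerator is the difference between the sample statistic and the hypothesized population value; the denominator, the standard error, puts this difference on a standard scale.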

The exact formula and calculation of a test statistic are not important to us. Just note that the test statistic is usually zero if the sample outcome is equal to the hypothesized population value. In Figure 12.1, for example, the t value of a sample with mean 5.5 is zero if the hypothesized population mean is 5.5. The larger the difference between the observed value (sample outcome) and the expected value (hypothesized population value), the more extreme the value of the test statistic. A more extreme test statistic means that it is less likely (lower p value) that we draw a sample with the observed outcome or an outcome even more different from the expected value and, consequently, that we are more likely to reject the null hypothesis.
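The behaviour described here can be checked with a minimal Python sketch. It assumes a one-sample t test and uses a small set of made-up media literacy scores (hypothetical data, not taken from Figure 12.1); the function name `one_sample_t` is our own.

```python
import math
import statistics


def one_sample_t(sample, hypothesized_mean):
    """Return the one-sample t statistic for a list of scores."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)        # sample standard deviation (divides by n - 1)
    standard_error = sd / math.sqrt(n)
    return (mean - hypothesized_mean) / standard_error


# Hypothetical scores, chosen so that the sample mean is exactly 5.5.
scores = [4.0, 5.0, 5.5, 6.0, 7.0]

# The further the hypothesized mean is from the sample mean,
# the more extreme the t value becomes.
for mu0 in (5.5, 5.0, 4.0):
    print(f"hypothesized mean {mu0}: t = {one_sample_t(scores, mu0):.2f}")
```

With a hypothesized mean of 5.5 the t value is zero; moving the hypothesized mean away from the sample mean makes the t value more extreme.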

We reject the null hypothesis if the test statistic is in the rejection region. The value of the test statistic where the rejection region starts is called the critical value of the test statistic. In Section 3.4, we learned that 1.96 is the critical value of z for a two-sided test at the five per cent significance level in a standard-normal distribution. In a z test, then, a sample z value above 1.96 or below -1.96 indicates a statistically significant test result.
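The critical value of 1.96 and the corresponding decision rule can be reproduced with a short Python sketch that uses only the standard library. The sample z values are hypothetical and the helper names `two_sided_p` and `is_significant` are our own.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, standard deviation 1
alpha = 0.05                     # significance level

# Two-sided test: alpha is split equally over the two tails.
critical_z = standard_normal.inv_cdf(1 - alpha / 2)


def two_sided_p(z):
    """Probability of a z value at least as extreme as the observed one."""
    return 2 * (1 - standard_normal.cdf(abs(z)))


def is_significant(z):
    """True if z falls in the rejection region (beyond the critical value)."""
    return abs(z) > critical_z


print(f"critical z = {critical_z:.2f}")
for z in (0.63, -2.40):          # hypothetical sample z values
    print(f"z = {z}: p = {two_sided_p(z):.3f}, significant: {is_significant(z)}")
```

Comparing the z value to the critical value and comparing the p value to the significance level always lead to the same decision.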

Probability distributions other than the standard-normal distribution, however, do not have fixed critical values. Their critical values depend on the degrees of freedom of the test, usually abbreviated to df. The degrees of freedom of a test may depend on sample size, the number of groups that we compare, or the number of rows and columns in a contingency table. We do not have to worry about this.

The t distribution is an example of a probability distribution for which the critical values depend on the degrees of freedom of the test. In this case, the degrees of freedom are determined by sample size. Larger samples have more degrees of freedom and, as a consequence, they have slightly lower critical values. For samples that are not too small, the critical values of t for a two-sided test at the five per cent significance level are near 2. You may have noticed this in Figure 12.1.
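To see how the critical value shrinks towards 1.96 as the degrees of freedom grow, the following Python sketch approximates two-sided critical values of t numerically. It is an illustration under stated assumptions, not the way statistical software does it: it uses only the standard library, integrates the t density with Simpson's rule, and finds the critical value by bisection. In a one-sample t test, the degrees of freedom equal the sample size minus one. All function names are our own.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on the interval [0, x]."""
    h = x / steps                         # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3            # 0.5 is the area left of zero


def critical_t(df, alpha=0.05):
    """Two-sided critical value of t, found by bisection on the CDF."""
    target = 1 - alpha / 2
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


for df in (5, 18, 30, 100, 1000):
    print(f"df = {df:>4}: critical t = {critical_t(df):.3f}")
```

The critical value is clearly above 2 for very small samples and approaches the z value of 1.96 for large samples.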

APA requires us to report the degrees of freedom. If SPSS reports the degrees of freedom, usually in a column with the header df, you should include the number between brackets after the name of the test statistic. If, for example, a t test has 18 degrees of freedom and the t value is 0.63, you report: t (18) = 0.63. Note that the F test statistic has two degrees of freedom, both of which should be reported (separated by a comma and a blank space), for example, F (2, 87) = 3.13.
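If you report many test results, a small helper can keep the format consistent. The Python sketch below simply reproduces the notation used in this section; the function name `apa_report` is hypothetical and the function is not part of SPSS or any APA tool.

```python
def apa_report(statistic, value, *df):
    """Format a test result as in this section, e.g. 't (18) = 0.63'.

    statistic: name of the test statistic, such as "t" or "F"
    value:     value of the test statistic
    df:        one or more degrees of freedom, in reporting order
    """
    df_text = ", ".join(str(d) for d in df)   # comma and blank space between df
    return f"{statistic} ({df_text}) = {value:.2f}"


print(apa_report("t", 0.63, 18))       # t (18) = 0.63
print(apa_report("F", 3.13, 2, 87))    # F (2, 87) = 3.13
```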