12.1 Testing a Null Hypothesis with a Theoretical Probability Distribution

The preceding sections taught us how to conduct a significance test. First, formulate a null hypothesis that equates a population characteristic (parameter) to a particular value, which is a boundary value in the case of a one-sided test. Then construct a sampling distribution with the hypothesized (boundary) value at its centre and use it to calculate a p value for the sample outcome. If the p value is below the significance level (\(\alpha\)), the test is statistically significant, so we reject the null hypothesis.
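The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a general recipe: it assumes that the sampling distribution is normal and that the population standard deviation is known, and the function name `two_sided_z_test` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, hypothesized_mean, population_sd, n, alpha=0.05):
    """Return (z, p_value, reject) for a two-sided test.

    Assumption: the sampling distribution of the mean is normal, centred on
    the hypothesized mean, with standard error population_sd / sqrt(n).
    """
    standard_error = population_sd / sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    # Probability of an outcome at least this far from the centre, in both tails.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Hypothetical example values, chosen only for illustration.
z, p_value, reject = two_sided_z_test(
    sample_mean=5.9, hypothesized_mean=5.5, population_sd=2.0, n=100
)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject null hypothesis: {reject}")
```

With these made-up numbers the p value is a little below 0.05, so the null hypothesis is rejected at the five per cent significance level.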

We have not yet discussed how to construct the sampling distribution. Chapter 2 presented three ways: bootstrapping, an exact approach, and approximation of the sampling distribution with a theoretical probability distribution. The last option is the most popular, so let us discuss it first. Exact approaches and bootstrapping are discussed in the next section.

A theoretical probability distribution links sample outcomes such as a sample mean to probabilities by means of a test statistic. A test statistic is named after the theoretical probability distribution to which it belongs: z for the standard-normal or z distribution, t for the t distribution, F for the F distribution and, surprise, surprise, chi-squared for the chi-squared distribution.

Figure 12.1: Sample size and critical values in a one-sample t test.

Figure 12.1 uses the t distribution to approximate the sampling distribution of average media literacy in a random sample of children.

The exact formula and calculation of a test statistic are not important to us. Just note that the test statistic is usually zero if the sample outcome is equal to the hypothesized population value. In Figure 12.1, for example, the t value of a sample with mean 5.5 is zero if the hypothesized population mean is 5.5. The larger the difference between the observed value (sample outcome) and the expected value (hypothesized population value), the more extreme the value of the test statistic. A more extreme test statistic means that we are less likely to draw a sample with the observed outcome or an outcome even more different from the expected value (a lower p value) and, consequently, that we are more likely to reject the null hypothesis.
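Although the formula is not needed to follow the argument, a short sketch may make the behaviour described above concrete. It uses the standard one-sample t statistic, the difference between the sample mean and the hypothesized mean divided by the estimated standard error. The function name `one_sample_t` and the scores are hypothetical and are not taken from Figure 12.1.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, hypothesized_mean):
    """Standard one-sample t statistic: (mean - hypothesized mean) / standard error."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev() uses the n - 1 denominator
    return (mean(sample) - hypothesized_mean) / standard_error


# Hypothetical media literacy scores with a sample mean of exactly 5.5.
scores = [4.5, 5.0, 5.5, 6.0, 6.5]

t_equal = one_sample_t(scores, 5.5)  # hypothesized mean equals the sample mean
t_near = one_sample_t(scores, 5.0)   # small difference
t_far = one_sample_t(scores, 4.0)    # larger difference

print(t_equal, round(t_near, 2), round(t_far, 2))
```

The statistic is zero when the sample mean equals the hypothesized mean and moves further from zero as the two drift apart.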

We reject the null hypothesis if the test statistic is in the rejection region. The value of the test statistic at which the rejection region starts is called the critical value of the test statistic. In Section 3.4, we learned that 1.96 is the critical value of z for a two-sided test at the five per cent significance level in a standard-normal distribution. In a z test, then, a sample z value above 1.96 or below -1.96 indicates a statistically significant test result.
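As a quick check, the critical value of 1.96 can be recovered from the standard-normal distribution with Python's standard library. This is only a sketch: the helper `in_rejection_region` is a hypothetical name introduced here for illustration.

```python
from statistics import NormalDist

alpha = 0.05  # two-sided test at the five per cent significance level

# Each tail holds alpha / 2, so the upper critical value cuts off the top 2.5 per cent.
critical_z = NormalDist().inv_cdf(1 - alpha / 2)


def in_rejection_region(z, critical=critical_z):
    """True if z lies above the upper or below the lower critical value."""
    return abs(z) > critical


print(round(critical_z, 2))
print(in_rejection_region(2.1), in_rejection_region(-1.5))
```

Rounded to two decimals, `critical_z` is 1.96. A z value of 2.1 falls in the rejection region, whereas a z value of -1.5 does not.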