4.11 Take-Home Points
We use a statistical test to decide on a null hypothesis: do we reject it or not?
The decision rules should be specified beforehand: Decide on the direction of the test (one-sided or two-sided) and the significance level.
The null and alternative hypotheses always concern a population statistic (a parameter). Together they cover all possible outcomes for the statistic. The null hypothesis always specifies one (boundary) value for the population statistic.
We reject the null hypothesis if a test is statistically significant. This means that, if the null hypothesis is true, the probability of drawing a sample with the current outcome or a more extreme one (even more inconsistent with the null hypothesis) is below the significance level. This conditional probability is the p value.
A statistically significant test does not prove that the null hypothesis is false. We can make a Type I error: rejecting a true null hypothesis.
The 95% confidence interval contains all null hypothesis values that would not be rejected by our current sample in a two-sided test at the five per cent significance level. In other words, it contains the population values that are not sufficiently contradicted by the sample data.
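The link between the 95% confidence interval and the two-sided test can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a z test for a mean with a known population standard deviation, and the sample mean, standard deviation, sample size, and function names are all hypothetical.

```python
from statistics import NormalDist


def two_sided_p(sample_mean, null_value, sigma, n):
    """Two-sided p value of a z test for a mean (population SD assumed known)."""
    se = sigma / n ** 0.5                      # standard error of the mean
    z = (sample_mean - null_value) / se        # test statistic under H0
    return 2 * (1 - NormalDist().cdf(abs(z)))  # both tails


def confidence_interval(sample_mean, sigma, n, level=0.95):
    """Confidence interval for the mean under the same assumptions."""
    se = sigma / n ** 0.5
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return sample_mean - z_crit * se, sample_mean + z_crit * se


# Hypothetical sample: mean 2.9, known population SD 0.5, 25 observations.
sample_mean, sigma, n = 2.9, 0.5, 25
lower, upper = confidence_interval(sample_mean, sigma, n)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")

# Try several hypothesized population means as null hypotheses.
for null_value in (2.6, 2.75, 2.9, 3.05, 3.2):
    p = two_sided_p(sample_mean, null_value, sigma, n)
    inside = lower <= null_value <= upper
    print(f"H0: mean = {null_value:.2f}  p = {p:.3f}  "
          f"inside CI: {inside}  rejected at 5%: {p < 0.05}")
```

For every hypothesized value in this example, the null hypothesis is rejected exactly when the value lies outside the interval.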
The calculated p value is only correct if the data is used for no more than one null hypothesis test and the null hypothesis was formulated beforehand.
If the same data is used for more than one null hypothesis test, the probability of making at least one Type I error increases. We obtain too many significant results, which is called capitalization on chance.
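How quickly the risk of a Type I error grows with the number of tests can be sketched as follows. This is a minimal illustration, not a general result: it assumes that the tests are independent and that every null hypothesis is true, in which case the probability of at least one false rejection in k tests is 1 - (1 - alpha)^k. The simulation relies on the fact that a p value is uniformly distributed between 0 and 1 under a true null hypothesis (for a continuous test statistic). The function names are hypothetical.

```python
import random


def familywise_error_rate(alpha, k):
    """Probability of at least one Type I error in k independent tests
    when all null hypotheses are true."""
    return 1 - (1 - alpha) ** k


def simulate_error_rate(alpha, k, runs=20000, seed=1):
    """Estimate the same probability by simulation.

    Each test is represented by a p value drawn uniformly from [0, 1),
    which is its distribution under a true null hypothesis.
    """
    rng = random.Random(seed)
    runs_with_false_rejection = 0
    for _ in range(runs):
        if any(rng.random() < alpha for _ in range(k)):
            runs_with_false_rejection += 1
    return runs_with_false_rejection / runs


for k in (1, 5, 10, 20):
    exact = familywise_error_rate(0.05, k)
    simulated = simulate_error_rate(0.05, k)
    print(f"{k:2d} tests: formula = {exact:.3f}, simulated = {simulated:.3f}")
```

Under these assumptions, the formula gives about 0.23 for five tests and about 0.40 for ten tests at the five per cent significance level, far above the 0.05 that applies to a single test.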