12.2 Take-Home Points

  • Null hypothesis significance test results should be interpreted in relation to sample size and, if possible, test power.

  • Statistically significant results are not necessarily relevant or important. A small, negligible difference between the sample outcome and the hypothesized population value can be statistically significant in a very large sample with high test power.

  • A practically relevant and important difference between the sample outcome and the hypothesized population value can fail to reach statistical significance in a small sample because test power is low.

  • Give priority to effect size over statistical significance in your interpretation of results.

  • A confidence interval shows how close to, or how far from, the hypothesized value the plausible population values lie. It supports a more nuanced conclusion about the result than a null hypothesis significance test does.

  • Applying statistical inference to data that are not a random sample requires justifying either a theoretical population or a data-generating process with a particular probability distribution.
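
The interplay of sample size, test power, effect size, and confidence intervals described in these points can be illustrated with a small sketch. It is not taken from the text: all numbers are hypothetical (a hypothesized population mean of 100 and a population standard deviation of 15), and it uses a two-sided one-sample z test with a known standard deviation purely to keep the arithmetic simple. The helper names `z_test` and `power` are made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z test, assuming a known population SD."""
    se = sigma / sqrt(n)  # standard error of the sample mean
    z = (sample_mean - mu0) / se
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return {
        "effect_size": (sample_mean - mu0) / sigma,  # standardized difference
        "p_value": p_value,
        "ci": (sample_mean - z_crit * se, sample_mean + z_crit * se),
    }


def power(true_diff, sigma, n, alpha=0.05):
    """Probability of rejecting H0 if the true mean differs from mu0 by true_diff."""
    shift = abs(true_diff) * sqrt(n) / sigma
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


# Hypothetical values, chosen only for illustration.
MU0, SIGMA = 100.0, 15.0

# Negligible difference (0.3 points) in a very large sample.
large = z_test(100.3, MU0, SIGMA, n=100_000)
large_power = power(0.3, SIGMA, n=100_000)

# Substantial difference (7.5 points, half an SD) in a small sample.
small = z_test(107.5, MU0, SIGMA, n=10)
small_power = power(7.5, SIGMA, n=10)

for label, result, pw in [("large n", large, large_power),
                          ("small n", small, small_power)]:
    low, high = result["ci"]
    print(f"{label}: effect size = {result['effect_size']:.2f}, "
          f"p = {result['p_value']:.4f}, "
          f"95% CI = [{low:.2f}, {high:.2f}], power = {pw:.2f}")
```

Under these assumptions, the negligible difference in the large sample is statistically significant, whereas the half-standard-deviation difference in the sample of ten is not (p is about 0.11, with power of roughly 0.35). The confidence interval for the small sample includes the hypothesized value of 100 but also values far above it, which is the more nuanced picture that the significance test alone does not give.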
