Amplifying the Noise: The Dangers of Post Hoc Power Analyses

J Surg Res. 2021 Mar:259:A9-A11. doi: 10.1016/j.jss.2019.09.075. Epub 2020 Aug 22.

Abstract

Small sample sizes decrease statistical power, which is a study’s ability to detect a treatment effect when there is one to be detected. A power threshold of 80% is commonly used, indicating that statistical significance would be expected four of five times if the treatment effect is large enough to be clinically meaningful. This threshold may be difficult to achieve in surgical science, where practical limitations such as research budgets or rare conditions may make large sample sizes infeasible. Several researchers have used “post hoc” power calculations based on observed effect sizes to demonstrate that studies are often underpowered, and have used this as evidence to advocate for lower power thresholds in surgical science. In this short commentary, we explain why post hoc power calculations are inappropriate and cannot differentiate between statistical noise and clinically meaningful effects. We use simulation analysis to demonstrate that lower power thresholds increase the risk of a false-positive result, and we suggest logical alternatives such as the use of larger p-value thresholds for hypothesis testing or qualitative research methods.
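
The point that post hoc power cannot separate noise from a real effect can be illustrated with a small simulation. The following is a minimal sketch and is not the authors' simulation: it assumes a two-arm comparison with no true treatment effect, a two-sided z-test with known unit variance, 20 patients per arm, and a significance level of 0.05. The function names (`two_sided_p`, `observed_power`, `simulate_null_trials`) are hypothetical and chosen only for this illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def two_sided_p(z):
    """Two-sided p-value for an observed z statistic."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def observed_power(z, alpha=0.05):
    """Post hoc power: power of a two-sided z-test computed by treating
    the observed z statistic as if it were the true standardized effect."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    upper = 1.0 - STD_NORMAL.cdf(z_crit - abs(z))
    lower = STD_NORMAL.cdf(-z_crit - abs(z))
    return upper + lower


def simulate_null_trials(n_trials=2000, n_per_arm=20, seed=0):
    """Simulate trials in which the true treatment effect is zero.
    Returns a list of (p_value, observed_power) pairs, one per trial."""
    rng = random.Random(seed)
    # Standard error of a difference in means, assuming known sd = 1 per arm.
    se = (2.0 / n_per_arm) ** 0.5
    results = []
    for _ in range(n_trials):
        mean_a = sum(rng.gauss(0.0, 1.0) for _ in range(n_per_arm)) / n_per_arm
        mean_b = sum(rng.gauss(0.0, 1.0) for _ in range(n_per_arm)) / n_per_arm
        z = (mean_a - mean_b) / se
        results.append((two_sided_p(z), observed_power(z)))
    return results


if __name__ == "__main__":
    trials = simulate_null_trials()
    significant = [power for p, power in trials if p < 0.05]
    print("share of null trials reaching p < 0.05:", len(significant) / len(trials))
    if significant:
        print("lowest post hoc power among them:", min(significant))
```

Under these assumptions, post hoc power is a one-to-one function of the p-value: it exceeds 50% whenever p < 0.05 and falls below roughly 50% otherwise. Every false-positive trial in the simulation therefore appears adequately or even well powered, and every non-significant trial appears underpowered, so the calculation adds no information beyond the p-value itself.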

Publication types

  • Letter
  • Comment

MeSH terms

  • Power, Psychological*