Practical versus Statistical Significance

In our previous blog on the topic of statistical significance, we discussed how to interpret the meaning of "statistically significant." In this blog, we want to expand on the topic by discussing the difference between statistical and practical significance.

As mentioned in the previous blog, when a group difference is statistically significant, it only indicates that it is unlikely, but not impossible, that the difference occurred by chance. A larger number of standard deviations is not an indication of the magnitude of the group difference. Rather, it indicates how unlikely it is that the observed difference is due to chance.

Because the values of many statistical tests are driven in part by sample size, it is common for a very small difference between two groups to be statistically significant merely because of a large sample size. For example, a few years ago, we conducted a proactive analysis in which a $36-per-year difference in base pay between men and women was statistically significant - partially due to the sample size (3,000 people in the position) and partially due to the low variability in salaries (the difference between the highest and lowest salary was only $2,500). Conversely, we have seen group differences of $20,000 that were not statistically significant because of small sample sizes and high variability.
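To make the role of sample size and variability concrete, here is a minimal Python sketch that expresses a difference in group means in standard deviation units. Only the $36 and $20,000 differences and the 3,000-person headcount come from the examples above. The even split between men and women, the salary standard deviations, and the small-sample group sizes are hypothetical values chosen for illustration, and the function name is our own.

```python
from math import sqrt

def difference_in_sd_units(diff, sd, n1, n2):
    """Difference between two group means divided by its standard error.

    Assumes both groups share the same standard deviation `sd`.
    """
    standard_error = sd * sqrt(1 / n1 + 1 / n2)
    return diff / standard_error

# Large sample, low variability: 3,000 people, assumed to be split evenly,
# with a hypothetical salary standard deviation of $400.
large_sample = difference_in_sd_units(diff=36, sd=400, n1=1500, n2=1500)

# Small sample, high variability: hypothetical groups of 5 people each,
# with a hypothetical salary standard deviation of $20,000.
small_sample = difference_in_sd_units(diff=20_000, sd=20_000, n1=5, n2=5)

THRESHOLD = 2.0  # the common "two standard deviations" benchmark

for label, value in [("$36 difference", large_sample),
                     ("$20,000 difference", small_sample)]:
    verdict = "significant" if abs(value) >= THRESHOLD else "not significant"
    print(f"{label}: {value:.2f} standard deviations ({verdict})")
```

Under these assumed inputs, the $36 difference exceeds two standard deviations while the $20,000 difference does not, even though the second difference is hundreds of times larger in dollar terms.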

This is where practical significance comes into play. Statistical significance tells us whether a difference is unlikely to be due to chance, whereas practical significance tells us whether the difference is big enough to be of concern. Using our previous example, a $36 annual difference in salary, although statistically significant, is hardly of a magnitude that would lead one to suspect sex discrimination.

In adverse impact analyses, statistical significance is often determined by the number of standard deviations or a probability level (e.g., from Fisher's exact test), whereas practical significance is determined by effect sizes like impact ratios and practical swap rules (e.g., if only two more women were hired, the difference would not have been statistically significant).
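As a rough illustration of how these measures fit together, the Python sketch below computes an impact ratio, a two-sided Fisher's exact test probability, and a simple version of the "how many more hires" check described above. The applicant and hire counts are hypothetical, the function names are our own, and the check shown is only one simple variant of a practical swap rule; analysts may define theirs differently.

```python
from math import comb

def impact_ratio(sel_focal, n_focal, sel_ref, n_ref):
    """Selection rate of the focal group divided by that of the reference group."""
    return (sel_focal / n_focal) / (sel_ref / n_ref)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are selected / not selected. The p-value sums
    the probabilities of all tables no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def extra_hires_until_not_significant(sel_focal, n_focal, sel_ref, n_ref, alpha=0.05):
    """Smallest number of additional focal-group hires at which p >= alpha."""
    for extra in range(n_focal - sel_focal + 1):
        p = fisher_exact_two_sided(sel_focal + extra, n_focal - sel_focal - extra,
                                   sel_ref, n_ref - sel_ref)
        if p >= alpha:
            return extra
    return None

# Hypothetical data: 100 women applied and 40 were hired; 100 men applied and 60 were hired.
ratio = impact_ratio(40, 100, 60, 100)
p_value = fisher_exact_two_sided(40, 60, 60, 40)
extra = extra_hires_until_not_significant(40, 100, 60, 100)

print(f"Impact ratio: {ratio:.2f} (commonly compared against 0.80)")
print(f"Fisher's exact test p-value: {p_value:.4f}")
print(f"Additional hires of women needed for p >= .05: {extra}")
```

The probability level answers the statistical question, while the impact ratio and the count of additional hires speak to the practical one. If only one or two additional hires would erase statistical significance, the finding is fragile no matter how small the p-value looks.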

The moral of our story is that measures of practical significance should always follow any statistically significant finding.

Mike Aamodt, Principal Consultant, and Yevonessa Hall, Consultant at DCI Consulting Group