Practical Significance Needed for Large Sample Comparisons

by Art Gutman Ph.D., Professor, Florida Institute of Technology

The case is Apsley v. Boeing, decided on August 27, 2012 [2012 U.S. App. LEXIS 18161]. Boeing sold two of its facilities to Spirit AeroSystems and terminated the entire workforce of more than 10,000 employees on June 16, 2005; 8,354 of them were rehired the next day by Spirit. Interestingly, older workers made up the vast majority of the workforce both before and after the sale, but a lower percentage of older workers than younger workers was rehired. The plaintiffs sued on behalf of 700 older employees who were not rehired. There were several claims, but the most important for present purposes were pattern or practice and adverse impact based on age. All claims were dismissed by the district court on summary judgment, and the dismissals were upheld by the 10th Circuit.

One of the plaintiffs’ experts (Dr. Mann) argued that his model predicted that 8,028 older workers should have been recommended for rehire by chance alone, compared to the 7,968 who actually were. He argued that this shortfall of 60 older workers represented a difference of more than five standard deviations. There was also a 4.5 standard deviation difference in older workers rehired (a shortfall of 48 workers). The 10th Circuit acknowledged that this is sufficient to demonstrate statistical significance under the Supreme Court’s ruling in Castaneda v. Partida (1977) [430 U.S. 482] (a pattern or practice jury pool case), subsequently echoed by the Supreme Court in Hazelwood School District v. United States (1977) [433 U.S. 299] (a pattern or practice Title VII case), but is not sufficient to establish practical significance. Echoing the ruling of the district court, the 10th Circuit ruled:

Nonetheless, the district court concluded that this was a case where "a large number of standard deviations simply will not be enough." Apsley, 722 F. Supp. 2d at 1239. In the passages excerpted above, the court reasoned that small discrepancies in large samples mean little even if they are statistically significant. The court noted that any discrepancy would have disappeared if Boeing had recommended sixty more older workers or if forty-eight more older workers had been hired by Spirit. It also pointed out that "Boeing recommended and Spirit hired over 99% of the workers that Dr. Mann's model predicted." Thus, it concluded that any disparity was "practically insignificant."

In short, the 10th Circuit recognized that statistical significance is much easier to establish with large samples than with small ones, and it deemed the discrepancies reported by Dr. Mann practically insignificant given a workforce of over 10,000 employees in which the vast majority were older workers to begin with. Other factors were considered (e.g., statements by managers and executives at Boeing), but in the end, it was the absence of practical significance that served to discredit the pattern or practice and adverse impact charges.
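The sample-size point can be illustrated with a short Python sketch. It is not Dr. Mann's model, which is not described here. The function name two_proportion_z and the 85% versus 83% selection rates are hypothetical, chosen only to show how a fixed two-point gap in rates produces more standard deviations as the groups grow. The last lines redo the arithmetic behind the court's "over 99%" remark using the figures reported above.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x selected out of n in each group.

    Hypothetical helper for illustration; not the expert's model.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Assumed rates: 85% of one group and 83% of the other are selected.
# The gap stays at two percentage points; only the group size changes.
z_by_n = {}
for n in (100, 1_000, 10_000):
    z_by_n[n] = two_proportion_z(85 * n // 100, n, 83 * n // 100, n)
    print(f"n per group = {n:>6}: z = {z_by_n[n]:.2f}")

# Figures reported in Apsley: predicted vs. actual rehire recommendations.
predicted, actual = 8028, 7968
shortfall = predicted - actual       # the 60 workers cited by the court
share_of_predicted = actual / predicted
print(f"shortfall = {shortfall}, actual / predicted = {share_of_predicted:.2%}")
```

With the assumed rates, the z statistic grows in proportion to the square root of the group size, so the same two-point gap falls well short of the conventional two standard deviation threshold at 100 per group and clears it easily at 10,000 per group.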

As an aside, this ruling stands in stark contrast with Bew v. Chicago (2001) [252 F.3d 891], in which the 7th Circuit ruled there was adverse impact based on race on a test for probationary police officers. The test was administered to 5,191 applicants and yielded pass rates of 99.96% and 98.24%, respectively, for whites and blacks. What stands out here is that only 33 of the 5,191 were rejected, but 32 of those 33 were black. Nevertheless, the 7th Circuit ruled there was adverse impact because a test of independent proportions revealed a difference of nearly five standard deviations between the failure rates. It should be noted, however, that the City of Chicago successfully proved that the test was job related and consistent with business necessity, and that its cutoff score was valid.