Sackett et al. (2021) published a disruptive piece that is summarized in the Sackett et al. (2023) focal article. As the authors explain in the focal article, their 2021 paper showed not only that overcorrection for range restriction has led to inflated validity estimates for selection devices, but also that the new corrections alter the rank ordering of predictors established in Schmidt and Hunter (1998). Many are celebrating that structured interviews have supplanted general mental ability for the top validity spot. Others, however, were deflated by the generally shrunken effect sizes associated with the new corrections. According to Sackett et al. (2023), many practitioners feel that these revised estimates do not help our traditional predictors compete for success in the marketplace, leaving many wondering how to effectively communicate the relative efficacy of predictors to key stakeholders. We believe that many scientists and practitioners hold unrealistic standards of success. It is time, therefore, for I–O psychologists to adopt and communicate new benchmarks for evaluating predictor effect sizes.
New effect size benchmarks
Campbell (1990) argued that unrealistic standards of success cause a sense of hopelessness among I–O psychologists engaged in predicting the future job performance of job applicants. This dismal view of predictor validity has led some to pursue holistic methods in which prediction error is less transparent. Major culprits in this sense of hopelessness are Cohen’s (1988) benchmarks for classifying correlations as small (.1), medium (.3), and large (.5). We would argue, however, that these benchmarks have set us up for failure when it comes to communicating magnitude.
So how do we get more realistic effect size benchmarks? We believe the answer lies in a paper by Bosco et al. (2015) examining nearly 150,000 effect sizes reported in the Journal of Applied Psychology and Personnel Psychology between 1980 and 2010. They examined the distribution of effect sizes for various relation types. We focused only on the nearly 8,000 effect sizes for attitude–behavior relations, as these effect sizes involving actual behavioral outcomes are more relevant to the performance-prediction context than those involving attitude–attitude or attitude–intention relations (Bosco et al., 2015, p. 436). The authors found that, with rudimentary meta-analytic corrections, one can classify small (25th-percentile) effect sizes as r = .07, medium (50th-percentile) effect sizes as r = .16, and large (75th-percentile) effect sizes as r = .29. Table 1 applies these new small, medium, and large effect-size categories to the current state-of-the-science predictor effect sizes reported in Sackett et al. (2023).
Note: Predictors are shown in order of effect size reported in Sackett et al. (2023). Effect sizes for the five-factor model traits are based on contextualized personality items. SJT = situational judgment test; GMA = general mental ability.
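To make the mapping concrete, the following is a minimal sketch (in Python) of one way to classify a validity coefficient against the Bosco et al. (2015) attitude–behavior cutoffs. The thresholding rule (treating the 50th- and 75th-percentile values as lower bounds for “medium” and “large”) and the example coefficients are our own illustrative assumptions, not values taken from Table 1.

```python
def classify_validity(r, medium=0.16, large=0.29):
    """Label a correlation using the Bosco et al. (2015)
    attitude-behavior percentile benchmarks (r = .07/.16/.29).

    The rule here is one illustrative choice: the 50th and 75th
    percentiles serve as lower bounds for "medium" and "large."
    """
    r = abs(r)  # direction is irrelevant to magnitude
    if r >= large:
        return "large"
    if r >= medium:
        return "medium"
    return "small"

# Hypothetical validity coefficients, for illustration only
for r in (0.42, 0.20, 0.10):
    print(r, classify_validity(r))
```

Under this rule, a hypothetical validity of .42 would be labeled “large,” whereas under Cohen’s (1988) benchmarks it would sit below the “large” cutoff of .5.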
As Table 1 shows, these benchmarks are considerably smaller than Cohen’s and, as such, change which predictors we would classify as having small, medium, and large effects. Below we point out some implications of using these revised benchmarks for communicating predictor effect size.
Implications for communicating our value to stakeholders
Our predictors are powerful
Many of the predictors developed by I–O psychologists for employee selection are among the strongest behavioral predictors in all of applied psychology. Moreover, as Lievens (2013; Lievens et al., 2020) pointed out, the efficacy of selection procedures for predicting job success is often equal to or greater than that of interventions considered highly useful in medicine, including antihistamines for alleviating allergies and ibuprofen for reducing back pain.
Predictor rank can be deceiving
We believe it is better to use the labels “small,” “medium,” and “large” from our Table 1 than to use meta-analytic correlations to compare predictors with one another. This is because, as Sackett et al. point out in the focal article, the standard deviations around many of the meta-analytic effects can be quite wide. In addition, Sackett et al. (2017) showed that directly comparing validities requires holding the sample and criterion constant, something that is rarely done in meta-analyses.
Smaller does not equal worse
It is unwise to rule out predictors merely because they have medium or small effect sizes. Barrick and Mount (2000) argued that conscientiousness and emotional stability are necessary for almost every job. Although these may not be among the biggest predictors, the fact that they do not correlate with ability predictors makes them especially useful for incremental prediction. Moreover, even a predictor categorized as having a small effect size still explains variance beyond chance levels. In contrast, things like handwriting analysis predict no better than chance (Rafaeli & Klimoski, 1983).
Perfect prediction should not be our standard
Perfect prediction in employee selection should not be the standard by which we judge our value, nor is it a relevant ceiling for communicating efficacy. We should focus communication on the value we add, not on how far we fall short of perfection. Consider providing meaningful context, such as using the validity one would expect from hiring at random as a baseline, along with a reasonable upper bound based on our current understanding of the validity ceiling.
We can do a better job communicating our value
Researchers have recently demonstrated that consumers of effect size information are more persuaded by correlations presented as probabilities and/or frequencies (Brooks et al., 2014; Zhang et al., 2018). Graphical visual aids (e.g., icon arrays, expectancy charts) have also been shown to be useful for communicating efficacy (Garcia-Retamero & Cokely, 2013; Zhang et al., 2018). If you want to present traditional validity coefficients, context can improve the evaluability of otherwise hard-to-evaluate relations (Childers et al., 2022; Zikmund-Fisher, 2019).
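One concrete translation of this kind comes from Dunlap (1994, Psychological Bulletin), who showed that, under bivariate normality, a correlation can be restated as the probability that a person above the mean on the predictor is also above the mean on the criterion. A minimal sketch of that conversion follows; the r = .42 input is a hypothetical example, not a value from the focal article.

```python
from math import asin, pi

def common_language_es(r):
    """Common-language effect size for a correlation (Dunlap, 1994):
    the probability that someone above the mean on the predictor is
    also above the mean on the criterion, assuming bivariate
    normality.  CL = arcsin(r) / pi + .5
    """
    return asin(r) / pi + 0.5

# A hypothetical validity of r = .42 becomes roughly a 64% chance
# that an above-average scorer is also an above-average performer.
print(round(common_language_es(0.42), 2))
```

A statement like “64 out of 100 above-average scorers will be above-average performers, versus 50 out of 100 under random hiring” is exactly the probability-and-frequency framing these authors found more persuasive than a bare coefficient.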
Concluding thoughts
As we have noted elsewhere (Highhouse & Brooks, 2017, 2023), effective hiring requires assessing what is foreseeable at the time of hire, recognizing that the ultimate outcome may be influenced by various things outside of the employer’s control. This means that, considering all the life, workplace, and random factors that may influence performance, the theoretical ceiling on the predictive validity of pre-employment tests is necessarily limited. The view of selection as probabilistic and prone to error is rejected by some I–O psychologists who believe that near-perfect prediction is theoretically possible (e.g., Hollenbeck, 2009; Silzer & Jeanneret, 2011). Einhorn (1986) wisely observed, however, that good prediction requires “accepting error to make less error” (p. 387). It is necessary, therefore, that I–O psychologists be unabashed in advocating for predictors that have proven to be powerful forecasters of future performance. We should focus on competing with those who peddle inferior alternatives to our selection tools rather than with impossible standards of success.