Time To Change the Bathwater: Correcting Misconceptions About Performance Ratings

C. Allen Gorman; Christopher J. L. Cunningham; Shawn M. Bergman; John P. Meriac

doi:10.1017/iop.2016.17


Published online by Cambridge University Press: 04 July 2016


C. Allen Gorman*: Department of Management and Marketing, East Tennessee State University, and GCG Solutions, Limestone, Tennessee
Christopher J. L. Cunningham: Department of Psychology, The University of Tennessee at Chattanooga, and Logi-Serve, Farmington Hills, Michigan
Shawn M. Bergman: Department of Psychology, Appalachian State University, and B&F Associates, Boone, North Carolina
John P. Meriac: Department of Psychology, University of Missouri–St. Louis
* Correspondence concerning this article should be addressed to C. Allen Gorman, Department of Management and Marketing, East Tennessee State University, 128 Sam Wilson Hall, P.O. Box 70625, Johnson City, TN 37614. E-mail: gormanc@etsu.edu


Extract

Recent commentary has suggested that performance management (PM) is fundamentally “broken,” with negative feelings from managers and employees toward the process at an all-time high (Pulakos, Hanson, Arad, & Moye, 2015; Pulakos & O'Leary, 2011). In response, some high-profile organizations have decided to eliminate performance ratings altogether as a solution to the growing disenchantment. Adler et al. (2016) offer arguments both in support of and against eliminating performance ratings in organizations. Although both sides of the debate in the focal article make some strong arguments, we believe there continue to be misunderstandings, mischaracterizations, and misinformation with respect to some of the measurement issues in PM. We offer the following commentary not to persuade readers to adopt one side over the other but as a call to critically reconsider and reevaluate some of the assumptions underlying measurement issues in PM and to dispel some of the pervasive beliefs throughout the performance rating literature.

Type: Commentaries
Information: Industrial and Organizational Psychology, Volume 9, Issue 2, June 2016, pp. 314–322

DOI: https://doi.org/10.1017/iop.2016.17
Copyright: Copyright © Society for Industrial and Organizational Psychology 2016


References

Adler, S., Campion, M., Colquitt, A., Grubb, A., Murphy, K., Ollander-Krane, R., & Pulakos, E. D. (2016). Getting rid of performance ratings: Genius or folly? A debate. Industrial and Organizational Psychology: Perspectives on Science and Practice, 9(2), 219–252.

Austin, J. T., & Villanova, P. (1992). The criterion problem: 1917–1992. Journal of Applied Psychology, 77(6), 836–874.

Balzer, W. K., & Sulsky, L. M. (1992). Halo and performance appraisal research: A critical examination. Journal of Applied Psychology, 77(6), 975–985.

Bartram, D. (2007). Increasing validity with forced-choice criterion measure formats. International Journal of Selection and Assessment, 15, 263–272.

Benson, P. G., Buckley, M. R., & Hall, S. (1988). The impact of rating scale format on rater accuracy: An evaluation of the mixed standard scale. Journal of Management, 14, 415–423.

Bernardin, H. J., & Pence, E. C. (1980). Effects of rater training: Creating new response sets and decreasing accuracy. Journal of Applied Psychology, 65, 60–66.

Borman, W. C. (1979). Format and training effects on rating accuracy and rater errors. Journal of Applied Psychology, 64(4), 410–421.

Borman, W. C., Buck, D. E., Hanson, M. A., Motowidlo, S. J., Stark, S., & Drasgow, F. (2001). An examination of the comparative reliability, validity, and accuracy of performance ratings made using computerized adaptive rating scales. Journal of Applied Psychology, 86, 965–973.

DeNisi, A. S. (1996). Cognitive approach to performance appraisal: A program of research. New York, NY: Taylor & Francis.

DeNisi, A. S., & Pritchard, R. D. (2006). Performance appraisal, performance management, and improving individual performance: A motivational framework. Management and Organization Review, 2, 253–277.

Fisicaro, S. A. (1988). A reexamination of the relation between halo error and accuracy. Journal of Applied Psychology, 73, 239–244.

Fletcher, C. (2001). Performance appraisal and management: The developing research agenda. Journal of Occupational and Organizational Psychology, 74, 473–487.

Goffin, R. D., Gellatly, I. R., Paunonen, S. V., Jackson, D. N., & Meyer, J. P. (1996). Criterion validation of two approaches to performance appraisal: The behavioral observation scale and the relative percentile method. Journal of Business and Psychology, 11, 23–33.

Gorman, C. A., & Jackson, D. J. R. (2012, April). A generalizability theory approach to understanding frame-of-reference rater training effectiveness. In Gibbons, A. (Chair), Inside assessment centers: New insights about assessors, dimensions, and exercises. Symposium presented at the 27th Annual Conference of the Society for Industrial and Organizational Psychology, San Diego, CA.

Gorman, C. A., Meriac, J. P., Ray, J. L., & Roddy, T. W. (2015). Current trends in rater training: A survey of rater training programs in American organizations. In O'Leary, B. J., Weathington, B. L., Cunningham, C. J. L., & Biderman, M. D. (Eds.), Trends in training (pp. 1–23). Newcastle upon Tyne, UK: Cambridge Scholars.

Hoffman, B. J., Gorman, C. A., Blair, C. A., Meriac, J. P., Overstreet, B. L., & Atchley, E. K. (2012). Evidence for the effectiveness of an alternative multisource performance rating methodology. Personnel Psychology, 65, 531–563.

Hoffman, B., Lance, C. E., Bynum, B., & Gentry, W. A. (2010). Rater source effects are alive and well after all. Personnel Psychology, 63(1), 119–151.

Jawahar, I. M., & Williams, C. R. (1997). Where all the children are above average: The performance appraisal purpose effect. Personnel Psychology, 50(4), 905–926.

Lance, C. E., Hoffman, B. J., Gentry, W. A., & Baranik, L. E. (2008). Rater source factors represent important subcomponents of the criterion construct space, not rater bias. Human Resource Management Review, 18(4), 223–232.

Lance, C. E., & Woehr, D. J. (1989). The validity of performance judgments: Normative accuracy model versus ecological perspectives. In Ray, D. F. (Ed.), Southern Management Association Proceedings (pp. 115–117). Oxford, MS: Southern Management Association.

Landy, F. J. (2010). Performance ratings: Then and now. In Outz, J. L. (Ed.), Adverse impact: Implications for organizational staffing and high stakes selection (pp. 227–248). New York, NY: Routledge.

Landy, F. J., & Farr, J. L. (1980). Performance rating. Psychological Bulletin, 87, 72–107.

LeBreton, J. M., Burgess, J. R. D., Kaiser, R. B., Atchley, E. K., & James, L. R. (2003). The restriction of variance hypothesis and interrater reliability and agreement: Are ratings from multiple sources really dissimilar? Organizational Research Methods, 6, 80–128.

LeBreton, J. M., Scherer, K. T., & James, L. R. (2014). Corrections for criterion reliability in validity generalization: A false prophet in a land of suspended judgment. Industrial and Organizational Psychology: Perspectives on Science and Practice, 7, 478–500.

Levy, P. E. (2010). Industrial/organizational psychology (3rd ed.). New York, NY: Worth.

Lievens, F. (2001). Assessor training strategies and their effects on accuracy, interrater reliability, and discriminant validity. Journal of Applied Psychology, 86, 255–264.

London, M., Smither, J. W., & Adsit, D. J. (1997). Accountability: The Achilles’ heel of multisource feedback. Group & Organization Management, 22(2), 162–184.

Meriac, J. P., Gorman, C. A., & Macan, T. (2015). Seeing the forest but missing the trees: The role of judgments in performance management. Industrial and Organizational Psychology: Perspectives on Science and Practice, 8, 102–108.

Murphy, K. R. (2008). Explaining the weak relationship between job performance and ratings of job performance. Industrial and Organizational Psychology: Perspectives on Research and Practice, 1, 148–160.

Murphy, K. R., & Balzer, W. K. (1989). Rater errors and rating accuracy. Journal of Applied Psychology, 74(4), 619–624.

Murphy, K. R., & Cleveland, J. N. (1995). Understanding performance appraisal: Social, organizational, and goal-based perspectives. Thousand Oaks, CA: Sage.

Murphy, K. R., & DeShon, R. (2000). Progress in psychometrics: Can industrial and organizational psychology catch up? Personnel Psychology, 53(4), 913–924.

Murphy, K. R., Jako, R. A., & Anhalt, R. L. (1993). Nature and consequences of halo error: A critical analysis. Journal of Applied Psychology, 78(2), 218–225.

Nathan, B. R., & Tippins, N. (1990). The consequences of halo “error” in performance ratings: A field study of the moderating effect of halo on test validation results. Journal of Applied Psychology, 75, 290–296.

Noonan, L. E., & Sulsky, L. M. (2001). Impact of frame-of-reference and behavioral observation training on alternative training effectiveness criteria in a Canadian military sample. Human Performance, 14, 3–26.

Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York, NY: McGraw-Hill.

Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (2008). No new terrain: Reliability and construct validity of job performance ratings. Industrial and Organizational Psychology, 1(2), 174–179.

Pulakos, E. D., Hanson, R. M., Arad, S., & Moye, N. (2015). Performance management can be fixed: An on-the-job experiential learning approach for complex behavior change. Industrial and Organizational Psychology: Perspectives on Science and Practice, 8, 51–76.

Pulakos, E. D., & O'Leary, R. S. (2011). Why is performance management broken? Industrial and Organizational Psychology: Perspectives on Science and Practice, 4, 146–164.

Roch, S. G., Sternburgh, A. M., & Caputo, P. M. (2007). Absolute vs. relative performance rating formats: Implications for fairness and organizational justice. International Journal of Selection and Assessment, 15, 302–316.

Roch, S. G., Woehr, D. J., Mishra, V., & Kieszczynska, U. (2011). Rater training revisited: An updated meta-analytic review of frame-of-reference training. Journal of Occupational and Organizational Psychology, 85, 370–395.

Schmidt, F. L., Viswesvaran, C., & Ones, D. S. (2000). Reliability is not validity and validity is not reliability. Personnel Psychology, 53(4), 901–912.

Smith, D. E. (1986). Training programs for performance appraisal: A review. Academy of Management Review, 11, 22–40.

Steelman, L. A., Levy, P. E., & Snell, A. F. (2004). The feedback environment scale: Construct definition, measurement, and validation. Educational and Psychological Measurement, 64(1), 165–184.

Tziner, A. (1984). A fairer examination of rating scales when used for performance appraisal in a real organizational setting. Journal of Organizational Behavior, 5, 103–112.

Viswesvaran, C., Ones, D. S., & Schmidt, F. L. (1996). Comparative analysis of the reliability of job performance ratings. Journal of Applied Psychology, 81(5), 557–574.

Wagner, S. H., & Goffin, R. D. (1997). Differences in accuracy of absolute and comparative performance appraisal methods. Organizational Behavior and Human Decision Processes, 70, 95–103.

Woehr, D. J., & Huffcutt, A. I. (1994). Rater training for performance appraisal: A quantitative review. Journal of Occupational and Organizational Psychology, 67, 189–205.

Woehr, D. J., Sheehan, M. K., & Bennett, W. Jr. (2005). Assessing measurement equivalence across rating sources: A multitrait–multirater approach. Journal of Applied Psychology, 90(3), 592–600.
