A Course in the Theory of Mental Tests

Harold Gulliksen

doi:10.1007/BF02288706

Published online by Cambridge University Press: 01 January 2025

Harold Gulliksen*
Affiliation: Psychology Department, The University of Chicago

Abstract

An outline for a course in test theory is presented, together with a list of assignments, problems, and a bibliography. The course has been given in the Psychology Department of the University of Chicago. The material is presented in outline form at the present time because of the increased need for training in test theory due to the increase in the use of psychological tests for classification of military personnel, and because much of the material in such a course must be selected from a wide array of articles in the literature. This material is presented in order that an organized body of material for instructional purposes may be readily available to those interested.

Type: Original Paper
Information: Psychometrika, Volume 8, Issue 4, December 1943, pp. 223–245

DOI: https://doi.org/10.1007/BF02288706
Copyright: Copyright © 1943 The Psychometric Society

Footnotes

On leave from the University of Chicago for a government research project at the College Entrance Examination Board, Princeton, New Jersey.

References

Adkins, Dorothy C. A comparative study of methods of selecting items. Dissertation on file in library of Ohio State University. Abstract, Psychology Library.

Adkins, Dorothy C., and Toops, Herbert A., Simplified formulas for item selection and construction. Psychometrika, 1937, 2, 165–171.

Ayres, Leonard P., A scale for measuring the quality of handwriting of school children, New York: Publication on Measurement in Education, Division of Education, 1911.

Babitz, Milton, and Keys, Noel. A method for approximating the average intercorrelation coefficient by correlating the parts with the sum of the parts. Psychometrika, 1940, 5, 283–288.

Board of Examinations, The University of Chicago. Manual of Examination Methods Second Edition, (pp. 177–177). Chicago: Univ. Chicago Bookstore, 1937.

Boring, E. G. Mathematical vs. scientific significance. Psychol. Bull., 1919, 16, 335–338.

Boring, E. G. The logic of the normal law of error in mental measurement. Amer. J. Psychol., 1920, 31, 1–33.

Bradford, Leland P. The effect of practice upon standard errors of estimate. Psychol. Monogr., 1940, 52(3), 56–71.

Brown, William, and Thomson, Godfrey. 1921. Essentials of mental measurement. Cambridge Univ. Press. Pp. viii + 216.

Buros, Oscar K. Educational, Psychological, and Personality Tests of 1933, 1934, and 1935 (pp. 83–83). New Brunswick, New Jersey: School of Education, Rutgers University, 1936.

Buros, Oscar K. Educational, Psychological, and Personality Tests of 1936 (pp. 141–141). New Brunswick, New Jersey: School of Education, Rutgers University, 1937.

Buros, Oscar K. The 1938 Mental Measurements Yearbook (pp. xiv–xiv). New Brunswick, New Jersey: School of Education, Rutgers University, 1938.

Buros, Oscar K. The 1940 Mental Measurements Yearbook (pp. xxi–xxi). New Brunswick, New Jersey: School of Education, Rutgers University, 1941.

Burt, Cyril. In Hartog, P. J., and Rhodes, E. C. (Eds.), The Marks of Examiners (pp. xix–xix). London: Macmillan and Company, 1936.

Douglass, H. R. Some observations and data on certain methods of measuring the predictive significance of the Pearson product-moment coefficient of correlation. J. educ. Psychol., 1934, 25, 225–232.

Dressel, Paul L. Some remarks on the Kuder-Richardson reliability coefficient. Psychometrika, 1940, 5, 305–310.

Dunlap, Jack W. Note on the computation of bi-serial correlation in item evaluation. Psychometrika, 1936, 1, 51–58.

Dunlap, Jack W. Nomograph for computing bi-serial correlations. Psychometrika, 1939, 1, 59–60.

Dunlap, Jack and Kurtz, A. K. Handbook of statistical nomographs and formulas (pp. vii–vii). New York: World Book Company, 1932.

Edgerton, H. A. and Toops, H. A. A formula for finding the average inter-correlation coefficient for unranked raw scores without solving any of the individual intercorrelations. J. educ. Psychol., 1928, 19, 131–138.

Edgerton, H. A. and Kolbe, Laverne E. The method of minimum variation for the combination of criteria. Psychometrika, 1936, 1, 183–187.

Englehart, Max D. Unique types of achievement test exercises. Psychometrika, 1942, 7, 103–115.

Flanagan, John C. A short method for selecting the best combination of test items for a particular purpose. Psychol. Bull., 1936, 33, 603–604.

Flanagan, John C. Scaled scores, New York: Cooperative Test Service, 1939.

Freeman, Frank N. A critique of the Yerkes-Bridges-Hardwick comparison of the Binet-Simon and point scales. Psychol. Rev., 1917, 24, 484–484.

Freeman, Frank N. Mental tests: Their history, principles, and applications, Cambridge, Mass.: The Riverside Press, Rev., 1939.

Frisch, Ragnar. 1934. Statistical confluence analysis by means of complete regression systems. Oslo.

Garrett, Henry E. The discriminant function and its use in psychology. Psychometrika, 1943, 8, 65–79.

Guilford, J. P. The determination of item difficulty when chance success is a factor. Psychometrika, 1936, 1, 259–264.

Guilford, J. P. Psychometric methods, New York: McGraw-Hill, 1936.

Guilford, J. P. The psychophysics of mental test difficulty. Psychometrika, 1937, 2, 121–133.

Gulliksen, Harold. The content reliability of a test. Psychometrika, 1936, 1, 189–194.

Hawkes, H. E., Lindquist, E. F., and Mann, C. R. The construction and use of achievement examinations, Boston: Houghton-Mifflin Company, 1936.

Hildreth, G. H. A bibliography of mental tests and rating scales 2nd Ed., (pp. xxiv–xxiv). New York: The Psychological Corporation, 1939.

Holmes, Henry W. A descriptive bibliography of measurement in elementary subjects, Cambridge, Mass.: Harvard Univ. Press, 1917.

Holzinger, Karl J., and Clayton, Blythe. Further experiments in the application of Spearman's prophecy formula. J. educ. Psychol., 1925, 16, 289–299.

Horst, Paul. Item selection by the method of successive residuals. J. exper. Educ., 1934, 2, 254–263.

Horst, Paul. Increasing the efficiency of selection tests. The Personnel Journal, 1934, 12, 254–259.

Horst, Paul. Obtaining a composite measure from different measures of the same attributes. Psychometrika, 1936, 1, 53–60.

Horst, Paul. Item selection by means of a maximizing function. Psychometrika, 1936, 1, 229–244.

Hull, Clark L. Aptitude testing (pp. xiv–xiv). New York: World Book Company, 1928.

Kelley, Truman L. Interpretation of educational measurements, New York: World Book Company, 1927.

Kelley, Truman L. Statistical methods (pp. xi–xi). New York: Macmillan Company, 1924.

Kelley, Truman L. The principles and techniques of mental measurement. Amer. J. Psychol., 1923, 34, 408–432.

Kuder, G. F. Nomograph for point biserial r, biserial r, and fourfold correlations. Psychometrika, 1937, 2, 135–138.

Kuder, G. F., and Richardson, M. W. The theory of the estimation of test reliability. Psychometrika, 1937, 2, 151–160.

Lee, J. M., and Symonds, P. M. New type or objective tests: a summary of investigations (Oct. 1931-Oct. 1933). J. educ. Psychol., 1934, 25, 161–184.

Lentz, T. F., Hirshstein, Bertha, and Finch, J. H. Evaluation of methods of evaluating test items. J. educ. Psychol., 1932, 23, 344–350.

Lindquist, E. F. Statistical analysis in educational research, New York: Houghton Mifflin Co., 1940.

Long, John A., Sandiford, Peter, et al. 1935. The validation of test items. Bull. Dept. Educ. Res., Ontario Coll. Educ., No. 3, 126 pages.

McCall, W. A. How to measure in education (pp. xii–xii). New York: The Macmillan Company, 1922.

Merrill, Walter W., Jr. Sampling theory in item analysis. Psychometrika, 1937, 2, 215–224.

Monroe, Paul (Editor). Conference on examinations at Dinard, France, Sept. 16–19, 1938 (pp. xiii–xiii). New York: Bureau of Publications, Teachers College, Columbia University, 1939.

Monroe, Walter S. The theory of educational measurements, New York: Houghton-Mifflin Company, 1923.

Monroe, Walter S. A note on efficiency of prediction. J. educ. Psychol., 1934, 25, 547–548.

Moore, Clarence Carl. The rights-minus wrongs method of correcting chance factors in the T-F examination. J. genet. Psychol., 1940, 57, 317–326.

Monroe, Walter S. and Englehart, Max D. Scientific study of educational problems, New York: The Macmillan Company, 1936.

Mosier, Charles I. A note on item analysis and the criterion of internal consistency. Psychometrika, 1936, 1, 275–282.

Mosier, Charles I. Psychophysics and mental test theory: fundamental postulates and elementary theorems. Psychol. Rev., 1940, 47, 355–366.

National Society for the Study of Education. 17th Yearbook, Part II, Bloomington, Ill.: Public School Publishing Company, 1918.

Orleans, Jacob S. Measurement in education (pp. xvi–xvi). New York: Thomas Nelson and Sons, 1937.

Otis, A. S. The method for finding the correspondence between scores in two tests. J. educ. Psychol., 1922, 13, 529–545.

Otis, A. S. A method of inferring the change in a coefficient of correlation resulting from a change in the heterogeneity of the group. J. educ. Psychol., 1922, 13, 293–294.

Otis, A. S., and Knollin, H. E. The reliability of the Binet scale and of pedagogical scales. J. educ. Research, 1921, 4, 121–142.

Richardson, M. W. Abac for computing tetrachoric coefficients in item analysis, Chicago: Univ. Chicago Board of Examination, 1935.

Richardson, M. W. Notes on the rationale of item analysis. Psychometrika, 1936, 1, 69–76.

Richardson, M. W. The relation of difficulty to the differential validity of a test. Psychometrika, 1936, 1, 33–49.

Richardson, M. W., and Adkins, Dorothy C. A rapid method of selecting test items. J. educ. Psychol., 1938, 29, 547–552.

Richardson, M. W., and Kuder, G. F. The computation of test reliability by the method of rational equivalence. J. educ. Psychol., 1939, 30, 681–687.

Ruch, G. M. and Stoddard, G. P. Test and measurements in high-school instruction, New York: World Book Company, 1927.

Ruch, G. M., Ackerson, L., and Jackson, J. P. An empirical study of the Spearman-Brown formula as applied to educational test material. J. educ. Psychol., 1926, 17, 309–313.

Ruger, Georgie J. Bibliography of psychological tests, New York: Bureau of Educational Measurements, 1918.

Rulon, Phillip J. A simplified procedure for determining the reliability of a test by split halves. Harvard educ. Rev., 1939, 9, 99–103.

Segel, David. A note of an error made in investigations of homogeneous grouping. J. educ. Psychol., 1933, 24, 64–66.

Smith, B. O. Logical aspects of educational measurement, New York: Columbia Univ. Press, 1938.

Spearman, Charles. Correlation from faulty data. Brit. J. Psychol., 1910, 3, 271–295.

Spearman, Charles. Demonstration of formulae for true measurement of correlation. Amer. J. Psychol., 1907, 18, 161–169.

Stalnaker, J. M. Weighting questions in the essay-type examination. J. educ. Psychol., 1938, 29, 481–490.

Stalnaker, J. M. and Richardson, M. W. A note concerning the combination of test scores. J. gen. Psychol., 1933, 8, 460–463.

Starch, D. and Elliot, E. C. 1912. Reliability of grading high-school work in English. School Review, September, 442–457.

Starch, D. and Elliot, E. C. 1913. Reliability of grading high-school work in mathematics. School Review, April, 254–259.

Starch, D. and Elliot, E. C. 1913. Reliability of grading high-school work in history. School Review, December, 676–681.

Stern, William. The psychological methods of testing intelligence, Baltimore: Warwick and York, 1914.

Swineford, F. Validity of test items. J. educ. Psychol., 1936, 27, 68–78.

Thorndike, E. L. On finding equivalent scores in tests of intelligence. J. appl. Psychol., 1922, 6, 29–33.

Thurstone, L. L. A method for scoring tests. Psychol. Bull., 1919, 16, 235–240.

Thurstone, L. L. A method of scaling psychological and educational tests. J. educ. Psychol., 1925, 16, 433–451.

Thurstone, L. L. The mental age concept. Psychol. Rev., 1926, 33, 268–278.

Thurstone, L. L. The absolute zero in intelligence measurement. Psychol. Rev., 1928, 35, 175–197.

Thurstone, L. L. The reliability and validity of tests, Ann Arbor, Mich.: Edwards Brothers. Planographed, 1931.

Thurstone, L. L. Fundamentals of statistics (pp. xvi–xvi). New York: The Macmillan Company, 1924.

Thurstone, T. G. The difficulty of a test and its diagnostic value. J. educ. Psychol., 1932, 23, 335–343.

Toops, H. A., and Symonds, P. M. What shall we expect of the A. Q.? J. educ. Psychol., 1922, 13, 513–528.

Travers, R. M. W. The use of a discriminant function in the treatment of psychological group differences. Psychometrika, 1939, 4, 25–32.

Walker, Helen M. Studies in the history of statistical method (pp. 186–186). Baltimore: The Williams and Wilkins Company, 1929.

Whipple, Guy M. Manual of mental and physical tests, Baltimore: Warwick and York, 1910.

Wilks, S. S. Weighting systems for linear functions of correlated variables when there is no dependent variable. Psychometrika, 1938, 3, 23–40.

Yerkes, R. M., Bridges, J. W., and Hardwick, R. S. A point scale for measuring mental ability, Baltimore: Warwick and York, 1915.
