Assessing the reliability of textbook data in syntax: Adger's Core Syntax[1]

Published online by Cambridge University Press:  08 February 2012

JON SPROUSE*
Affiliation:
Department of Cognitive Sciences, University of California, Irvine
DIOGO ALMEIDA*
Affiliation:
Department of Linguistics and Languages, Michigan State University
* Authors' addresses: (Sprouse) Department of Cognitive Sciences, University of California, 3151 Social Science Plaza A, Irvine, CA 92697-5100, USA. jsprouse@uci.edu
(Almeida) Department of Linguistics and Languages, Michigan State University, A-621 Wells Hall, East Lansing, MI 48824-1027, USA. diogo@msu.edu

Abstract

There has been a consistent pattern of criticism of the reliability of acceptability judgment data in syntax for at least 50 years (e.g., Hill 1961), culminating in several high-profile criticisms within the past ten years (Edelman & Christiansen 2003, Ferreira 2005, Wasow & Arnold 2005, Gibson & Fedorenko 2010, in press). The fundamental claim of these critics is that traditional acceptability judgment collection methods, which tend to be relatively informal compared to methods from experimental psychology, lead to an intolerably high number of false positive results. In this paper we empirically assess this claim by formally testing all 469 (unique, US-English) data points from a popular syntax textbook (Adger 2003) using 440 naïve participants, two judgment tasks (magnitude estimation and yes–no), and three different types of statistical analyses (standard frequentist tests, linear mixed effects models, and Bayes factor analyses). The results suggest that the maximum discrepancy between traditional methods and formal experimental methods is 2%. This suggests that even under the (likely unwarranted) assumption that the discrepant results are all false positives that have found their way into the syntactic literature due to the shortcomings of traditional methods, the minimum replication rate of these 469 data points is 98%. We discuss the implications of these results for questions about the reliability of syntactic data, as well as the practical consequences of these results for the methodological options available to syntacticians.

Type
Research Article
Copyright
Copyright © Cambridge University Press 2012

Footnotes

[1]

This research was supported in part by National Science Foundation grant BCS-0843896 to Jon Sprouse. We would like to thank Carson Schütze, Colin Phillips, James Myers and three anonymous JL referees for helpful comments on earlier drafts. We would also like to thank Andrew Angeles, Melody Chen, and Kevin Proff for their assistance constructing materials. All errors remain our own.

References

Adger, David. 2003. Core syntax: A Minimalist approach. Oxford: Oxford University Press.
Alexopoulou, Theodora & Keller, Frank. 2007. Locality, cyclicity and resumption: At the interface between the grammar and the human sentence processor. Language 83, 110–160.
Baayen, R. Harald. 2007. Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
Baayen, R. Harald, Davidson, Douglas J. & Bates, Douglas M. 2008. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59, 390–412.
Bader, Marcus & Häussler, Jana. 2010. Toward a model of grammaticality judgments. Journal of Linguistics 46, 273–330.
Bard, Ellen G., Robertson, Dan & Sorace, Antonella. 1996. Magnitude estimation of linguistic acceptability. Language 72, 32–68.
Chomsky, Noam. 1955/1957. The logical structure of linguistic theory. New York: Plenum Press.
Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 1986. Barriers. Cambridge, MA: MIT Press.
Cohen, Jacob. 1962. The statistical power of abnormal social psychological research: A review. Journal of Abnormal and Social Psychology 65, 145–153.
Cohen, Jacob. 1988. Statistical power analysis for the behavioral sciences, 2nd edn. Hillsdale, NJ: Erlbaum.
Cohen, Jacob. 1992. Statistical power analysis. Current Directions in Psychological Science 1, 98–101.
Cowart, Wayne. 1997. Experimental syntax: Applying objective methods to sentence judgments. Thousand Oaks, CA: Sage.
Culbertson, Jennifer & Gross, Steven. 2009. Are linguists better subjects? British Journal for the Philosophy of Science 60, 721–736.
Culicover, Peter W. & Jackendoff, Ray. 2010. Quantitative methods alone are not enough: Response to Gibson & Fedorenko. Trends in Cognitive Sciences 14, 234–235.
Dąbrowska, Ewa. 2010. Naïve v. expert intuitions: An empirical study of acceptability judgments. The Linguistic Review 27, 1–23.
den Dikken, Marcel, Bernstein, Judy, Tortora, Christina & Zanuttini, Rafaella. 2007. Data and grammar: Means and individuals. Theoretical Linguistics 33, 335–352.
Edelman, Shimon & Christiansen, Morten. 2003. How seriously should we take Minimalist syntax? Trends in Cognitive Sciences 7, 60–61.
Fanselow, Gisbert. 2007. Carrots – perfect as vegetables, but please not as a main dish. Theoretical Linguistics 33, 353–367.
Featherston, Sam. 2005a. Magnitude estimation and what it can do for your syntax: Some wh-constraints in German. Lingua 115, 1525–1550.
Featherston, Sam. 2005b. Universals and grammaticality: Wh-constraints in German and English. Linguistics 43, 667–711.
Featherston, Sam. 2007. Data in generative grammar: The stick and the carrot. Theoretical Linguistics 33, 269–318.
Featherston, Sam. 2008. Thermometer judgments as linguistic evidence. In Riehl, Claudia Maria & Rothe, Astrid (eds.), Was ist linguistische evidenz?, 69–90. Aachen: Shaker Verlag.
Featherston, Sam. 2009. Relax, lean back, and be a linguist. Zeitschrift für Sprachwissenschaft 28, 127–132.
Ferreira, Fernanda. 2005. Psycholinguistics, formal grammars, and cognitive science. The Linguistic Review 22, 365–380.
Gallistel, Randy. 2009. The importance of proving the null. Psychological Review 116, 439–53.
Gibson, Edward. 1991. A computational theory of human linguistic processing: Memory limitations and processing breakdown. Ph.D. dissertation, Carnegie Mellon University.
Gibson, Edward & Fedorenko, Evelina. 2010. Weak quantitative standards in linguistics research. Trends in Cognitive Sciences 14, 233–234.
Gibson, Edward & Fedorenko, Evelina. In press. The need for quantitative methods in syntax and semantics research. Language and Cognitive Processes, doi:10.1080/01690965.2010.515080. Published online by Taylor & Francis, 4 May 2011.
Gibson, Edward, Piantadosi, Steve & Fedorenko, Kristina. 2011. Using Mechanical Turk to obtain and analyze English acceptability judgments. Language and Linguistics Compass 5, 509–524.
Grewendorf, Günter. 2007. Empirical evidence and theoretical reasoning in generative grammar. Theoretical Linguistics 33, 369–381.
Gross, Steven & Culbertson, Jennifer. 2011. Revisited linguistic intuitions. British Journal for the Philosophy of Science 62, 639–656.
Haider, Hubert. 2007. As a matter of facts – comments on Featherston's sticks and carrots. Theoretical Linguistics 33, 381–395.
Hill, Archibald A. 1961. Grammaticality. Word 17, 1–10.
Hofmeister, Philip, Jaeger, T. Florian, Arnon, Inbal, Sag, Ivan A. & Snider, Neal. In press. The source ambiguity problem: Distinguishing the effects of grammar and processing on acceptability judgments. Language and Cognitive Processes, doi:10.1080/01690965.2011.572401. Published online by Taylor & Francis, 18 October 2011.
Jeffreys, Harold. 1961. Theory of probability. Oxford: Oxford University Press.
Kayne, Richard S. 1983. Connectedness. Linguistic Inquiry 14, 223–249.
Keller, Frank. 2000. Gradience in grammar: Experimental and computational aspects of degrees of grammaticality. Ph.D. dissertation, University of Edinburgh.
Keller, Frank. 2003. A psychophysical law for linguistic judgments. In Alterman, Richard & Kirsh, David (eds.), The 25th Annual Conference of the Cognitive Science Society, 652–657. Boston.
Myers, James. 2009. Syntactic judgment experiments. Language and Linguistics Compass 3, 406–423.
Newmeyer, Frederick J. 2007. Commentary on Sam Featherston, ‘Data in generative grammar: The stick and the carrot’. Theoretical Linguistics 33, 395–399.
Nickerson, Raymond. 2000. Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods 5, 241–301.
Pesetsky, David. 1987. WH-in-situ: Movement and unselective binding. In Reuland, Eric & ter Meulen, Alice G. B. (eds.), The linguistic representation of (in)definiteness, 98–129. Cambridge, MA: MIT Press.
Phillips, Colin. 2009. Should we impeach armchair linguists? In Iwasaki, Shoishi, Hoji, Hajime, Clancy, Patricia & Sohn, Sung-Ock (eds.), Japanese/Korean Linguistics 17. Stanford, CA: CSLI Publications.
Phillips, Colin & Lasnik, Howard. 2003. Linguistics and empirical evidence: Reply to Edelman and Christiansen. Trends in Cognitive Sciences 7, 61–62.
Rouder, Jeffrey N., Speckman, Paul L., Sun, Dongchu, Morey, Richard D. & Iverson, Geoffrey. 2009. Bayesian t-tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review 16, 225–237.
Schütze, Carson. 1996. The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Chicago: University of Chicago Press.
Schütze, Carson & Sprouse, Jon. In press. Judgment data. In Sharma, Devyani & Podesva, Rob (eds.), Research methods in linguistics. Cambridge: Cambridge University Press.
Sorace, Antonella & Keller, Frank. 2005. Gradience in linguistic data. Lingua 115, 1497–1524.
Spencer, N. J. 1973. Differences between linguists and nonlinguists in intuitions of grammaticality-acceptability. Journal of Psycholinguistic Research 2, 83–98.
Sprouse, Jon. 2007a. A program for experimental syntax. Ph.D. dissertation, University of Maryland.
Sprouse, Jon. 2007b. Continuous acceptability, categorical grammaticality, and experimental syntax. Biolinguistics 1, 118–129.
Sprouse, Jon. 2008. The differential sensitivity of acceptability to processing effects. Linguistic Inquiry 39, 686–694.
Sprouse, Jon. 2009. Revisiting satiation: Evidence for an equalization response strategy. Linguistic Inquiry 40, 329–341.
Sprouse, Jon. 2011a. A validation of Amazon Mechanical Turk for the collection of acceptability judgments in linguistic theory. Behavior Research Methods 43, 155–167.
Sprouse, Jon. 2011b. A test of the cognitive assumptions of Magnitude Estimation: Commutativity does not hold for acceptability judgments. Language 87, 274–288.
Sprouse, Jon & Almeida, Diogo. 2012. The role of experimental syntax in an integrated cognitive science of language. In Grohmann, Kleanthes & Boeckx, Cedric (eds.), The Cambridge handbook of biolinguistics. Cambridge: Cambridge University Press.
Sprouse, Jon & Almeida, Diogo. 2011. Power in acceptability judgment experiments and the reliability of data in syntax. Ms., University of California, Irvine & Michigan State University.
Sprouse, Jon, Fukuda, Shin, Ono, Hajime & Kluender, Robert. 2011. Grammatical operations, parsing processes, and the nature of wh-dependencies in English and Japanese. Syntax 14, 179–203.
Sprouse, Jon, Schütze, Carson & Almeida, Diogo. 2011. Assessing the reliability of journal data in syntax: Linguistic Inquiry 2001–2010. Ms., University of California, Irvine; University of California, Los Angeles & Michigan State University.
Sprouse, Jon, Wagers, Matt & Phillips, Colin. 2012. A test of the relation between working memory capacity and island effects. Language 88.1.
Stevens, Stanley Smith. 1957. On the psychophysical law. Psychological Review 64, 153–181.
Wasow, Thomas & Arnold, Jennifer. 2005. Intuitions in linguistic argumentation. Lingua 115, 1481–1496.
Weskott, Thomas & Fanselow, Gisbert. 2011. On the informativity of different measures of linguistic acceptability. Language 87, 249–273.
Wetzels, Ruud, Raaijmakers, Jeroen G. W., Jakab, Emöke & Wagenmakers, Eric-Jan. 2009. How to quantify support for and against the null hypothesis: A flexible WinBUGS implementation of a default Bayesian t-test. Psychonomic Bulletin & Review 16, 752–760.