8 - Comparing Bayesian and Frequentist Models of Language Variation

The Case of Help + (to-)Infinitive

from Part III - Perspectives on Multifactorial Methods

Published online by Cambridge University Press:  06 May 2022

Ole Schützler
Affiliation:
Universität Leipzig
Julia Schlüter
Affiliation:
Universität Bamberg

Summary

This chapter compares standard frequentist and more recent Bayesian approaches to logistic regression analyses. Starting out from a multifactorial case study of the verb help complemented by either the bare infinitive or the to-infinitive, the key components and the main conceptual differences of frequentist and Bayesian inference are discussed. Conceptually, the Bayesian rationale of directly testing hypotheses on the effects of multiple factors on an outcome variable is argued to be preferable and more sensitive than the conventional approach of testing null hypotheses. On the practical side, Bayesian statistics enables the researcher to recycle and integrate the results of previous analyses based on different datasets as informative priors, which can help improve and stabilize statistical modelling. Recourse to prior research can thus produce synergies and reduce data preparation expense. In cases of data sparsity, it can by the same token enable researchers to analyse small samples. Bayesian methods are thus put forward as powerful tools for overcoming the limitations of isolated corpus studies and for promoting synergies between data collected by individual researchers.
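The stabilizing effect of an informative prior on a small sample can be illustrated with a minimal sketch. It is not the chapter's own analysis: it assumes an intercept-only logistic model (a single log-odds of choosing the to-infinitive), a normal prior on that log-odds, and a simple grid approximation of the posterior. The token counts and prior settings are hypothetical values chosen purely for illustration, and the function names are invented for this example.

```python
import math


def log_likelihood(beta, successes, trials):
    """Binomial log-likelihood of an intercept-only logistic model."""
    p = 1.0 / (1.0 + math.exp(-beta))
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def log_normal_prior(beta, mean, sd):
    """Log density of a normal prior on the log-odds, up to a constant."""
    return -0.5 * ((beta - mean) / sd) ** 2


def posterior_mean(successes, trials, prior_mean, prior_sd):
    """Posterior mean of the log-odds via grid approximation."""
    grid = [i / 100.0 for i in range(-600, 601)]  # log-odds from -6 to 6
    log_post = [
        log_likelihood(b, successes, trials)
        + log_normal_prior(b, prior_mean, prior_sd)
        for b in grid
    ]
    peak = max(log_post)  # subtract the peak for numerical stability
    weights = [math.exp(lp - peak) for lp in log_post]
    return sum(b * w for b, w in zip(grid, weights)) / sum(weights)


# Hypothetical small sample: 2 to-infinitives among 10 tokens of "help".
to_inf, tokens = 2, 10

# Very wide prior: the estimate is driven almost entirely by the 10 tokens.
weak = posterior_mean(to_inf, tokens, prior_mean=0.0, prior_sd=10.0)

# Hypothetical informative prior, standing in for the result of an earlier
# study on a different dataset (log-odds around -0.5, fairly certain).
informative = posterior_mean(to_inf, tokens, prior_mean=-0.5, prior_sd=0.5)

print(f"weak prior:        {weak:.2f}")
print(f"informative prior: {informative:.2f}")
```

Under these assumptions the informative prior pulls the small-sample estimate towards the earlier result, which is the sense in which prior research can stabilize a model fitted to sparse data. A full multifactorial analysis would place priors on several coefficients and would normally use dedicated sampling software rather than a grid.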

Type: Chapter
Information: Data and Methods in Corpus Linguistics: Comparative Approaches, pp. 224–258
Publisher: Cambridge University Press
Print publication year: 2022


References

Further Reading

Gelman, Andrew, and Hill, Jennifer. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.
Gelman, Andrew, Jakulin, Aleks, Pittau, Maria Grazia and Su, Yu-Sung. 2008. A Weakly Informative Prior Distribution for Logistic and Other Regression Models. The Annals of Applied Statistics 2(4). 1360–83. https://doi.org/10.1214/08-AOAS191.
Kruschke, John K. 2011a. Doing Bayesian Data Analysis: A Tutorial with R and BUGS. Oxford: Elsevier.
McElreath, Richard. 2016. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Boca Raton, FL: CRC Press.
Nicenboim, Bruno, and Vasishth, Shravan. 2016. Statistical Methods for Linguistic Research: Foundational Ideas: Part II. Language and Linguistics Compass 10. 591–613. https://doi.org/10.1111/lnc3.12207.
van de Schoot, Rens, and Depaoli, Sarah. 2014. Bayesian Analyses: Where to Start and What to Report. The European Health Psychologist 16(2). 75–84.
van de Schoot, Rens, David Kaplan, Jaap J. Denissen, Jens B. Asendorpf, Franz J. Neyer and Marcel A. G. van Aken. 2014. A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research. Child Development 85. 842–60. https://doi.org/10.1111/cdev.12169.

