Comparing Generalised Linear Mixed-Effects Models, Generalised Linear Mixed-Effects Model Trees and Random Forests

doi:10.1017/9781108589314.007

6 - Comparing Generalised Linear Mixed-Effects Models, Generalised Linear Mixed-Effects Model Trees and Random Forests

Filled and Unfilled Pauses in Varieties of English

from Part III - Perspectives on Multifactorial Methods

Published online by Cambridge University Press: 06 May 2022

Tobias Bernaisch

Edited by Ole Schützler (Universität Leipzig) and Julia Schlüter (Universität Bamberg)


Summary

In a comparison of generalised linear mixed-effects models, generalised linear mixed-effects model trees and random forests, the author applies the three methods to a binary variable from the field of interactional pragmatics: the choice between filled and unfilled pauses across varieties of English represented by components of the International Corpus of English. The steps and decisions involved in the analyses are demonstrated on a large number of examples annotated for linguistic and extralinguistic factors. Though different in essence, the three resulting models share central trends. A more fine-grained evaluation of results and interpretations shows, however, that the three approaches differ in how systematically they handle multiple observations from the same source: only the mixed-effects models explicitly account for and systematically partial out the relatedness of data points contributed by the same speaker. The approaches also differ substantially in how they balance researcher involvement and control of the outcome. The choice of modelling approach can thus lead to notably different perspectives on an identical set of data and variables.

Keywords

classification accuracy; mixed-effects model; monofactorial tests; random effects; random forest; classification tree

Information

Type: Chapter
Information: Data and Methods in Corpus Linguistics: Comparative Approaches, pp. 163–193

DOI: https://doi.org/10.1017/9781108589314.007

Publisher: Cambridge University Press

Print publication year: 2022



