Comparing Approaches to (Sub-)Register Variation

doi:10.1017/9781108589314.004

3 - Comparing Approaches to (Sub-)Register Variation

The ‘Press Editorials’ Sections in the British, Canadian and Jamaican Components of ICE

from Part II - Selection, Calibration and Preparation of Corpus Data

Published online by Cambridge University Press: 06 May 2022

Fabian Vetter

Edited by Ole Schützler and Julia Schlüter

Ole Schützler, Universität Leipzig
Julia Schlüter, Universität Bamberg

Summary

Two methods are applied to detect differences between corpus (sub-)registers, exemplified by the ‘Press Editorials’ sections in the British, Canadian and Jamaican components of the International Corpus of English (ICE). By design, these methods target differences between varieties that are represented by putatively comparable corpus material, but it turns out that many of the observed differences can in fact be attributed to the different sampling strategies applied by corpus compilers. In the example at hand, contrasts can be traced back to the division into institutional and personal editorials. This finding gives rise to a call for more fine-grained sampling schemes, richer metadata (e.g. on the situational characteristics of the language samples included) and better documentation. As for the methods chosen, the author demonstrates that corpus-driven profiling based either on POS monograms or on higher-level multi-dimensional analysis performs reasonably well, with only small differences in robustness and computational expense.
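
As a rough illustration of what profiling on POS monograms can involve, the sketch below turns each text into a vector of relative frequencies of single POS tags and compares texts by the distance between those vectors. It is not the author's implementation: the tag labels, the hand-made tag sequences, the function names (`pos_profile`, `euclidean`, `nearest`) and the choice of Euclidean distance are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def pos_profile(tags, tagset):
    """Relative frequency of each POS tag (monogram) in one text."""
    counts = Counter(tags)
    total = len(tags)
    return [counts[t] / total if total else 0.0 for t in tagset]


def euclidean(p, q):
    """Euclidean distance between two equally long profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical tag sequences standing in for POS-tagged editorials.
# Real input would come from a tagger run over the corpus files.
texts = {
    "institutional_1": ["DET", "NOUN", "VERB", "DET", "NOUN", "ADP", "NOUN"],
    "institutional_2": ["DET", "NOUN", "VERB", "ADP", "DET", "NOUN", "ADJ", "NOUN"],
    "personal_1": ["PRON", "VERB", "PRON", "VERB", "ADV", "PRON", "VERB"],
}

# One shared, ordered tagset so that all profiles have the same dimensions.
tagset = sorted({t for tags in texts.values() for t in tags})
profiles = {name: pos_profile(tags, tagset) for name, tags in texts.items()}


def nearest(name):
    """Name of the text whose profile is closest to that of `name`."""
    others = (n for n in profiles if n != name)
    return min(others, key=lambda n: euclidean(profiles[name], profiles[n]))


print(nearest("institutional_1"))
```

In a fuller analysis, profiles of this kind would presumably be passed to a clustering or classification procedure rather than compared pairwise; the nearest-neighbour lookup here only shows that texts with similar tag distributions end up close together.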

Keywords

ICE; comparability; register variation; sampling; text clustering; parts-of-speech; linguistic tagging; situational characteristics

Information

Type: Chapter
Information: Data and Methods in Corpus Linguistics: Comparative Approaches, pp. 75–100

DOI: https://doi.org/10.1017/9781108589314.004

Publisher: Cambridge University Press

Print publication year: 2022
