Multilingualism and AI: The Regimentation of Language in the Age of Digital Capitalism

Published online by Cambridge University Press: 01 January 2025

Britta Schneider*
Affiliation:
Europa-Universität Viadrina, Germany
*Contact Britta Schneider at Kulturwissenschaftliche Fakultät, Große Scharrnstrasse 59, Frankfurt Oder, Brandenburg 15230, Germany (bschneider@europa-uni.de).

Abstract

This article examines the effects of commercial digital language technologies on the regimentation of language. Language technologies based on the exploitation of large data sets—from machine translation and automatic text generation to digital voice assistants—are a particular form of human-made sign practice in which traditional language norms interact with the affordances of digital devices and the capitalist interests of those who design them. Such sociotechnological practices construct language hierarchies within the realm of commercially based language technology and can shape both dominant discourses about language in society and epistemologies of language in linguistics. The article focuses on interrelationships between digital language technology and metasemiotic interpretations of language that pertain to multilingualism, language variation, and language prestige. It examines languages as discursive constructs and reviews the role of media technology in shaping language ideology, showing that writing and print have had a crucial impact on modern language concepts. It draws on expert discourse and qualitative interviews with programmers and users and examines ideological effects of digital language technology and the potential epistemological reconfigurations of concepts of language that may emerge as a result.

Type
Articles
Copyright
Copyright © 2022 Semiosis Research Centre at Hankuk University of Foreign Studies. All rights reserved.

Footnotes

This article was inspired by talks and discussions held in the Ideologies, Beliefs, Attitudes working group within the European Cooperation in Science and Technology (COST) Language in the Human-Machine Era project. I want to thank all members and participants for sharing their thoughts and ideas. I also want to thank the editor and an anonymous reviewer for their constructive feedback. All remaining errors are my own.
