Constrained language use in Finnish: A corpus-driven approach

Ilmari Ivaska; Silvia Bernardini

doi:10.1017/S0332586520000013

Published online by Cambridge University Press: 13 April 2020

Ilmari Ivaska*: Department of Finnish and Finno-Ugric Languages, FI-20014, University of Turku, Finland
Silvia Bernardini*: Department of Interpreting and Translation, University of Bologna, Corso della Repubblica 136, 47121 Forlì (FC), Italy
*Emails for correspondence: ilmari.ivaska@utu.fi and silvia.bernardini@unibo.it

Abstract

It has been suggested that second languages and translated languages are constrained by an interplay of several linguistic systems. This paper reports on a data-driven quantitative study on constrained Finnish. We detect linguistic phenomena that distinguish constrained from non-constrained Finnish across constrained varieties, first/source languages, and registers. Implementing a two-phase method, we first detect key quantitative differences of syntactically defined POS bigrams between each variety-, language-pair- and register-specific constrained dataset and its non-constrained counterpart, using Boruta feature selection. We then use the results as variables in a Multi-dimensional Analysis. The results show that both nominal complexity and verbal/clausal complexity distinguish constrained from non-constrained Finnish. These differences interact with both type of constraint and register: the constrained varieties are less sensitive to register differences, and this tendency is more pronounced in learner Finnish than in translated Finnish. Leaving out any of these variables from the analysis would blur our view of this multi-faceted phenomenon.
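
The first step of the method, counting POS bigrams per text, can be illustrated with a minimal sketch. It assumes that "syntactically defined" means dependent-head pairs in a dependency parse and that the tags follow Universal Dependencies; both are assumptions, since the abstract does not spell out the definition. The function names and the toy sentence are hypothetical, and the Boruta selection and Multi-dimensional Analysis steps are not reproduced here.

```python
from collections import Counter

# Hypothetical toy input: each token is (id, UPOS tag, head id); head 0 marks the root.
# Treating a bigram as a (dependent POS, head POS) pair is an assumption,
# not the authors' confirmed definition.

def syntactic_pos_bigrams(sentence):
    """Count (dependent POS, head POS) pairs for one parsed sentence."""
    pos_by_id = {tok_id: pos for tok_id, pos, _ in sentence}
    counts = Counter()
    for _, pos, head in sentence:
        head_pos = "ROOT" if head == 0 else pos_by_id[head]
        counts[(pos, head_pos)] += 1
    return counts

def relative_frequencies(sentences, per=1000):
    """Aggregate bigram counts over a text and normalise per `per` tokens."""
    total = Counter()
    n_tokens = 0
    for sentence in sentences:
        total.update(syntactic_pos_bigrams(sentence))
        n_tokens += len(sentence)
    if n_tokens == 0:
        return {}
    return {bigram: count * per / n_tokens for bigram, count in total.items()}

# Toy parse of "Koira juoksee nopeasti" ('The dog runs fast').
example = [[(1, "NOUN", 2), (2, "VERB", 0), (3, "ADV", 2)]]
freqs = relative_frequencies(example)
```

Normalised frequencies of this kind, computed for each text, could then serve as the input features for a feature-selection step and a subsequent factor-analytic comparison of constrained and non-constrained datasets.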

Keywords

constrained language use; corpus-based variationist linguistics; data-driven approach; Finnish; keyness analysis; Multi-dimensional Analysis; second language acquisition; translation studies

Information

Type: Research Article
Information: Nordic Journal of Linguistics, Volume 43, Issue 1, May 2020, pp. 33–57

DOI: https://doi.org/10.1017/S0332586520000013
Copyright: © Cambridge University Press 2020

