How hybrid is blog data? A comparison between speech, writing and blog data in Swedish

Published online by Cambridge University Press:  24 October 2018

Maria Wiktorsson*
Affiliation:
Maria Wiktorsson, Malmö universitet, Kultur och samhälle K3, 205 06 Malmö, Sweden. maria.wiktorsson@mau.se

Abstract

The new forms of written online communication offer a great resource for researchers interested in language variation and use, but more large-scale systematic research into the nature of the data is needed. For instance, Swedish blog data is often described as more informal and spoken in nature than traditional edited written material, but overall systematic comparisons are lacking. This short communication contributes such comparisons between blog data and spoken and written registers, using measures such as type/token ratios and word frequencies. Type/token ratios of blog texts are found to lie between those for interactive speech and formal edited writing, whereas the distribution of words from different frequency bands is closer to the written material. Comparison of the ten most frequent word forms indicates that blog data resembles formal edited writing from a structural perspective, but also suggests that further studies into features of personal involvement may provide additional insights.
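
For readers unfamiliar with the measures named in the abstract: a type/token ratio is the number of distinct word forms (types) divided by the number of running words (tokens), and a most-frequent-word-forms list is a ranked frequency count. The following is a minimal Python sketch of both, not the study's own procedure. The tokenization rule, the function names and the sample sentence are assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase the text and treat each run of word
    # characters as one token; the study's tokenization may differ.
    return re.findall(r"\w+", text.lower())


def type_token_ratio(tokens):
    # Distinct word forms (types) divided by running words (tokens).
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def top_word_forms(tokens, n=10):
    # The n most frequent word forms with their counts.
    return Counter(tokens).most_common(n)


# Hypothetical sample sentence, not drawn from any corpus in the study.
sample = "jag tycker att det är bra att det är så"
tokens = tokenize(sample)
print(type_token_ratio(tokens))
print(top_word_forms(tokens, 3))
```

Because type/token ratios fall as texts get longer, a comparison across registers is only meaningful when the samples are of comparable size.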

Type
Short Communications
Copyright
Copyright © Nordic Association of Linguistics 2018 
