Published online by Cambridge University Press: 08 June 2005
Comparing languages and linguistic data is essential to progress in understanding the nature of spoken languages: we understand phenomena better through comparison and contrast. This paper discusses problems that arise when a spoken language corpus transcribed and formatted according to one standard is transferred into the standard and format of another corpus. These problems stem both from differences between the standards of the corpora and from human errors that reduce the reliability of the transcriptions. Although the discussion is based on transfer and transliteration between two specific corpora (the Danish BySoc, BySociolingvistisk Korpus, and the Swedish GSLC, Göteborg Spoken Language Corpus), we believe that it documents and highlights problems of a general kind, which must be faced whenever spoken language corpora of different formats are to be compared.