Book contents
- Frontmatter
- Contents
- Preface
- Notes on codes and abbreviations
- 1 Introduction
- 2 Data collection
- 3 The sociolinguistic interview
- 4 Data, data and more data
- 5 The linguistic variable
- 6 Formulating hypotheses/operationalising claims
- 7 The variable rule program: theory and practice
- 8 The how-to's of a variationist analysis
- 9 Distributional analysis
- 10 Multivariate analysis
- 11 Interpreting your results
- 12 Finding the story
- Glossary of terms
- References
- Index
4 - Data, data and more data
Published online by Cambridge University Press: 05 June 2012
Summary
What do you do with your data once you have collected it? This chapter will elucidate the procedures for handling a large body of natural speech.
Chapters 1 to 3 have focused on methods for collecting optimal data for analysis. Now it is time to learn what to do with data once you have it. This chapter focuses on data handling and, in particular, techniques for representing speech data in writing.
When faced with a collection of dozens upon dozens of audio-tapes, minidisks or sound files, what do you do next? How can you make the invaluable data contained within maximally accessible and useful?
In this chapter, I focus on tried-and-true procedures from my own experience. I build on the foundations of earlier corpus-building projects (Poplack 1989, Poplack and Tagliamonte 1991). However, I also focus on data arising from fieldwork conducted in the British Isles between 1995 and 2001 (e.g. Tagliamonte 1998, Tagliamonte et al. 2005).
THE CORPUS
The components of a corpus, at least in my own research, are listed in (1):
(1) Components of a corpus
a. recording media, audio-tapes (analogue, digital) or other
b. interview reports (hard copies) and signed consent forms
c. transcription files (ASCII, Word, txt)
d. a transcription protocol (hard copy and soft)
e. a database of information (FileMaker, Excel, etc.)
f. analysis files (Goldvarb files, token, cel, cnd and res)
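The inventory in (1) can also be tracked programmatically. The following is only a minimal sketch, not a description of how any of the corpora discussed here were managed: the component labels, the `InterviewRecord` class, the `incomplete` helper and the speaker codes are all hypothetical names introduced for illustration. It assumes the transcription protocol in (1d) is held once for the whole corpus and so is not tracked per interview.

```python
from dataclasses import dataclass, field

# Per-interview components, loosely following list (1).
# These labels are hypothetical, chosen for this sketch only.
COMPONENTS = (
    "recording",
    "interview_report",
    "consent_form",
    "transcription",
    "database_entry",
    "analysis_files",
)


@dataclass
class InterviewRecord:
    """One interview and the components currently on file for it."""
    speaker_code: str
    held: set = field(default_factory=set)

    def missing(self) -> list:
        """Return the components not yet on file, in COMPONENTS order."""
        return [c for c in COMPONENTS if c not in self.held]


def incomplete(records):
    """Map each speaker code that has gaps to its missing components."""
    return {r.speaker_code: r.missing() for r in records if r.missing()}


# Hypothetical example data: one complete record, one partial record.
records = [
    InterviewRecord("spk001", set(COMPONENTS)),
    InterviewRecord("spk002", {"recording", "consent_form"}),
]
gaps = incomplete(records)
print(gaps)
```

A check of this kind simply makes visible which interviews still lack a report, a transcription or analysis files before analysis begins.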
The basic substance of a language corpus is the data. Most of my corpora have been collected on audio-tapes and represent one to two hours of conversation between a single interviewer and an informant.
- Type: Chapter
- Analysing Sociolinguistic Variation, pp. 50-69
- Publisher: Cambridge University Press
- Print publication year: 2006