Book contents
- Frontmatter
- Contents
- List of figures
- Acknowledgements
- Abbreviations
- Technical glossary
- Introduction : the challenge of reintegration in political history
- 1 On method : text-mining, corpora and the historical study of language
- 2 The impact of reform : the general elections of 1880 and 1885
- 3 The impact of home rule : the general elections of 1886 and 1892
- 4 The impact of imperialism : the general elections of 1895 and 1900
- 5 The impact of New Liberalism : the general elections of 1906 and 1910
- Conclusion: who won the war of words?
- Appendix 1 Technical and methodological
- Appendix 2 Statistical
- Bibliography
- Index
Appendix 1 - Technical and methodological
Published online by Cambridge University Press: 26 April 2020
Summary
Three corpora form the primary engine for the text-mining used in this book. The first is the ‘East Anglian corpus’, which is composed of election-per-election subsamples of constituency Conservative and Liberal speech for the years 1880 to 1910. It contains approximately a million words. The speeches were taken from the Norfolk and Suffolk press, and each subsample contains equal word counts for each party and for each of the region's sixteen constituencies. The second is the ‘National Speaker corpus’. This is composed of all the extra-parliamentary orations of frontbench politicians (i.e. the leading lights of the main parties, who often held cabinet or shadow cabinet level positions) delivered during election campaigns and reported in The Times. It is similarly subdivided by party and general election year, and contains approximately 1.5 million words. The third is the ‘Constituencies corpus’, which is approximately 1.8 million words in size. It contains subsamples of approximately 75,000–100,000 words per party, per election. Speeches in this corpus were selected according to the digital availability of newspapers through the British Newspaper Archive.
The book also makes use of several special supplementary corpora. The most important are the ‘Liberal Unionist corpus’ and the ‘Labour corpus’. Others are occasionally employed for in-depth analyses of specific topics: for example, Chapter 2 uses two East Anglian corpora for 1835 and 1874, and Chapter 4 employs a ‘Pro-Boer corpus’. The Liberal Unionist and Labour corpora are introduced below, but other special supplementary corpora are introduced individually in the main text when they are utilised. In all corpora (main and supplementary) the numerical results generated from each subsection – for example East Anglian Conservative speeches in 1895, national Liberal speeches in 1900, constituencies Conservatives in 1885 or Liberal Unionists in 1886 – are weighted to a common ratio of 50,000 words per election subsample to enable direct like-for-like comparisons.
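The weighting to 50,000 words per subsample can be illustrated with a minimal sketch. It assumes the weighting is a simple proportional rescaling of raw counts, which the text implies but does not spell out; the function name and the example figures are hypothetical and are not results from the book.

```python
# Hypothetical illustration of weighting raw term counts to a common
# base of 50,000 words per election subsample. Assumes a simple
# proportional rescaling; the figures below are invented.

TARGET_WORDS = 50_000  # common base stated in the text


def weight_count(raw_count: int, subsample_words: int,
                 target_words: int = TARGET_WORDS) -> float:
    """Rescale a raw count to its equivalent in a target-sized subsample."""
    if subsample_words <= 0:
        raise ValueError("subsample_words must be positive")
    return raw_count * target_words / subsample_words


# A term found 120 times in a 75,000-word subsample and 90 times in a
# 100,000-word subsample (both invented figures):
weighted_a = weight_count(120, 75_000)   # 80.0 per 50,000 words
weighted_b = weight_count(90, 100_000)   # 45.0 per 50,000 words
print(weighted_a, weighted_b)
```

Once rescaled in this way, counts from subsamples of different sizes can be compared directly, which is the purpose of the weighting described above.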
All corpora are machine-readable text files. They were interrogated primarily with AntConc (a free, simple and powerful corpus analysis program), but other software, such as MALLET and Google Ngram, was occasionally employed.
Anatomy: East Anglian corpus
The East Anglian corpus contains approximately a million words of speech from 1880 to 1910, digitally scanned from newspapers. It is subdivided between the two parties and nine general elections, so it has eighteen subsections. It was compiled according to strict criteria, with each Norfolk and Suffolk constituency equally represented for each party at each election.
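The eighteen subsections follow from crossing two parties with nine general elections. The short sketch below enumerates them. The election labels are an assumption based on the dates of the general elections between 1880 and 1910 (two were held in 1910); the summary itself states only the counts, and the variable names are hypothetical.

```python
from itertools import product

# Hypothetical labels for the party-by-election subsections.
PARTIES = ("Conservative", "Liberal")
ELECTIONS = (
    "1880", "1885", "1886", "1892", "1895", "1900", "1906",
    "1910 (January)", "1910 (December)",
)

# One subsection per (party, election) pair: 2 x 9 = 18.
subsections = list(product(PARTIES, ELECTIONS))
print(len(subsections))
```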
- Type: Chapter
- Information: The War of Words: The Language of British Elections, 1880–1914, pp. 241–247. Publisher: Boydell & Brewer. Print publication year: 2020