
Testing the Validity of Automatic Speech Recognition for Political Text Analysis

Published online by Cambridge University Press:  19 February 2019

Sven-Oliver Proksch*
Affiliation:
Cologne Center for Comparative Politics, University of Cologne, Germany. Email: so.proksch@uni-koeln.de
Christopher Wratil
Affiliation:
Cologne Center for Comparative Politics, University of Cologne, Germany; Minda de Gunzburg Center for European Studies, Harvard University, Cambridge, MA 02138, USA
Jens Wäckerle
Affiliation:
Cologne Center for Comparative Politics, University of Cologne, Germany

Abstract

The analysis of political texts from parliamentary speeches, party manifestos, social media, or press releases forms the basis of major and growing fields in political science, not least since advances in “text-as-data” methods have rendered the analysis of large text corpora straightforward. However, many sources of political speech are not regularly transcribed, and their on-demand transcription by humans is prohibitively expensive for research purposes. This class includes political speech in certain legislatures and during political party conferences, as well as television interviews and talk shows. We showcase how scholars can use automatic speech recognition systems to analyze such speech with quantitative text analysis models of the “bag-of-words” variety. To probe results for robustness to transcription error, we present an original “word error rate simulation” (WERSIM) procedure implemented in R. We demonstrate the potential of automatic speech recognition to address open questions in political science with two substantive applications and discuss its limitations and practical challenges.
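
For readers unfamiliar with the metric behind the procedure's name, the word error rate of a transcript is conventionally defined as the word-level Levenshtein distance (substitutions, deletions, and insertions) between an automatic transcript and a human reference, divided by the number of words in the reference. The following is a minimal Python sketch of that standard definition only. It is not the authors' WERSIM procedure, which is implemented in R and available in the replication files; the function name and the lowercase, whitespace-based tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenization by lowercasing
    and splitting on whitespace is a simplifying assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("the council meets today", "the council meet today"))
```

Because the denominator is the reference length, the rate can exceed 1 when the automatic transcript contains many inserted words.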

Type
Articles
Copyright
Copyright © The Author(s) 2019. Published by Cambridge University Press on behalf of the Society for Political Methodology. 

Footnotes

Authors’ note: We are grateful to Leonie Diffené, Felix Reich and Pit Rieger for their excellent research assistance. We are also very thankful for helpful comments on earlier versions of this work by two anonymous reviewers as well as Jeff Gill. Christopher Wratil would like to acknowledge funding by the Fritz Thyssen Stiftung (20.16.0.045WW). All remaining errors are our own. The replication files for this article are available on the Political Analysis Dataverse (Proksch, Wratil, and Wäckerle 2018).

Contributing Editor: Jeff Gill

References

Abou-Chadi, T. 2016. “Niche Party Success and Mainstream Party Policy Shifts—How Green and Radical Right Parties Differ in their Impact.” British Journal of Political Science 46(2):417–436.
Accornero, G., and Ramos Pinto, P. 2015. “Mild Mannered? Protest and Mobilisation in Portugal Under Austerity, 2010–2013.” West European Politics 38(3):491–515.
Baum, M. A. 2005. “Talking the Vote: Why Presidential Candidates Hit the Talk Show Circuit.” American Journal of Political Science 49(2):213–234.
Baum, M. A., and Jamison, A. S. 2006. “The Oprah Effect: How Soft News Helps Inattentive Citizens Vote Consistently.” Journal of Politics 68(4):946–959.
Benoit, K., Laver, M., and Mikhaylov, S. 2009. “Treating Words as Data with Error: Uncertainty in Text Statements of Policy Positions.” American Journal of Political Science 53(2):495–513.
Cook, J. R., and Stefanski, L. A. 1994. “Simulation-Extrapolation Estimation in Parametric Measurement Error Models.” Journal of the American Statistical Association 89(428):1314–1328.
Daku, M., Soroka, S., and Young, L. 2015. “Lexicoder, version 3.0.” www.lexicoder.com.
de Vries, E., Schoonvelde, M., and Schumacher, G. 2018. “No Longer Lost in Translation. Evidence that Google Translate Works for Comparative Bag-of-Words Text Applications.” Political Analysis 26(4):417–430.
Denny, M. J., and Spirling, A. 2018. “Text Preprocessing for Unsupervised Learning: Why It Matters, When It Misleads, and What to Do About It.” Political Analysis 26(2):168–189.
Dietrich, B. J., Hayes, M., and O’Brien, D. Z. 2017. “Pitch Perfect: Vocal Pitch and the Emotional Intensity of Congressional Speech on Women.” Working Paper.
Dietrich, B. J., Enos, R. D., and Sen, M. 2018. “Emotional Arousal Predicts Voting on the U.S. Supreme Court.” Political Analysis, https://doi.org/10.1017/pan.2018.47.
Dolezal, M., Hutter, S., and Becker, R. 2016. “Protesting European Integration: Politicisation from Below?” In Politicising Europe: Integration and Mass Politics, edited by Hutter, Swen, Grande, Edgar, and Kriesi, Hanspeter, 112–136. Cambridge: Cambridge University Press, chapter 5.
Dresing, T., Pehl, T., and Schmieder, C. 2015. “Manual (on) Transcription. Transcription Conventions, Software Guides and Practical Hints for Qualitative Researchers.” 3rd English Edition: Marburg. http://www.audiotranskription.de/english/transcription-practicalguide.htm.
Dür, A., and Mateo, G. 2010. “Bargaining Power and Negotiation Tactics: The Negotiations on the EU’s Financial Perspective, 2007–2013.” Journal of Common Market Studies 48(3):557–578.
Edmondson, M. 2017. “Package ‘googleLanguageR’.” Technical report.
European Council. 2006. “An Overall Policy on Transparency.” http://www.consilium.europa.eu/ueDocs/cms_Data/docs/pressData/en/misc/90112.pdf.
Fernandez-Vazquez, P. 2014. “And Yet it Moves: The Effect of Election Platforms on Party Policy Images.” Comparative Political Studies 47(14):1919–1944.
Gibson, R. K., and McAllister, I. 2011. “Do Online Election Campaigns Win Votes? The 2007 Australian YouTube Election.” Political Communication 28(2):227–244.
Giugni, M., and Grasso, M. 2015. Austerity and Protest: Popular Contention in Times of Economic Crisis. Routledge.
Grimmer, J. 2010. “A Bayesian Hierarchical Topic Model for Political Texts: Measuring Expressed Agendas in Senate Press Releases.” Political Analysis 18(1):1–35.
Grimmer, J., and Stewart, B. M. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21(3):267–297.
Haynes, A. A., and Pitts, B. 2009. “Making an Impression: New Media in the 2008 Presidential Nomination Campaigns.” PS—Political Science and Politics 42(1):53–58.
Herzog, A., and Benoit, K. 2015. “The Most Unkindest Cuts: Speaker Selection and Expressed Government Dissent During Economic Crisis.” The Journal of Politics 77(4):1157–1175.
Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A.-R., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. N., and Kingsbury, B. 2012. “Deep Neural Networks for Acoustic Modeling in Speech Recognition.” IEEE Signal Processing Magazine (November):82–97.
Hochreiter, S., and Schmidhuber, J. 1997. “Long Short-Term Memory.” Neural Computation 9(8):1735–1780.
Këpuska, V., and Bohouta, G. 2017. “Comparing Speech Recognition Systems (Microsoft API, Google API and CMU Sphinx).” International Journal of Engineering Research and Applications 7(3):20–24.
Knox, D., and Lucas, C. 2018. “A Dynamic Model of Speech for The Social Sciences.” Working Paper.
Kriesi, H., Grande, E., Dolezal, M., Helbling, M., Höglinger, D., Hutter, S., and Wueest, B. 2012. Political Conflict in Western Europe. Cambridge: Cambridge University Press.
Lauderdale, B. E., and Herzog, A. 2016. “Measuring Political Positions from Legislative Speech.” Political Analysis 24(3):374–394.
Lecun, Y., Bengio, Y., and Hinton, G. 2015. “Deep Learning.” Nature 521(7553):436–444.
Levenshtein, V. I. 1966. “Binary Codes Capable of Correcting Deletions, Insertions, and Reversals.” Soviet Physics Doklady 10(8):707–710.
Liao, H., McDermott, E., and Senior, A. 2013. “Large Scale Deep Neural Network Acoustic Modeling with Semi-Supervised Training Data for YouTube Video Transcription.” In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013—Proceedings, 368–373.
Martin, A. D., and Quinn, K. M. 2002. “Dynamic Ideal Point Estimation via Markov Chain Monte Carlo for The U.S. Supreme Court, 1953–1999.” Political Analysis 10(2):134–153.
McKibben, H. E. 2016. “To Link or Not to Link? Agenda Change in International Bargaining.” British Journal of Political Science 46(2):371–393.
Meeker, M. 2017. “Kleiner Perkins Internet Trends 2017.” Technical report. http://www.kpcb.com/internet-trends.
Meguid, B. M. 2005. “Competition Between Unequals: The Role of Mainstream Party Strategy in Niche Party Success.” American Political Science Review 99(3):347–359.
Meguid, B. M. 2008. Party Competition between Unequals: Strategies and Electoral Fortunes in Western Europe. Cambridge: Cambridge University Press.
Plummer, M. 2003. “JAGS: A Program for Analysis of Bayesian Graphical Models Using Gibbs Sampling.” In Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), 20–22.
Polk, J., Rovny, J., Bakker, R., Edwards, E., Hooghe, L., Jolly, S., Koedam, J., Kostelka, F., Marks, G., Schumacher, G., Steenbergen, M., Vachudova, M., and Zilovic, M. 2017. “Explaining the Salience of Anti-Elitism and Reducing Political Corruption for Political Parties in Europe with the 2014 Chapel Hill Expert Survey Data.” Research & Politics 1–9.
Proksch, S.-O., Lowe, W., Wäckerle, J., and Soroka, S. 2018. “Multilingual Sentiment Analysis: A New Approach to Measuring Conflict in Parliamentary Speeches.” Legislative Studies Quarterly, https://doi.org/10.1111/lsq.12218.
Proksch, S.-O., Wratil, C., and Wäckerle, J. 2018. “Replication Data for: ‘Testing the Validity of Automatic Speech Recognition for Political Text Analysis’ by Sven-Oliver Proksch, Christopher Wratil and Jens Wäckerle.” https://doi.org/10.7910/DVN/PGQY2F, Harvard Dataverse, V1.
Proksch, S.-O., and Slapin, J. B. 2012. “Institutional Foundations of Legislative Speech.” American Journal of Political Science 56(3):520–537.
Proksch, S.-O., and Slapin, J. B. 2015. The Politics of Parliamentary Debate: Parties, Rebels, and Representation. Cambridge: Cambridge University Press.
Roberts, M. E., Stewart, B. M., Tingley, D., Lucas, C., Leder-Luis, J., Gadarian, S. K., Albertson, B., and Rand, D. G. 2014. “Structural Topic Models for Open-Ended Survey Responses.” American Journal of Political Science 58(4):1064–1082.
Schnakenberg, K. E., and Fariss, C. J. 2014. “Dynamic Patterns of Human Rights Practices.” Political Science Research and Methods 2(1):1–31.
Schneider, C. J. 2011. “Weak States and Institutionalized Bargaining Power in International Organizations.” International Studies Quarterly 55(2):331–355.
Schneider, C. J. 2013. “Globalizing Electoral Politics: Political Competence and Distributional Bargaining in the European Union.” World Politics 65(3):452–490.
Slapin, J. B., and Proksch, S.-O. 2008. “A Scaling Model for Estimating Time-Series Party Positions from Texts.” American Journal of Political Science 52(3):705–722.
Soltau, H., Liao, H., and Sak, H. 2017. “Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, August, 3707–3711.
Sood, G. 2018. “Package ‘tubr’.” Technical report.
Stenbæk, J., and Jensen, M. D. 2016. “Evading the Joint Decision Trap: The Multiannual Financial Framework 2014–20.” European Political Science Review 8(4):615–635.
Thomson, R., Royed, T., Naurin, E., Artés, J., Costello, R., Ennser-Jedenastik, L., Ferguson, M., Kostadinova, P., Moury, C., Pétry, F., and Praprotnik, K. 2017. “The Fulfillment of Parties’ Election Pledges: A Comparative Study on the Impact of Power Sharing.” American Journal of Political Science 61(3):527–542.
Wratil, C., and Hobolt, S. B. Forthcoming. “Public Deliberations in The Council of the European Union: Introducing and Validating the DICEU Approach.” European Union Politics 1–27.
Young, L., and Soroka, S. 2012. “Affective News: The Automated Coding of Sentiment in Political Texts.” Political Communication 29(2):205–231.
Yu, D., and Deng, L. 2015. Automatic Speech Recognition. Signals and Communication Technology. London: Springer.
Zittel, T. 2015. “Do Candidates Seek Personal Votes on the Internet? Constituency Candidates in the 2009 German Federal Elections.” German Politics 24(4):435–450.
Supplementary material: File

Proksch et al. supplementary material
