Text Analysis in Python for Social Scientists

Prediction and Classification

Published online by Cambridge University Press: 15 February 2022

Dirk Hovy
Affiliation:
Università Commerciale Luigi Bocconi, Milan

Summary

Text contains a wealth of information about a wide variety of sociocultural constructs. Automated prediction methods can infer these quantities (sentiment analysis is probably the best-known application). However, there is virtually no limit to the kinds of things we can predict from text: power, trust, and misogyny are all signaled in language. These algorithms easily scale to corpus sizes infeasible for manual analysis. Prediction algorithms have become steadily more powerful, especially with the advent of neural network methods. However, applying these techniques usually requires profound programming knowledge and machine learning expertise. As a result, many social scientists do not apply them. This Element provides the working social scientist with an overview of the most common methods for text classification, an intuition of their applicability, and Python code to execute them. It covers both the ethical foundations of such work and the emerging potential of neural network methods.
Type: Element
Online ISBN: 9781108960885
Publisher: Cambridge University Press
Print publication: 17 March 2022
