
Predicting word choice in affective text

Published online by Cambridge University Press: 22 May 2015

M. GARDINER
Affiliation: Macquarie University, North Ryde, NSW 2109, Australia (e-mail: mark.dras@mq.edu.au)

M. DRAS
Affiliation: Macquarie University, North Ryde, NSW 2109, Australia (e-mail: mark.dras@mq.edu.au)

Abstract

Choosing the best word or phrase for a given context from among candidate near-synonyms, such as slim and skinny, is a difficult language generation problem. In this paper, we describe approaches to solving an instance of this problem, the lexical gap problem, with a particular focus on affect and subjectivity; to do this we draw upon techniques from sentiment and subjectivity analysis. We present a supervised approach to this problem, initially with a unigram model that solidly outperforms the baseline, with a 6.8% increase in accuracy. The results largely confirm those from related problems: feature presence outperforms feature frequency, and immediate context features generally outperform wider context features. Somewhat surprisingly, however, the latter is not always the case, and not necessarily where intuition might first suggest; an analysis of where document-level models perform better suggested that, in our corpus, broader features related to the 'tone' of the document could be useful, including document sentiment, document author, and a distance metric for weighting the wider lexical context of the gap itself. Incorporating these, our best model achieves a 10.1% increase in accuracy, corresponding to a 38% reduction in errors. Moreover, our models improve accuracy not only on affective word choice, but on non-affective word choice as well.
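
To make the task concrete, the following is a minimal illustrative sketch (not the authors' implementation) of a supervised near-synonym choice model of the general kind described above: binary unigram features (feature presence rather than frequency) drawn from the context around a lexical gap, fed to a linear classifier that predicts which member of a near-synonym set fills the gap. The toy contexts, labels, and choice of scikit-learn classifier are assumptions for illustration only; the paper's models use a richer feature set.

```python
# Sketch of a feature-presence unigram model for near-synonym choice.
# Toy data and classifier choice are illustrative assumptions only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each training context has its target word removed (the "gap") and is
# labelled with the near-synonym that originally filled it.
contexts = [
    "she looked ___ and elegant in the photo",
    "the stray cat was ___ and underfed",
    "a ___ volume of poetry sat on the shelf",
    "the jeans made him look ___ rather than fit",
]
labels = ["slim", "skinny", "slim", "skinny"]

# binary=True encodes feature *presence* rather than frequency, mirroring
# the finding that presence features tend to work better for this task.
model = make_pipeline(
    CountVectorizer(binary=True),
    LogisticRegression(max_iter=1000),
)
model.fit(contexts, labels)

# Predict the best-fitting near-synonym for a new lexical gap.
print(model.predict(["the model was worryingly ___ on the runway"]))
```

Document-level signals such as overall document sentiment or author could, under the same setup, be appended as extra features alongside the unigram context vector.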

Type: Articles
Copyright: © Cambridge University Press 2015


Footnotes

The authors would like to thank the anonymous reviewers of the article, and to acknowledge the support of ARC Discovery grant DP0558852.
