On the interpretation of noun compounds: Syntax, semantics, and entailment

Published online by Cambridge University Press: 28 May 2013

PRESLAV NAKOV*
Affiliation:
Qatar Computing Research Institute, Qatar Foundation Tornado Tower, Floor 10, PO Box 5825, Doha, Qatar e-mail: pnakov@qf.org.qa

Abstract

We discuss the problem of interpreting noun compounds such as colon cancer tumor suppressor protein, which pose major challenges for the automatic interpretation of written English text. We present an overview of the more general process of compounding and of noun compounds in particular, as well as of their syntax and semantics from both theoretical and computational linguistic viewpoints, with an emphasis on the latter. Our main focus is on computational approaches to the syntax and semantics of noun compounds: we describe the problems, present the challenges, and discuss the most important lines of research. We also show how understanding noun compound syntax and semantics could help solve textual entailment problems, which would potentially be useful for a number of NLP applications, and which we believe to be an important direction for future research.

Type
Articles
Copyright
Copyright © Cambridge University Press 2013 
