Interpreting compound nouns with kernel methods

DIARMUID Ó SÉAGHDHA; ANN COPESTAKE

doi:10.1017/S1351324912000368


Published online by Cambridge University Press: 12 March 2013


DIARMUID Ó SÉAGHDHA: Computer Laboratory, University of Cambridge, Cambridge, UK. E-mail: do242@cam.ac.uk
ANN COPESTAKE: Computer Laboratory, University of Cambridge, Cambridge, UK. E-mail: ann.copestake@cl.cam.ac.uk


Abstract

This paper presents a classification-based approach to noun–noun compound interpretation within the statistical learning framework of kernel methods. In this framework, the primary modelling task is to define measures of similarity between data items, formalised as kernel functions. We consider the different sources of information that are useful for understanding compounds and proceed to define kernels that compute similarity between compounds in terms of these sources. In particular, these kernels implement intuitive notions of lexical and relational similarity and can be computed using distributional information extracted from text corpora. We report performance on classification experiments with three semantic relation inventories at different levels of granularity, demonstrating in each case that combining lexical and relational information sources is beneficial and gives better performance than either source taken alone. The data used in our experiments are taken from general English text, but our methods are also applicable to other domains and potentially to other languages where noun–noun compounding is frequent and productive.
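
As a rough illustration of how lexical and relational similarity might be combined in a kernel framework, the following Python sketch represents each compound by three co-occurrence distributions (for the modifier, for the head, and for the contexts linking the two) and sums one kernel per information source. Everything in it is an assumption made for illustration: the abstract does not say which kernel functions or features the paper uses, the Jensen-Shannon kernel is simply one distributional kernel from the literature, the function names are hypothetical, and the counts are invented toy data.

```python
import math


def normalise(counts):
    """Turn raw co-occurrence counts into a probability distribution."""
    total = sum(counts.values())
    return {feature: c / total for feature, c in counts.items()}


def jsd_kernel(p, q):
    """Jensen-Shannon kernel between two distributions (illustrative choice).

    Returns 2.0 for identical distributions and 0.0 for distributions
    with no features in common.
    """
    k = 0.0
    for feature in set(p) | set(q):
        pf, qf = p.get(feature, 0.0), q.get(feature, 0.0)
        if pf > 0:
            k -= pf * math.log2(pf / (pf + qf))
        if qf > 0:
            k -= qf * math.log2(qf / (pf + qf))
    return k


def make_compound(modifier, head, relation):
    """Hypothetical container: one distribution per information source."""
    return {
        "modifier": normalise(modifier),
        "head": normalise(head),
        "relation": normalise(relation),
    }


def lexical_kernel(c1, c2):
    """Lexical similarity: compare modifiers with modifiers, heads with heads."""
    return (jsd_kernel(c1["modifier"], c2["modifier"])
            + jsd_kernel(c1["head"], c2["head"]))


def relational_kernel(c1, c2):
    """Relational similarity: compare the contexts linking the constituents."""
    return jsd_kernel(c1["relation"], c2["relation"])


def combined_kernel(c1, c2):
    """Sum of the lexical and relational kernels."""
    return lexical_kernel(c1, c2) + relational_kernel(c1, c2)


# Invented toy counts, not taken from any corpus.
kitchen_knife = make_compound(
    {"cook": 4, "room": 3, "sink": 3},
    {"cut": 6, "sharp": 3, "blade": 1},
    {"in": 5, "used in": 3, "for": 2},
)
garden_fork = make_compound(
    {"plant": 5, "room": 1, "grow": 4},
    {"dig": 5, "sharp": 2, "prong": 3},
    {"in": 4, "used in": 4, "for": 2},
)
steel_knife = make_compound(
    {"metal": 6, "hard": 4},
    {"cut": 5, "sharp": 4, "blade": 1},
    {"made of": 7, "from": 3},
)

for other_name, other in [("garden fork", garden_fork),
                          ("steel knife", steel_knife)]:
    print(other_name,
          "lexical:", round(lexical_kernel(kitchen_knife, other), 3),
          "relational:", round(relational_kernel(kitchen_knife, other), 3),
          "combined:", round(combined_kernel(kitchen_knife, other), 3))
```

Summing is a natural way to combine the sources because a sum of valid kernels is itself a valid kernel, so the resulting Gram matrix could be passed to any kernel classifier, for instance a support vector machine that accepts precomputed kernels.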

Information

Type: Articles
Information: Natural Language Engineering, Volume 19, Special Issue 3: On the semantics of noun compounds, July 2013, pp. 331–356

DOI: https://doi.org/10.1017/S1351324912000368
Copyright: © Cambridge University Press 2013


