A general feature space for automatic verb classification

Published online by Cambridge University Press:  01 July 2008

ERIC JOANIS*, SUZANNE STEVENSON and DAVID JAMES
Affiliation:
Department of Computer Science, University of Toronto, 6 King's College Road, Toronto, Ontario, Canada, M5S 3H5 e-mail: joanis@cs.toronto.edu, suzanne@cs.toronto.edu, james@cs.toronto.edu
* Current affiliation: Interactive Language Technologies Group, Institute for Information Technology, National Research Council Canada, A1330-101 St-Jean-Bosco Street, Gatineau, Quebec, Canada, J8Y 3G5.

Abstract

Lexical semantic classes of verbs play an important role in structuring complex predicate information in a lexicon, thereby avoiding redundancy and enabling generalizations across semantically similar verbs with respect to their usage. Such classes, however, require many person-years of expert effort to create manually, and methods are needed for automatically assigning verbs to appropriate classes. In this work, we develop and evaluate a feature space to support the automatic assignment of verbs into a well-known lexical semantic classification that is frequently used in natural language processing. The feature space is general – applicable to any class distinctions within the target classification; broad – tapping into a variety of semantic features of the classes; and inexpensive – requiring no more than a POS tagger and chunker. We perform experiments using support vector machines (SVMs) with the proposed feature space, demonstrating a reduction in error rate ranging from 48% to 88% over a chance baseline accuracy, across classification tasks of varying difficulty. In particular, we attain performance comparable to or better than that of feature sets manually selected for the particular tasks. Our results show that the approach is generally applicable, and reduces the need for resource-intensive linguistic analysis for each new classification task. We also perform a wide range of experiments to determine the most informative features in the feature space, finding that simple, easily extractable features suffice for good verb classification performance.

Type
Papers
Copyright
Copyright © Cambridge University Press 2006
