High speed feature unification and parsing

Published online by Cambridge University Press: 12 September 2008

John C. Brown
Affiliation:
Department of Computing, University of Bradford, Bradford, West Yorkshire BD7 1DP, UK email: j.c.brown@comp.brad.ac.uk

Abstract

Feature unification in parsing has previously used either inefficient Prolog programs or LISP programs implementing early pre-WAM Prolog models of unification, which involve searches of binding lists and the copying of rules to generate edges. Features within rules and edges have traditionally been expressed as lists or functions, with clarity preferred to speed of processing. As a result, parsing takes about 0·5 seconds for a 7-word sentence. Our earlier work produced an optimised chart parser for a non-unification context-free grammar that achieved 5 ms parses, with high-ambiguity sentences involving hundreds of edges, using the grammar and sentences from Tomita's work on shift-reduce parsing with multiple stack branches. That work resulted in a parallel logic card design that would speed up parsing by a further factor of at least 17. The current paper extends this parser to treat a much more complex unification grammar with structures, using extensive indexing of rules and edges and the optimisations of top-down filtering and look-ahead, to demonstrate where unification occurs during parsing. Unification in parsing is distinguished from that in Prolog, and four alternative schemes for storing features and performing unification are considered: the traditional binding-list method and three other methods optimised for speed, for which overall unification times are calculated. Parallelisation of unification using cheap logic hardware is considered, and estimates show that unification will negligibly increase the parse time of our parallel parser card. Preliminary results are reported from a prototype serial parser that uses the fourth most efficient unification method, and achieves 7 ms for 7-word sentences, and under 1 s for a 36-word 360-way ambiguous sentence with 10,000 edges, on a conventional workstation.
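
To illustrate the traditional binding-list style of unification that the abstract takes as its baseline, the following is a minimal sketch only. It assumes flat feature sets whose values are either atoms (strings) or variables. The names `Var`, `walk`, `unify_value` and `unify_features` are hypothetical and do not come from the paper, and the sketch makes no attempt to reproduce any of the paper's speed-optimised schemes.

```python
from typing import Dict, Optional, Union


class Var:
    """A feature variable (hypothetical helper); identity-based equality."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __repr__(self) -> str:
        return f"?{self.name}"


Value = Union[str, Var]
Bindings = Dict[Var, Value]
Features = Dict[str, Value]


def walk(value: Value, bindings: Bindings) -> Value:
    """Follow the binding list until an atom or an unbound variable is reached."""
    while isinstance(value, Var) and value in bindings:
        value = bindings[value]
    return value


def unify_value(a: Value, b: Value, bindings: Bindings) -> Optional[Bindings]:
    """Unify two feature values; return extended bindings, or None on a clash."""
    a, b = walk(a, bindings), walk(b, bindings)
    if a == b:
        return bindings
    if isinstance(a, Var):
        return {**bindings, a: b}  # copy rather than mutate: the costly part
    if isinstance(b, Var):
        return {**bindings, b: a}
    return None  # two different atoms cannot unify


def unify_features(f: Features, g: Features,
                   bindings: Optional[Bindings] = None) -> Optional[Bindings]:
    """Unify the features that two flat feature sets have in common."""
    result: Optional[Bindings] = {} if bindings is None else bindings
    for name in f.keys() & g.keys():
        result = unify_value(f[name], g[name], result)
        if result is None:
            return None
    return result


# Usage: a rule category NP[num=?Num] meets an edge NP[num=sing].
num = Var("Num")
bindings = unify_features({"cat": "NP", "num": num},
                          {"cat": "NP", "num": "sing"})
```

Each lookup in this style walks the binding list and each successful unification copies it, which is the kind of overhead the abstract attributes to the traditional method.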

Type
Articles
Copyright
Copyright © Cambridge University Press 1995

References

Ait-Kaci, H. (1991) Warren's Abstract Machine. Cambridge, MA: MIT Press.
Ait-Kaci, H., Boyer, R., Lincoln, P., and Nasr, R. (1989) Efficient implementation of lattice operations. ACM Transactions on Programming Languages and Systems, 11 (1), January: 115–146.
Andrews, N. A., and Brown, J. C. (1993/1994) A high-speed natural-language parser. AISB Quarterly Winter: 12–19.
Andrews, N. A., and Brown, J. C. (1996) A bitwise approach to parsing with a unification grammar. In preparation.
Brown, J. C., and Andrews, N. (1993 a) Parallel natural language parsing using multiple broadcasting. In 6th Irish Conference on AI and Cognitive Science, Belfast. pp. 119–129.
Brown, J. C., and Andrews, N. (1993 b) Parallel natural language parsing using multiple broadcasting. The Irish Journal of Psychology 14 (3): 503–504.
Brown, J. C., and Andrews, N. (1993 c) A Cheap Accelerator Card for Parallelising Natural Language Parsing. Department of Computing, University of Bradford Research Report CS-26–93, November.
Carpenter, B. (1992) The Logic of Typed Feature Structures. Cambridge: Cambridge University Press.
Carpenter, B., and Penn, G. (1994) ALE - The Attribute Logic Engine: User's Guide Version 2.0. Report from Computational Linguistics Program, Philosophy Department, Carnegie Mellon University.
Carroll, J. A. (1993) Practical Unification-Based Parsing of Natural Language. Technical Report No. 314, Computer Laboratory, University of Cambridge, October.
Carroll, J. (1994) Relating complexity to practical performance in parsing with wide-coverage unification grammars. In Proceedings of 32nd Meeting of the Association for Computational Linguistics, Las Cruces, NM. pp. 287–294.
Clocksin, W., and Mellish, C. (1984) Programming in Prolog. Berlin, Germany: Springer-Verlag.
Covington, M. A. (1994) Natural Language Processing for Prolog Programmers. Englewood Cliffs, NJ: Prentice-Hall.
Gazdar, G., and Mellish, C. (1989) Natural Language Processing in Prolog. Wokingham, UK: Addison-Wesley.
Gazdar, G., Klein, E., Pullum, G., and Sag, I. (1985) Generalised Phrase Structure Grammar. Oxford: Basil Blackwell.
Grover, C., Briscoe, T., Carroll, J., and Boguraev, B. (1989) The Alvey Natural Language Tools Project Grammar (Second Release): A Large Computational Grammar of English. Technical Report No. 162, 1989, Computer Laboratory, University of Cambridge.
Grover, C., Carroll, J., and Briscoe, T. (1993) The Alvey Natural Language Tools Grammar (4th Release). Technical Report, 1993, Computer Laboratory, University of Cambridge.
Kay, M. (1973) The MIND System. In Rustin (ed.) Natural Language Processing. New York: Algorithmics Press.
Kilbury, J. (1985) Chart parsing and the Earley algorithm. In Klenke (ed.) Kontextfreie Syntaxen und verwandte Systeme. Tübingen, Germany: Max Niemeyer. pp. 76–90.
Maier, D., and Warren, D. S. (1988) Computing with Logic: Logic Programming with Prolog. San Francisco, CA: Benjamin/Cummings.
Pereira, F. C. N. (1985) A structure-sharing representation for unification-based grammar formalisms. In Proceedings of 23rd Annual Meeting of the Association for Computational Linguistics, Chicago. pp. 137–144.
Phillips, J. D. (1986) A simple efficient parser for phrase-structure grammars. AISB Quarterly 59: 14–18.
Pollard, C., and Sag, I. A. (1987) Information-Based Syntax and Semantics: Vol. 1, Fundamentals. CSLI Lecture Notes No. 13, Stanford University.
Shieber, S. M. (1986) An Introduction to Unification-Based Approaches to Grammar. CSLI Lecture Notes No. 4, Stanford University.
Shieber, S. M. (1992) Constraint-Based Grammar Formalisms. Cambridge, MA: MIT Press.
Tomabechi, H. (1991) Quasi-Destructive Graph Unification. In Proceedings of 29th Annual Meeting of the Association for Computational Linguistics, Berkeley. pp. 315–322.
Tomita, M. (1986) Efficient Parsing for Natural Language. Boston, MA: Kluwer Academic.
Warren, D. H. D. (1983) An Abstract Prolog Instruction Set. Tech. Note 309, SRI.
Wiren, M. (1987) A comparison of rule-invocation strategies in context-free chart parsing. In Proceedings of the Third European Conference on ACL. pp. 226.