A fast method for statistical grammar induction

Published online by Cambridge University Press: 01 September 1998

WIDE R. HOGENHOUT
Affiliation:
Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-01, Japan
YUJI MATSUMOTO
Affiliation:
Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-01, Japan

Abstract

The statistical induction of stochastic context-free grammars from bracketed corpora with the Inside-Outside algorithm is an appealing method for grammar learning, but the computational complexity of this algorithm has made it impossible to generate a large-scale grammar. Researchers in natural language processing and speech recognition have suggested various methods to reduce the computational complexity and, at the same time, guide the learning algorithm towards a solution, for example by placing constraints on the grammar. We suggest a method that strongly reduces the computational cost of the algorithm without placing constraints on the grammar. This method can in principle be combined with any of the constraints on grammars that have been suggested in earlier studies. We show that it is feasible to achieve results equivalent to those of earlier research, but with much lower computational effort. The method first creates a small grammar and then enlarges it incrementally, removing rules that have become obsolete at the same time. We explain the modifications to the algorithm, give the results of experiments, and compare these with results reported in other publications.

Type
Research Article
Copyright
© 1998 Cambridge University Press