Natural Language Engineering: Volume 2 -

The Japanese lexical transducer based on stem-suffix style forms
MASAKAZU TATENO, HIROSHI MASUICHI, HIROSHI UMEMOTO
Published online by Cambridge University Press:

01 December 1996, pp. 329-330
A Lexical Transducer (LT), as defined by Karttunen, Kaplan and Zaenen (1992), is a specialized finite state transducer (FST) that relates citation forms of words and their morphological categories to inflected surface forms. Using LTs is advantageous because the same structure and algorithms can be used for both morphological analysis (stemming) and generation. Morphological processing (analysis and generation) is computationally faster, and the data for the process can be compacted more tightly, than with other methods. The standard way to construct an LT consists of three steps: (1) constructing a simple finite state source lexicon LA which defines all valid canonical citation forms of the language; (2) describing morphological alternations by means of two-level rules, compiling the rules to FSTs, and intersecting them to form a single rule transducer RT; and (3) composing LA and RT.
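
The three construction steps can be illustrated with a minimal sketch. It is not the authors' Japanese system: it uses a hypothetical four-entry English lexicon and a single toy epenthesis rule, and it enumerates a finite relation rather than compiling real FSTs. All names here (`LEXICON`, `realizations`, `epenthesis`, `build_lt`, `generate`, `analyze`) are assumptions made for illustration.

```python
from itertools import product

# Step 1: toy source lexicon LA (hypothetical data; "+" marks a morpheme boundary).
LEXICON = {"cat", "fox", "cat+s", "fox+s"}

SIBILANTS = {"s", "x", "z"}


def realizations(symbol):
    """Feasible surface realizations of one lexical symbol ("" stands for epsilon)."""
    return ["e", ""] if symbol == "+" else [symbol]


# Step 2: a two-level rule is modelled as a predicate over aligned symbol pairs.
def epenthesis(pairs):
    """'+' surfaces as 'e' if and only if it stands between a sibilant and 's'."""
    for i, (lexical, surface) in enumerate(pairs):
        if lexical != "+":
            continue
        left = pairs[i - 1][0] if i > 0 else None
        right = pairs[i + 1][0] if i + 1 < len(pairs) else None
        in_context = left in SIBILANTS and right == "s"
        if (surface == "e") != in_context:
            return False
    return True


RULES = [epenthesis]


def rule_transducer(pairs):
    """Intersection of the rules: every rule must accept the alignment (RT)."""
    return all(rule(pairs) for rule in RULES)


# Step 3: composing LA with RT restricts the lexical side to lexicon entries.
def build_lt(lexicon):
    relation = set()
    for lexical in lexicon:
        for surface_symbols in product(*(realizations(s) for s in lexical)):
            pairs = list(zip(lexical, surface_symbols))
            if rule_transducer(pairs):
                relation.add((lexical, "".join(surface_symbols)))
    return relation


LT = build_lt(LEXICON)


def generate(lexical):
    """Lexical form to surface forms."""
    return sorted(s for l, s in LT if l == lexical)


def analyze(surface):
    """Surface form to lexical forms, using the same relation in reverse."""
    return sorted(l for l, s in LT if s == surface)
```

Because `generate` and `analyze` read the same relation in opposite directions, the sketch mirrors the point that one structure serves both analysis and generation; for example, `generate("fox+s")` yields `["foxes"]` and `analyze("foxes")` yields `["fox+s"]`.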

John Nerbonne, Klaus Netter and Carl Pollard (editors), German in Head-Driven Phrase Structure Grammar. Stanford, CA: CSLI, 1994. ISBN 1 881 52630 5, £16.95/US$21.95 (paperback). £40.00/US$49.95 (hardback).
SHULY WINTNER
Published online by Cambridge University Press:

01 March 1996, pp. 81-93

Maureen Caudill, In Our Own Image: Building an Artificial Person. Oxford University Press, 1992. ISBN 0-19-508672-4 (paperback). ISBN 0-19-507338-X $13.95 (hardback). 242 pp.
ANDY BREEN
Published online by Cambridge University Press:

01 September 1996, pp. 277-285

Finite state morphology and information retrieval
KIMMO KOSKENNIEMI
Published online by Cambridge University Press:

01 December 1996, pp. 331-336
A source of potential systematic errors in information retrieval is identified and discussed. These errors occur when base form reduction is applied with a (necessarily) finite dictionary. Formal methods for avoiding this error source are presented, along with some practical complexities met in its implementation.

Madeleine Bates and Ralph M. Weischedel (editors), Challenges in Natural Language Processing. Cambridge: Cambridge University Press, 1993. ISBN 0521 41015 0, US$54.95 (hardback). 307 pp.
CHRISTINE DORAN, B. SRINIVAS
Published online by Cambridge University Press:

01 March 1996, pp. 81-93

Donald E. Walker, Antonio Zampolli and Nicoletta Calzolari, (editors), Automating the Lexicon: Research and Practice in a Multilingual Environment. Oxford: Oxford University Press, 1995. ISBN 0 19 823950 5, £50.00 (hardback). xi + 413 pp.
ARTURO TRUJILLO
Published online by Cambridge University Press:

01 September 1996, pp. 277-285

Partial parsing via finite-state cascades
STEVEN ABNEY
Published online by Cambridge University Press:

01 December 1996, pp. 337-344
Finite state cascades represent an attractive architecture for parsing unrestricted text. Deterministic parsers specified by finite state cascades are fast and reliable. They can be extended at modest cost to construct parse trees with finite feature structures. Finally, such deterministic parsers do not necessarily involve trading off accuracy against speed — they may in fact be more accurate than exhaustive search stochastic context free parsers.
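
A finite state cascade can be sketched as a sequence of levels, each of which rewrites the output of the previous level with regular patterns and a deterministic leftmost, longest-match policy. The following is a minimal illustration in that spirit, not Abney's parser: the tag set, the `LEVELS` patterns, and the function names are assumptions chosen for the example.

```python
import re

# Each level maps a phrase category to a regular expression over the
# categories produced by the previous level. Every tag is followed by one
# space, so a match always ends on a node boundary.
LEVELS = [
    [("NP", r"(D )?(A )*(N )+")],
    [("PP", r"P NP ")],
    [("VP", r"V (NP |PP )*")],
]


def apply_level(nodes, patterns):
    """Scan left to right, grouping the longest match found at each position."""
    out, i = [], 0
    while i < len(nodes):
        tags = "".join(node[0] + " " for node in nodes[i:])
        best_cat, best_len = None, 0
        for cat, regex in patterns:
            m = re.match(regex, tags)
            if m:
                consumed = m.group(0).count(" ")  # number of nodes matched
                if consumed > best_len:
                    best_cat, best_len = cat, consumed
        if best_cat is None:
            out.append(nodes[i])  # no pattern applies: pass the node through
            i += 1
        else:
            out.append((best_cat, nodes[i:i + best_len]))
            i += best_len
    return out


def parse(tagged_words):
    """Run every level in order; input is a list of (tag, word) pairs."""
    nodes = list(tagged_words)
    for patterns in LEVELS:
        nodes = apply_level(nodes, patterns)
    return nodes


sentence = [("D", "the"), ("A", "old"), ("N", "dog"), ("V", "saw"),
            ("D", "a"), ("N", "cat"), ("P", "in"), ("D", "the"), ("N", "park")]
tree = parse(sentence)
```

There is no backtracking and no search over alternatives: each level makes one pass and commits to its matches, which is what makes the parser deterministic. For the sample sentence the top-level categories are `NP` followed by `VP`.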

F. R. Palmer, Grammatical Roles and Relations. Cambridge: Cambridge University Press, 1994. ISBN 0 521 45836 6, £12.95 (paperback).
LAURA WAGNER
Published online by Cambridge University Press:

01 March 1996, pp. 81-93

Transducer parsing of free and frozen sentences
EMMANUEL ROCHE
Published online by Cambridge University Press:

01 December 1996, pp. 345-350
In language processing, finite state models are not a lesser evil that brings simplicity and efficiency at the cost of accuracy. On the contrary, they provide a very natural framework to describe complex linguistic phenomena. We present here one aspect of parsing with finite state transducers and show that this technique can be applied to complex linguistic situations.

Text and speech translation by means of subsequential transducers
J. M. VILAR, V. M. JIMÉNEZ, J. C. AMENGUAL, A. CASTELLANOS, D. LLORENS, E. VIDAL
Published online by Cambridge University Press:

01 December 1996, pp. 351-354
The full paper explores the possibility of using Subsequential Transducers (SSTs), a finite state model, in limited domain translation tasks, both for text and speech input. A distinctive advantage of SSTs is that they can be efficiently learned from sets of input-output examples by means of OSTIA, the Onward Subsequential Transducer Inference Algorithm (Oncina et al. 1993). This work proposes a technique to increase the performance of OSTIA by reducing the asynchrony between the input and output sentences, explores the use of error correcting parsing to increase the robustness of the models, and describes an integrated architecture for speech input translation by means of SSTs.
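
To make the SST model concrete, here is a minimal sketch of a hand-built subsequential transducer: it is deterministic on its input, emits an output string on each transition, and appends a state-dependent output when the input ends. It does not implement OSTIA or anything from the full paper; the Spanish-to-English toy data, state numbering, and class name are assumptions for illustration only.

```python
class SubsequentialTransducer:
    """Deterministic transducer with per-edge outputs and final-state outputs."""

    def __init__(self, start, edges, final_output):
        self.start = start
        # (state, input token) -> (next state, list of output tokens)
        self.edges = edges
        # accepting state -> list of output tokens emitted at end of input
        self.final_output = final_output

    def translate(self, tokens):
        state, output = self.start, []
        for token in tokens:
            if (state, token) not in self.edges:
                return None  # input not covered by the model
            state, emitted = self.edges[(state, token)]
            output.extend(emitted)
        if state not in self.final_output:
            return None  # input ended in a non-accepting state
        return output + self.final_output[state]


# Hypothetical toy model. The noun's translation is delayed until it is known
# whether an adjective follows, since English places the adjective first.
sst = SubsequentialTransducer(
    start=0,
    edges={
        (0, "un"): (1, ["a"]),
        (1, "cuadrado"): (2, []),
        (2, "grande"): (3, ["large", "square"]),
    },
    final_output={2: ["square"], 3: []},
)
```

The empty output on the `cuadrado` edge shows the kind of input-output asynchrony the abstract mentions: `sst.translate(["un", "cuadrado"])` returns `["a", "square"]` only through the final-state output, while `sst.translate(["un", "cuadrado", "grande"])` returns `["a", "large", "square"]`.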

Finite state segmentation of discourse into clauses
EVA EJERHED
Published online by Cambridge University Press:

01 December 1996, pp. 355-364
The paper presents background and motivation for a processing model that segments discourse into units that are simple, non-nested clauses, prior to the recognition of clause internal phrasal constituents, and experimental results in support of this model. One set of results is derived from a statistical reanalysis of the Swedish empirical data in Strangert, Ejerhed and Huber 1993 concerning the linguistic structure of major prosodic units. The other set of results is derived from experiments in segmenting part of speech annotated Swedish text corpora into clauses, using a new clause segmentation algorithm. The clause segmented corpus data is taken from the Stockholm Umeå Corpus (SUC), 1 M words of Swedish texts from different genres, part of speech annotated by hand, and from the Umeå corpus DAGENS INDUSTRI 1993 (DI93), 5 M words of Swedish financial newspaper text, processed by fully automatic means consisting of tokenizing, lexical analysis, and probabilistic POS tagging. The results of these two experiments show that the proposed clause segmentation algorithm is 96% correct when applied to manually tagged text, and 91% correct when applied to probabilistically tagged text.

Between finite state and Prolog: constraint-based automata for efficient recognition of phrases
KLAUS U. SCHULZ, TOMEK MIKOŁAJEWSKI
Published online by Cambridge University Press:

01 December 1996, pp. 365-366
In computational linguistics, efficient recognition of phrases is an important prerequisite for many ambitious goals, such as automated extraction of terminology, part of speech disambiguation, and automated translation. If one wants to recognize a certain well-defined set of phrases, the question arises of which type of computational device to use for the task. For sets of phrases that are not too complex, as well as for many subtasks of the recognition process, finite state methods are appropriate and favourable because of their efficiency (Gross and Perrin 1989; Silberztein 1993; Tapanainen 1995). However, if very large sets of possibly complex phrases are considered where correct resolution of grammatical structure requires morphological analysis (e.g. verb argument structure, extraposition of relative clauses, etc.), then the design and implementation of an appropriate finite state automaton might turn out to be infeasible in practice due to the immense number of morphological variants to be captured.

Explanation-based learning and finite state transducers: applications to parsing lexicalized tree adjoining grammars
B. SRINIVAS
Published online by Cambridge University Press:

01 December 1996, pp. 367-368
There are currently two philosophies for building grammars and parsers: hand-crafted, wide coverage grammars; and statistically induced grammars and parsers. Aside from the methodological differences in grammar construction, the linguistic knowledge which is overt in the rules of handcrafted grammars is hidden in the statistics derived by probabilistic methods, which means that generalizations are also hidden and the full training process must be repeated for each domain. Although handcrafted wide coverage grammars are portable, they can be made more efficient when applied to limited domains, if it is recognized that language in limited domains is usually well constrained and certain linguistic constructions are more frequent than others. We view a domain-independent grammar as a repository of portable grammatical structures whose combinations are to be specialized for a given domain. We use Explanation-Based Learning (EBL) to identify the relevant subset of a handcrafted general purpose grammar (XTAG) needed to parse in a given domain (ATIS). We exploit the key properties of Lexicalized Tree-Adjoining Grammars to view parsing in a limited domain as finite state transduction from strings to their dependency structures.

Multilingual text analysis for text-to-speech synthesis
RICHARD SPROAT
Published online by Cambridge University Press:

01 December 1996, pp. 369-380
We present a model of text analysis for text-to-speech (TTS) synthesis based on (weighted) finite state transducers, which serves as the text analysis module of the multilingual Bell Labs TTS system. The transducers are constructed using a lexical toolkit that allows declarative descriptions of lexicons, morphological rules, numeral-expansion rules, and phonological rules, inter alia. To date, the model has been applied to eight languages: Spanish, Italian, Romanian, French, German, Russian, Mandarin and Japanese.

An innovative finite state concept for recognition and parsing of context-free languages
M. J. NEDERHOF, E. BERTSCH
Published online by Cambridge University Press:

01 December 1996, pp. 381-382
In the full paper in the companion volume, we introduce a new subclass of the context free languages, the meta-deterministic languages, which includes the deterministic languages, but also the languages that result if deterministic languages are combined via regular expressions.

