ARIES: A lexical platform for engineering Spanish processing tools

Published online by Cambridge University Press:  01 December 1997

JOSÉ M. GOÑI
Affiliation:
E.T.S.I. Telecomunicación, Universidad Politécnica de Madrid, 28040 Madrid, Spain; e-mail: jmg@mat.upm.es, jcg@gsi.dit.upm.es
JOSÉ C. GONZÁLEZ
Affiliation:
E.T.S.I. Telecomunicación, Universidad Politécnica de Madrid, 28040 Madrid, Spain; e-mail: jmg@mat.upm.es, jcg@gsi.dit.upm.es
ANTONIO MORENO
Affiliation:
Dept. de Lingüística, Universidad Autónoma de Madrid, Cantoblanco, Madrid, Spain; e-mail: sandoval@lola.lllf.uam.es

Abstract

We present a lexical platform developed for the Spanish language. It is portable across different computer systems and efficient in terms of both speed and lexical coverage. A model for the full treatment of Spanish inflectional morphology for verbs, nouns and adjectives is presented. This model permits word formation based solely on morpheme concatenation, driven by a feature-based unification grammar. The run-time lexicon is a collection of allomorphs for both stems and endings. Although it has not been tested on them, the model should also be suitable for other Romance languages and for other highly inflected languages. A formalism is also described for encoding a lemma-based lexical source, well suited for expressing linguistic generalizations: inheritance classes, lemma encoding, morpho-graphemic allomorphy rules and limited type-checking. From this source base, we can automatically generate an allomorph-indexed dictionary adequate for efficient retrieval and processing. A set of software tools has been implemented around this formalism: aids for augmenting the lexical base, lexical compilers that build run-time dictionaries together with access libraries for them, feature manipulation libraries, unification and pseudo-unification modules, morphological processors, a parsing system, etc. The software interfaces among the different modules and tools are cleanly defined, so that they can be integrated and combined flexibly. Directions for accessing our e-mail and web demonstration prototypes are also provided. Some figures are given, showing the lexical coverage of our platform compared to that of some popular spelling checkers.
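As a rough illustration of word formation by morpheme concatenation checked through feature unification, the following Python sketch may help. It is not the ARIES implementation: the toy lexicon, the feature names (`conj`, `stress`, and so on) and the functions `unify` and `analyze` are all assumptions made for this example, and the unification is flat (no nested feature structures). Stems and endings are stored as allomorphs, so the diphthongizing verb "contar" contributes two stem entries, "cuent" and "cont".

```python
from typing import Dict, List, Optional

Features = Dict[str, str]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Flat unification: merge two feature sets, failing on any value clash."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None  # incompatible values for the same feature
        merged[key] = value
    return merged


# Hypothetical allomorph lexicon (illustrative only, not the ARIES data).
STEMS: Dict[str, List[Features]] = {
    "cant": [{"lemma": "cantar", "cat": "verb", "conj": "1"}],
    "com": [{"lemma": "comer", "cat": "verb", "conj": "2"}],
    # Two stem allomorphs of "contar", selected by where the stress falls.
    "cuent": [{"lemma": "contar", "cat": "verb", "conj": "1", "stress": "stem"}],
    "cont": [{"lemma": "contar", "cat": "verb", "conj": "1", "stress": "ending"}],
}

ENDINGS: Dict[str, List[Features]] = {
    "o": [{"cat": "verb", "tense": "pres", "person": "1",
           "number": "sg", "stress": "stem"}],
    "amos": [{"cat": "verb", "conj": "1", "tense": "pres", "person": "1",
              "number": "pl", "stress": "ending"}],
    "emos": [{"cat": "verb", "conj": "2", "tense": "pres", "person": "1",
              "number": "pl", "stress": "ending"}],
}


def analyze(word: str) -> List[Features]:
    """Try every stem/ending split and keep the splits whose features unify."""
    analyses: List[Features] = []
    for cut in range(1, len(word)):
        stem, ending = word[:cut], word[cut:]
        for stem_features in STEMS.get(stem, []):
            for ending_features in ENDINGS.get(ending, []):
                unified = unify(stem_features, ending_features)
                if unified is not None:
                    analyses.append(unified)
    return analyses


if __name__ == "__main__":
    for form in ("cuento", "contamos", "cuentamos"):
        print(form, analyze(form))
```

In this sketch "cuento" and "contamos" each receive one analysis with the lemma "contar", whereas the ill-formed "cuentamos" receives none, because the `stress` feature of the stem allomorph "cuent" clashes with that of the ending "amos". No graphemic rewriting happens at analysis time; all allomorphy is precomputed in the lexicon, which is the property the abstract attributes to the run-time dictionary.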

Type
Research Article
Copyright
© 1997 Cambridge University Press

Footnotes

This work has been supported in part by the Spanish Plan Nacional de I+D, through the project TIC91-0217C02-01, ARIES: An ARchitecture for Natural Language InterfacES with User Modelling.