Experiments with Language-based Aids in Information Retrieval Systems

Tove Fjeldvig; Anne Golden

doi:10.1017/S0332586500001736

Published online by Cambridge University Press: 22 December 2008

Tove Fjeldvig: Statens Datasentral a.s., Ulvenveien 89 B, N-0581 Oslo 5, Norway.
Anne Golden: Institutt for norsk som fremmedspråk, Universitetet i Oslo, Blindern, N-0316 Oslo 3, Norway.

Abstract

The fact that a lexeme can appear in various forms causes problems in information retrieval. To address this problem, we have developed methods for automatic root lemmatization, automatic truncation and automatic splitting of compound words. All three methods are based on a set of rules describing inflected and derived word forms, not on a dictionary. The methods have been tested on several collections of texts and have produced very good results. In controlled text retrieval experiments, we studied their effects on search results. The results show that both automatic root lemmatization and automatic truncation considerably improve search quality. Splitting of compound words did not give quite the same improvement, but the experiment showed that such a method could contribute to a richer and more complete search request.
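As a rough illustration of rule-based root lemmatization without a dictionary, the sketch below strips a handful of Norwegian noun endings and then groups the words that share a root form. The suffix list, the minimum stem length and the function names `root_form` and `expand_term` are assumptions made for this example only; the abstract does not give the authors' actual rules.

```python
# Hypothetical suffix rules for illustration only; not the authors' rule set.
SUFFIXES = ("ene", "en", "er", "et", "e")
MIN_STEM = 3  # assumed lower bound on stem length, to limit over-stripping


def root_form(word):
    """Strip the longest matching suffix, keeping at least MIN_STEM letters."""
    w = word.lower()
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if w.endswith(suffix) and len(w) - len(suffix) >= MIN_STEM:
            return w[: -len(suffix)]
    return w


def expand_term(term, vocabulary):
    """Return every vocabulary word that shares the search term's root form."""
    root = root_form(term)
    return sorted(w for w in vocabulary if root_form(w) == root)


vocabulary = ["bil", "bilen", "biler", "bilene", "båt", "båten"]
print(expand_term("biler", vocabulary))
```

With this toy rule set, a search for "biler" also retrieves "bil", "bilen" and "bilene", which is the kind of effect on a search request that the abstract describes. A realistic rule set would also need to handle derivation and exceptions, which this sketch ignores.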

Information

Type: Research Article
Information: Nordic Journal of Linguistics, Volume 11, Issue 1-2, June 1988, pp. 33-46

DOI: https://doi.org/10.1017/S0332586500001736
Copyright: Copyright © Cambridge University Press 1988

References

Fjeldvig, T. & Golden, A. 1984. Automatisk rotlemmatisering – et lingvistisk hjelpemiddel for tekstsøking. CompLex 9/84. Oslo: Universitetsforlaget.

Fjeldvig, T. 1986. Tekstsøking – teori, metoder og systemer. Oslo: Universitetsforlaget.

Fjeldvig, T. & Golden, A. 1986. Automatisk splitting av sammensatte ord – et lingvistisk hjelpemiddel for tekstsøking. In Karlsson, F. (1986: 73–82).

Fjeldvig, T. 1987. Effektivisering av tekstsøkesystemer. Utvikling av språkbaserte metoder. CompLex 13/87. Oslo: Universitetsforlaget.

Gavare, R. 1979. Automatisk lemmatisering utan stamlexikon – Några synspunkter tio år efteråt. In Maegaard, B. (1979: 123–131).

Hellberg, S. 1971. Automatisk lemmatisering – En modell för upprättande av böjningsserier i ett frekvenslexikon. Gøteborg: Språkdata.

Karlsson, F. (ed.) 1986. Papers from the 5th Scandinavian Conference of Computational Linguistics. Helsinki: University of Helsinki, Department of General Linguistics.

Källgren, G. 1985. En algoritm för delning av sammansatta ord i svenskan. Institutionen för lingvistik. Stockholm: Stockholms Universitet.

Maegaard, B. 1979. Nordiske datalingvistikdage i København 6–10. oktober 1979. Foredrag. København: Institut for anvendt og matematisk lingvistik, Københavns Universitet 1979.

Munthe, S. K. M. 1972. Sammensatte ord. En kvantitativ undersøkelse av norsk litteratur og sakprosa. Hovedfagsoppgave ved Nordisk institutt, Universitetet i Bergen og Oslo.

Niedermair, G. T., Thurmair, G. & Büttel, I. 1984. MARS – A Retrieval Tool on the Basis of Morphological Analysis. In van Rijsbergen (1984: 369–382).

Salton, G. 1968. Automatic Information Organization and Retrieval. McGraw-Hill Computer Series.

van Rijsbergen, C. J. 1984. Research and Development in Information Retrieval. Proceedings of the Third Joint BCS and ACM Symposium King's College, Cambridge 2–6 July 1984. British Computer Society Workshop Series.
