Paraphrasing spoken Chinese using a paraphrase corpus

Published online by Cambridge University Press: 10 November 2005

YUJIE ZHANG
Affiliation:
National Institute of Information and Communications Technology, 3-5, Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0289, Japan e-mail: yujie@nict.go.jp
KAZUHIDE YAMAMOTO
Affiliation:
Nagaoka University of Technology, Niigata 940-2188 Japan e-mail: yamamoto@fw.ipsj.or.jp

Abstract

One of the key issues in spoken-language translation is how to deal with unrestricted expressions in spontaneous utterances. We have developed a paraphraser for use as part of a translation system, and in this paper we describe the implementation of a Chinese paraphraser for a Chinese-Japanese spoken-language translation system. When an input sentence cannot be translated by the transfer engine, the paraphraser automatically transforms the sentence into alternative expressions until one of these alternatives can be translated by the transfer engine. Two primary issues must be dealt with in paraphrasing: how to determine new expressions, and how to retain the meaning of the input sentence. We use a pattern-based approach in which the meaning is retained to the greatest possible extent without deep parsing. The paraphrase patterns are acquired from a paraphrase corpus and human experience. Paraphrase instances are automatically extracted from the corpus and then generalized into paraphrase patterns. A total of 1719 paraphrase patterns obtained using this method were used with the implemented paraphraser in a paraphrasing experiment. The results showed that the implemented paraphraser generated 1.7 paraphrases on average for each test sentence and achieved an accuracy of 88%.
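The control flow described in the abstract (paraphrase the input until the transfer engine accepts one alternative) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `PATTERNS`, `paraphrases`, `translate_with_paraphrasing`, and `toy_transfer` are hypothetical, the two English regular-expression patterns stand in for the 1719 Chinese patterns (which are not reproduced here), and the breadth-first search order and step limit are choices of this sketch, not details confirmed by the paper.

```python
import re
from collections import deque

# Hypothetical toy patterns: (compiled regex, replacement).
# These are NOT the paper's patterns; they only illustrate the idea.
PATTERNS = [
    (re.compile(r"^could you please (.+)$"), r"please \1"),
    (re.compile(r"^i would like to (.+)$"), r"i want to \1"),
]


def paraphrases(sentence):
    """Yield each sentence obtained by applying one pattern once."""
    for regex, replacement in PATTERNS:
        rewritten, count = regex.subn(replacement, sentence, count=1)
        if count and rewritten != sentence:
            yield rewritten


def translate_with_paraphrasing(sentence, transfer, max_steps=50):
    """Try the sentence, then its paraphrases, until one translates.

    `transfer` is assumed to return a translation string, or None when
    the sentence cannot be translated. Returns (accepted_sentence,
    translation), or None if nothing translatable is found.
    """
    queue = deque([sentence])
    seen = {sentence}
    steps = 0
    while queue and steps < max_steps:
        candidate = queue.popleft()
        steps += 1
        result = transfer(candidate)
        if result is not None:
            return candidate, result
        # Enqueue unseen alternatives (breadth-first, an assumption).
        for alternative in paraphrases(candidate):
            if alternative not in seen:
                seen.add(alternative)
                queue.append(alternative)
    return None


def toy_transfer(sentence):
    """Stand-in transfer engine that knows exactly one sentence."""
    table = {"please open the window": "<translation of: please open the window>"}
    return table.get(sentence)
```

With these toy definitions, `translate_with_paraphrasing("could you please open the window", toy_transfer)` first fails on the original wording, rewrites it with the first pattern, and succeeds on the paraphrase. An input that matches no pattern and is not translatable yields `None`.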

Type
Papers
Copyright
2005 Cambridge University Press
