Designing a machine translation system for Canadian weather warnings: A case study

FABRIZIO GOTTI; PHILIPPE LANGLAIS; GUY LAPALME

doi:10.1017/S135132491300003X

Published online by Cambridge University Press: 30 January 2013

Affiliation (all authors): RALI-DIRO – Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, Québec, Canada H3C 3J7
Email: gottif@iro.umontreal.ca, felipe@iro.umontreal.ca, lapalme@iro.umontreal.ca


Abstract

In this paper we describe the many steps involved in building a production-quality Machine Translation system for translating weather warnings between French and English. Although this task may seem straightforward in principle, the details, especially corpus preparation and final text presentation, involve many difficulties that are often glossed over in the literature. In addition to results on the classic Statistical Machine Translation evaluation metrics, four manual evaluations were performed to assess and improve translation quality. We also show that integrating out-of-domain information sources into a Statistical Machine Translation system helps produce high-quality translated text.

Information

Type: Articles
Information: Natural Language Engineering, Volume 20, Issue 3, July 2014, pp. 399–433

DOI: https://doi.org/10.1017/S135132491300003X
Copyright: © Cambridge University Press 2013
