
Automatic annotation of context and speech acts for dialogue corpora

Published online by Cambridge University Press: 01 July 2009

KALLIRROI GEORGILA
Affiliation:
Institute for Creative Technologies, University of Southern California, 13274 Fiji Way, Marina del Rey, CA 90292, USA. e-mail: kgeorgila@ict.usc.edu
OLIVER LEMON
Affiliation:
School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh, EH8 9AB, UK. e-mail: olemon@inf.ed.ac.uk
JAMES HENDERSON
Affiliation:
Department of Computer Science, University of Geneva, Battelle bâtiment A, 7 route de Drize, 1227 Carouge, Switzerland. e-mail: james.henderson@cui.unige.ch
JOHANNA D. MOORE
Affiliation:
School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh, EH8 9AB, UK. e-mail: j.moore@ed.ac.uk

Abstract

Richly annotated dialogue corpora are essential for new research directions in statistical learning approaches to dialogue management, context-sensitive interpretation, and context-sensitive speech recognition. In particular, large dialogue corpora annotated with contextual information and speech acts are urgently required. We explore how existing dialogue corpora (usually consisting of utterance transcriptions) can be automatically processed to yield new corpora where dialogue context and speech acts are accurately represented. We present a conceptual and computational framework for generating such corpora. As an example, we present and evaluate an automatic annotation system which builds ‘Information State Update’ (ISU) representations of dialogue context for the Communicator (2000 and 2001) corpora of human–machine dialogues (2,331 dialogues). The purposes of this annotation are to generate corpora for reinforcement learning of dialogue policies, for building user simulations, for evaluating different dialogue strategies against a baseline, and for training models for context-dependent interpretation and speech recognition. The automatic annotation system parses system and user utterances into speech acts and builds up sequences of dialogue context representations using an ISU dialogue manager. We present the architecture of the automatic annotation system and a detailed example to illustrate how the system components interact to produce the annotations. We also evaluate the annotations, with respect to the task completion metrics of the original corpus and in comparison to hand-annotated data and annotations produced by a baseline automatic system. The automatic annotations perform well and largely outperform the baseline automatic annotations in all measures. The resulting annotated corpus has been used to train high-quality user simulations and to learn successful dialogue strategies. The final corpus will be made publicly available.
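The core procedure described in the abstract, parsing each utterance into a speech act and incrementally updating an Information State representation of the dialogue context, can be illustrated with a minimal sketch. The sketch below is hypothetical: the state fields, the keyword-based speech-act tagger, and the slot-filling rule are placeholders invented for illustration only, and do not reproduce the authors' actual ISU-based annotation system.

```python
# A minimal, hypothetical sketch of an Information State Update (ISU)
# annotation loop. All names, state fields, and rules are illustrative
# placeholders, not the authors' implementation.

from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InformationState:
    """A toy dialogue context: filled slots plus a history of speech acts."""
    filled_slots: Dict[str, str] = field(default_factory=dict)
    speech_act_history: List[str] = field(default_factory=list)


def classify_speech_act(speaker: str, utterance: str) -> str:
    """Stand-in for a speech-act parser; a real system would use a
    grammar- or classifier-based tagger rather than keyword rules."""
    text = utterance.lower()
    if text.endswith("?"):
        return f"{speaker}:request_info"
    if any(word in text for word in ("yes", "no")):
        return f"{speaker}:confirm"
    return f"{speaker}:provide_info"


def annotate_dialogue(turns: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Replay a transcribed dialogue, producing one (speech act, context
    snapshot) record per turn, in the spirit of automatic ISU annotation."""
    state = InformationState()
    annotations = []
    for turn in turns:
        act = classify_speech_act(turn["speaker"], turn["text"])
        state.speech_act_history.append(act)
        # Very crude slot filling, purely for illustration.
        if "boston" in turn["text"].lower():
            state.filled_slots["dest_city"] = "Boston"
        annotations.append({
            "speech_act": act,
            "context": {
                "filled_slots": dict(state.filled_slots),
                "history": list(state.speech_act_history),
            },
        })
    return annotations


if __name__ == "__main__":
    dialogue = [
        {"speaker": "system", "text": "Where would you like to fly to?"},
        {"speaker": "user", "text": "I want to fly to Boston."},
    ]
    for record in annotate_dialogue(dialogue):
        print(record["speech_act"], record["context"]["filled_slots"])
```

In the system described in the paper the speech-act parser and the context-update rules are of course far richer, but the shape of the computation is the same: replaying a transcribed dialogue through an update loop yields, for every utterance, both a speech-act label and a snapshot of the dialogue context at that point, which is exactly the annotated data needed for training user simulations and learning dialogue policies.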

Type: Papers
Copyright: © Cambridge University Press 2009
