1. Introduction
Abstractive summarization is the task of creating a short, accurate, informative, and fluent summary from a longer text document. It attempts to reproduce the semantics and topics of the original text by paraphrasing. Recently, sequence to sequence models (Rush et al. 2015; Chopra et al. 2016; Nallapati et al. 2016; See et al. 2017; Paulus et al. 2018) have made great progress on abstractive summarization. A recent study (Bai et al. 2018) suggests that, without additional, complicated structures or features, convolutional sequence to sequence (CNN seq2seq) models (Gehring et al. 2017; Fan et al. 2018; Liu et al. 2018) are more effective and can be trained much faster than recurrent neural networks (RNNs) due to their intrinsic parallel nature. Furthermore, unlike RNN-based models, the convolutional models have more stable gradients because of their shorter backpropagation paths. Self-attention-based models are the basis of many recent state-of-the-art (SOTA) systems, but they typically require multiple self-attention layers and have greater computational complexity than CNN seq2seq models (Vaswani et al. 2018). Thus, we take CNN seq2seq models as the target model to improve on and compare with in this paper.
Unfortunately, just like RNN-based models, CNN-based models also produce summaries with substantial repeated word sequences, which impairs reading efficiency. Table 1 illustrates one test case from the CNN/Daily Mail summarization dataset. In this case, the basic CNN model produces two identical sentences (italicized) in the result. Unlike machine translation or paraphrasing, in which the output words and input words are almost one-to-one aligned, the output of summarization is “compressed” from the input document. Naturally, every sentence or word sequence in the summary corresponds to one or more places in the source document. If two identical word sequences appear in the summary, they are likely looking at and summarizing the same “spots” in the source. This is evident from the attention map for the three sentences generated by CNN, shown in Figure 1. The first and third sentences attend to the same location in the source (red boxes), while the second sentence attends to a separate location (green box). The two attention maps in the red boxes are very similar.
Driven by this intuition, a few efforts have been made on “remembering” what has been focused on before during decoding. For example, Paulus et al. (2018) and Fan et al. (2018) use intra-temporal attention (Nallapati et al. 2016) as well as intra-decoder attention to avoid attending to the same parts of the source by revising attention scores while decoding. See et al. (2017) and Gehrmann et al. (2018) respectively propose a coverage mechanism and a coverage penalty, which record the sum of attention distributions of all previously generated words in different ways to track the summarized information. While these approaches discourage repetition to some extent, they do so in an indirect manner. That is, they do not make use of the attention information in the source directly. Consequently, they may still generate repeated phrases, especially in long sentences (shown in the first five sections of Table 2).
In this paper, we propose an attention filter mechanism that directly redistributes the attention from each word in the output summary to the source. It does so by computing the parts of interest (POIs) in the source per segment in the summary and then minimizing the attention scores of words in those POIs that have already been attended to by the preceding segments during decoding. POIs are the segments of the source document that are attended by the segments in its corresponding summary, such as the green and red segments of the source document in Table 1 and Figure 1. Different segments in the summary thus do not attend to the same semantic spots of the source, and repetition is reduced. Segments can be obtained in different ways; Table 3 compares the different segment types. The baseline with a sentence as the segment (sentence-level segment) often loses important information from the reference summary, such as “silas randall timberlake.” The first sentence in the summary generated with sentence-level segments attends to the second sentence in the source. The attention score of that source sentence is then minimized, and it is no longer attended, so the model with sentence-level segments loses the important information “silas randall timberlake” during decoding. The baseline with an N-gram as the segment (N-gram segment) may cause grammatical and semantic problems. Suppose that N equals 3; as shown in Table 3, the green part of the summary generated with N-gram segments does not attend to “the couple announced” in the source document. Because an N-gram cannot be regarded as a complete and accurate semantic unit, the decoder of the model with N-gram segments attends to source segments with inaccurate grammar and semantics, and the generated summary therefore contains grammatical and semantic errors. We instead use punctuation to separate the source or target into segments (punctuation-based segments), since punctuation plays an important role in written language in organizing grammatical structure and clarifying the meaning of sentences (Briscoe 1996; Kim 2019; Li et al. 2019b). It is very simple but effective. In this paper, a segment means a sentence or clause delimited by punctuation, which carries syntactic and semantic information. Specifically, we calculate the attention in terms of segments (semantic units larger than tokens and smaller than sentences), which intuitively helps with the emphasis of attention and POIs in the source. This differs from previous approaches, none of which exactly pinpoints these parts in the source, which we believe is critical in reducing repetition.
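For illustration, a minimal sketch of punctuation-based segmentation is given below (the token-level splitting and the punctuation set are simplifying assumptions; the actual preprocessing may differ):

PUNCT = {".", ",", ";", ":", "!", "?"}   # assumed punctuation set

def split_into_segments(tokens):
    """Split a whitespace-tokenized text into segments delimited by punctuation."""
    segments, current = [], []
    for tok in tokens:
        if tok in PUNCT:
            if current:
                segments.append(current)
            current = []
        else:
            current.append(tok)
    if current:  # trailing segment without closing punctuation
        segments.append(current)
    return segments

# Example: yields two segments, "the couple announced the arrival of their son"
# and "the couple announced the pregnancy in january".
print(split_into_segments(
    "the couple announced the arrival of their son . "
    "the couple announced the pregnancy in january .".split()))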
Despite the above effort, there are cases where similar sentences exist in the same source document:
Example 1. “the standout fixture in the league on saturday sees leaders chelsea welcome manchester united … chelsea midfielder oriol romeu, currently on loan at stuttgart, … romeu is currently on a season-long loan at bundesliga side stuttgart.”
In this case, even if the decoder attends to different POIs of the source document as it produces words, repetition may still result: at different time steps, the decoder may attend to similar sentences in different positions. One potential solution is semantic cohesion loss (SCL) (Çelikyilmaz et al. 2018), which takes the cosine similarity between two consecutively generated sentences as part of the loss function. It may still attend to the same POI and generate similar sentences (SCL row in Table 2). Another is the Diverse Convolutional Seq2Seq Model (DivCNN) (Li et al. 2019a), which introduces determinantal point processes (DPPs) (Kulesza and Taskar 2011) into deep neural network (DNN) attention adjustment. DPPs can generate subsets of the input with both high quality and high diversity (QD-score). For abstractive summarization, DivCNN takes hidden states of the DNN as the QD-score. DivCNN first selects the attention distribution of the source-document subsets with a high QD-score and then adds the selected attention distribution to the model loss as a regularizer. DivCNN does not directly redistribute the attention, so it may still attend to similar POIs. To improve the QD-score, DivCNN tends to attend to scattered subsets of sentences in the source document, which leads to semantic incoherence. As shown in Table 2 (DivCNN row), the content about the 16-year-old is inconsistent with the source document. Besides, the trigram decoder (TRI) (Paulus et al. 2018) directly forbids repetition of previously generated trigrams at test time. While this simple but crude method avoids repeats of any kind completely, it ignores the fact that some amount of repetition may exist in natural summaries. On the other hand, meddling with sentence generation during beam search causes another problem: it tends to generate sentences that are logically incorrect. In Table 2 (TRI row), the defender dayot didn’t really play for France, according to the source; that is, the subject and object are mismatched. As a trigram cannot reflect complete semantic information, the trigram decoder is likely to generate logically incorrect summaries due to this trigram-based meddling with sentence generation during testing. In order to avoid the logical incorrectness caused by the trigram decoder, we introduce a sentence-level backtracking decoder (SBD) that prohibits repetition of the same sentence at test time. Compared with the trigram decoder, SBD can avoid repetition and generate more logical summaries. Our summary produced for the example is shown in the last section of Table 2.
Reducing repetition in abstractive summarization provides high-quality summaries for users and improves their reading efficiency. We expect that other natural language generation (NLG) tasks that suffer from the repetition problem can also benefit from our approach. Our contributions are summarized as follows:
-
(1) We identify the reasons behind the repetition problem in abstractive summaries generated by CNN seq2seq models by observing the attention maps between source documents and summaries.
-
(2) We propose an effective approach that redistributes attention scores at training time and prevents repetition by sentence-level backtracking at test time, reducing repetition in the CNN seq2seq model.
-
(3) Our approach substantially outperforms the SOTA repetition reduction approaches on CNN-based models in all evaluation metrics, including ROUGE scores, repeatedness, and readability.
Next, we present the basic CNN seq2seq model and our extension, followed by the evaluation of our approach and a discussion of related work.
2. Approach
In this section, we describe the model architecture used for our experiments and propose our novel repetition reduction method, which is an extension to the basic model.
In the summarization task, the input (source document) and output (summary) are both sequences of words. Suppose the input and output are respectively represented as $\textbf{x} = (x_{1},x_{2},...,x_{m})$ and $\textbf{y} = (y_{1}, y_{2},..., y_{n})$ ( $m>n$ ); the goal is to maximize the conditional probability $p(\textbf{y}|\textbf{x})$ .
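Following the standard autoregressive factorization used by seq2seq models, this probability can be written as
\begin{equation}p(\textbf{y}|\textbf{x}) = \prod_{i=1}^{n} p(y_{i}|y_{1},...,y_{i-1},\textbf{x})\end{equation}
so that each output word is predicted conditioned on the previously generated words and the source document.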
Furthermore, we aim to generate summaries that are not only fluent and logically consistent with the source documents, but also with a small amount of repeatedness, which is natural in human-written summaries.
2.1. Basic CNN seq2seq model
Our basic model is a multi-layer convolutional seq2seq network (Gehring et al. 2017) with an attention mechanism, as illustrated in Figure 2.
For CNN seq2seq models, we combine word embeddings and position embeddings to obtain input $\mathbf{X} = (X_1,...,X_m)$ and output $\mathbf{Y}=(Y_1,...,Y_n)$ . We denote $\mathbf { z } ^ { l } = ( z _ { 1 } ^ { l } , \ldots , z _ { m } ^ { l } )$ and $\mathbf { h } ^ { l } = ( h _ { 1 } ^ { l } , \ldots , h _ { n } ^ { l } )$ respectively as the convolutional outputs of the encoder and decoder in the l-th layer. Each element of the output generated by the decoder network is fed back into the next layer of the decoder network. In each layer, GLU (Dauphin et al. 2017) and residual connections (He et al. 2016) are used, respectively, as a nonlinear gate and as a guarantee of sufficient depth of the network.
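A standard instantiation of this computation, following Gehring et al. (2017), is
\begin{equation}h_{i}^{l} = \mathrm{GLU}\left(W^{l}\left[h_{i-k/2}^{l-1},...,h_{i+k/2}^{l-1}\right] + b^{l}\right) + h_{i}^{l-1}\end{equation}
(the encoder states $z_{i}^{l}$ are computed analogously),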
where k is kernel width. W and b are trainable parameters.
For each decoder layer, the multi-step attention integrates encoder information. We compute decoder state $d_{i}^{l}$ for attention via
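a combination of the current decoder output and the previous target embedding. A standard form, following Gehring et al. (2017), is
\begin{equation}d_{i}^{l} = W_{d}^{l}h_{i}^{l} + b_{d}^{l} + g_{i}, \qquad a_{ij}^{l} = \frac{\exp\left(d_{i}^{l}\cdot z_{j}^{u}\right)}{\sum_{t=1}^{m}\exp\left(d_{i}^{l}\cdot z_{t}^{u}\right)}, \qquad c_{i}^{l} = \sum_{j=1}^{m}a_{ij}^{l}\left(z_{j}^{u}+e_{j}\right)\end{equation}
with $g_{i}$ the embedding of the previous target element and $e_{j}$ the input embedding of the j-th source token,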
where $d_{i}^{l}$ is the decoder state, $z_{j}^{u}$ is the encoder state, u is the last layer of the encoder, and $a_{ij}^{l}$ is the attention score. The inner product between decoder states and encoder outputs is used to measure their affinity. The conditional input to the current decoder layer is a weighted sum of the encoder states and input representations. We obtain $H^l_i$ by adding $c _ { i } ^ { l }$ to $h_{i}^{l}$ , which forms the input for the next decoder layer or the final output.
Finally, we compute the probability distribution for the next word using the top decoder output:
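in the standard formulation, for example,
\begin{equation}p\left(y_{i+1}|y_{1},...,y_{i},\textbf{x}\right) = \mathrm{softmax}\left(W_{o}H_{i}^{L}+b_{o}\right)\end{equation}
where $W_{o}$ and $b_{o}$ are trainable parameters and $H_{i}^{L}$ is the output of the top decoder layer.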
2.2. Attention filter mechanism (ATTF)
We propose an ATTF as a novel extension to the basic model, which can directly record previously attended locations in the source document and generate summaries with a natural level of repeatedness. This method aims at relieving the repetition problem caused by the decoder attending to the same POI in the source document.
2.2.1. Notations
In this mechanism, both the source document and the summary are split into segments by punctuation. For convenience of description, we refer to the segments in the source document as “sections” and the segments in the summary as “segments.” We denote the punctuation marks as <S>. $\mathbf{u}=(u_{0},u_{1},...,u_{M})$ denotes the positions of <S> in the source document and $\mathbf{v}=(v_{0},v_{1},...,v_{N})$ denotes the positions of <S> in the summary. Both $u_{0}$ and $v_{0}$ are $-1$ . Therefore, we represent the source document as $\mathbf{U}=(U_{1},U_{2},...,U_{M})$ and the summary as $\mathbf{V}=(V_{1},V_{2},...,V_{N})$ . $U_i$ is the i-th section and $V_i$ is the i-th segment. Both $U_i$ and $V_i$ are sequences of tokens without punctuation tokens.
Let D denote the number of tokens in the source document. $a_i^l=(a_{i1}^l, a_{i2}^l,..., a_{ij}^l,..., a_{iD}^l)$ is a D-dimensional vector that records the attention scores in l-th layer of the i-th token in the summary over tokens in the source document. We define segment attention vector in the l-th layer as $A^{l}=(A_{0}^{l}, A_{1}^{l},..., A_{N}^{l})$ . For s-th segment $V_s$ , $A_s^l=(A^l_{s1}, A^l_{s2},...,A^l_{sD})$ is a vector representing segment attention distribution over tokens in the source document. $A^l_{sj}$ is the attention score between $V_s$ and j-th token in the source document.
2.2.2. Description
To measure the relevance between the j-th token of the source document and the s-th segment $V_s$ , we sum up the attention scores of each token in the s-th segment over the j-th token of the source document.
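Taking the s-th segment to span target positions $v_{s-1}+1$ to $v_{s}-1$ , this can be written as
\begin{equation}A_{sj}^{l} = \sum_{i=v_{s-1}+1}^{v_{s}-1} a_{ij}^{l}\end{equation}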
We set $A_{0}^{l}$ as a zero vector, because nothing is attended before generating the first segment.
To find the most attended sections for segment $V_s$ , we sort the elements of the segment attention vector $A_{s}^{l}$ in descending order and record the top k elements’ positions in the source document as $\mathbf{p}=(p_{1},...,p_{k})$ , where $k=v_{s}-v_{s-1}-1$ . In other words, $\mathbf{p}$ records the positions of the k source words most attended by $V_s$ . k equals the number of tokens in the s-th segment, because each token in a segment is aligned with at least one token in the source document, according to the principle of the seq2seq model. Thus, for the s-th segment, we can find its most attended sections via $\mathbf{p}$ . We locate the elements at $\mathbf{p}$ in the source document as well as the sections they belong to. For section $U_t$ , we take the tokens that belong to $U_t$ and are located at the positions in $\mathbf{p}$ as $P_{U_{t}}$ . If the size of $P_{U_{t}}$ is larger than $\beta$ , a predefined constant, the section $U_{t}$ is a POI of segment $V_{s}$ , which should not be attended to again. $\mathbb{U}_{s}$ denotes the set of all such POIs for $V_s$ . $\mathbb{U}_{0}$ is an empty set.
We construct two multi-hot vectors $g_{s}$ and $g^{\prime}_{s}$ for each segment $V_{s}$ . Their dimension is the number of tokens in the source document, D, the same as the dimension of $A_{s}^{l}$ . For $g_{s}$ , we set the elements at the positions of tokens belonging to sections in $\mathbb{U}_{s}$ to 0, and the elements at other positions to 1. $g_{sj}$ is the j-th element of $g_s$ . If $g_{sj}=0$ , it means that the j-th token is attended by segment $V_s$ during the generation of $V_s$ . $g^{\prime}_{sj}$ is the j-th element of $g^{\prime}_{s}$ , which is the flipped version of $\prod \limits_{q=1}^{s}g_{qj}$ ; in other words, $g^{\prime}_{sj}$ is $1-\prod \limits_{q=1}^{s}g_{qj}$ . If $\prod \limits_{q=1}^{s}g_{qj}=0$ and $g^{\prime}_{sj}=1$ , it means that the j-th token of the source document has been attended before. The filter on $a_{ij}^{l}$ in Equation (4) is given as:
where $\tilde{a}_{ij}^l$ is the filtered attention score. $A_{sj}$ is the attention score between the j-th token of the source document and the s-th segment. $g_{sj}$ and $g_{sj}^{\prime}$ denote whether the j-th token of the source document has been attended. We penalize the attention scores of already attended tokens in the source document: we take the minimum attention score between tokens in the source document and the summary (i.e., $\min \limits_{A_{s}}(\frac{A_{sj}^{l}}{v_{s}-v_{s-1}-1})$ ) as the attention score between the i-th token in the target and the attended tokens in the source. Equation (5) is then computed with the filtered attention scores $\tilde{a}_{ij}^{l}$ in place of $a_{ij}^{l}$ .
By using segment-wise attention and revising the attention scores of attended POIs directly, our model optimizes the attention distribution between the encoder states and decoder states in such a way that the alignment between source document and summary is enhanced and the noise in the attention over encoder outputs is reduced. As shown in Table 4, the segments in the example are separated by punctuation. For the basic CNN model, the second and third sentences repeatedly attend to the fifth segment in the source document. After applying the ATTF model, the attention scores of the third and fifth segments in the source document are penalized while generating the words of the third sentence of the ATTF summary. The last sentence of the summary generated by ATTF attends to the seventh segment in the source.
The ATTF helps avoid repeatedly attending to the same POIs and therefore avoid repetition in summary generation.
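A simplified sketch of this filtering step is given below (illustrative only: the function and variable names are our own, and the penalty value simply follows the description above rather than the exact implementation):

import numpy as np

def poi_sections(attn, src_section_id, seg_bounds, s, beta=3):
    """Return the source sections (POIs) most attended by summary segment s.

    attn: (T, D) attention scores of target tokens over D source tokens (one layer).
    src_section_id: (D,) section index of each source token (punctuation-delimited).
    seg_bounds: positions v_0,...,v_N of punctuation tokens in the summary (v_0 = -1).
    """
    start, end = seg_bounds[s - 1] + 1, seg_bounds[s]      # tokens of segment s
    A_s = attn[start:end].sum(axis=0)                      # segment attention over source
    k = end - start                                        # k = v_s - v_{s-1} - 1
    top = np.argsort(-A_s)[:k]                             # positions p of the top-k scores
    counts = {}
    for j in top:
        sec = int(src_section_id[j])
        counts[sec] = counts.get(sec, 0) + 1
    return {sec for sec, c in counts.items() if c > beta}  # sections with |P_{U_t}| > beta

def filter_attention(a_i, attended_sections, src_section_id, penalty):
    """Penalize the scores of source tokens whose section has already been attended."""
    a_i = a_i.copy()
    mask = np.isin(src_section_id, list(attended_sections))
    a_i[mask] = penalty          # e.g. the minimum averaged segment attention score
    return a_i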
2.3. Sentence-level backtracking decoder (SBD)
To tackle repeated sentences or phrases in the source (Example 1), we propose an SBD.
At test time, we prevent the decoder from generating identical or very similar sentences more than once via backtracking. An intuitive solution is to backtrack the generation process to the beginning of the repeated segment and regenerate it by following the second best choice in the beam. We call this simple approach SBD-b1. However, this is suboptimal because the parents of the current top b choices may not include all the top b choices at the parent level, where b is the beam size. As shown in Figure 4, if level 3 is the beginning of the repeated segment, the first choices at levels 1 and 2 have already been excluded by beam search.
An alternative approach (SBD-b2) backtracks all the way until the current top b choices all share the same prefix token sequence. This means that the current best choices in the beam reach some consensus that the generated prefix summary is good and should be retained. While this algorithm backtracks further and may include better choices, it does not completely solve the problem of SBD-b1. As shown in Figure 4, if level 3 is the beginning of the repeated segment and the second choice at level 1 is the only prefix token sequence shared by the top b choices at level 2, then the first and third choices at level 1 are excluded by beam search after generating words based on the second choice at level 1.
Our best approach (SBD) backtracks to the beginning of the whole summary and regenerates all the choices in the beam up to the point before the repeated segment. That way, all the best choices are known to the algorithm, and we can make an optimal choice after excluding the first word of the previously repeated segment. As shown in Table 5, SBD-b1 and SBD-b2 backtrack the generation process to “january.” and “son.” respectively. The summaries generated by SBD-b1 and SBD-b2 are incoherent and inconsistent with the source document. Our best approach (SBD) saves the sequence before the repeated segment, that is, “the couple announced the arrival of their son. the couple announced the pregnancy in january.”, backtracks to the beginning of the summary, and regenerates the summary. When the saved sequence reappears in the beam, we remove the first word (“the”) of the repeated segment from the candidate vocabulary. Compared with SBD-b1 and SBD-b2, SBD generates more fluent and coherent summaries.
To determine whether two sentences, p and q, are similar, we define a boolean function as:
where o(p,q) denotes the length of the longest common substring (LCS) between p and q, l is the minimum of the lengths of p and q, and n is a constant. $sim(p,q)=1$ means the two sentences are similar.
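One plausible instantiation of this check (the exact thresholding form of Equation (11) may differ) treats two sentences as similar when their longest common substring covers at least min(l, n) tokens:

def longest_common_substring(p, q):
    """Length o(p, q) of the longest run of consecutive tokens shared by p and q."""
    best = 0
    dp = [[0] * (len(q) + 1) for _ in range(len(p) + 1)]
    for i in range(1, len(p) + 1):
        for j in range(1, len(q) + 1):
            if p[i - 1] == q[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                best = max(best, dp[i][j])
    return best

def sim(p, q, n=5):
    """1 if sentences p and q (token lists) are considered similar, else 0."""
    l = min(len(p), len(q))
    return 1 if longest_common_substring(p, q) >= min(l, n) else 0

# The two sentences share the 6-token run "the couple announced the pregnancy in",
# so sim returns 1 with n = 5.
a = "the couple announced the pregnancy in january".split()
b = "the couple announced the pregnancy in a statement".split()
print(sim(a, b))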
SBD cooperates with ATTF in reducing the repetition caused by noise in the dataset. Compared with TRI, SBD does not interrupt the beam search process in the middle of a sentence, hence significantly reducing related grammatical and factual errors. As shown in Table 5, the summary generated by SBD is grammatical and factual. Besides, SBD is capable of producing a more informative summary since it gives more chances to other candidate sentences.
3. Evaluation
In this section, we introduce the experimental set-up and analyze the performance of different models.
3.1. Datasets
CNN/Daily Mail (Hermann et al. 2015) is a popular summarization dataset, which contains news articles paired with summaries. There are 286,817 training pairs, 13,368 validation pairs, and 11,487 test pairs. Table 1 shows an example pair from the training data. We follow See et al. (2017) in data preprocessing and use the non-anonymized version, which fills in the blanks with the answer named entities.
We also tried our model on two other news summarization datasets, Newsroom (Grusky et al. 2018) and DUC 2002. Newsroom contains 1,321,995 document–summary pairs, which are divided into training (76%), development (8%), test (8%), and unreleased test (8%) sets. At test time, we use the released 8% test data. DUC 2002 (DUC) is a test set of document–summary pairs. We use the models trained on CNN/Daily Mail to test on DUC and demonstrate the generalization ability of the models.
3.2. Model parameters and evaluation metrics
In the following experiments, we tokenize source documents and targets using the word tokenization method from NLTK (Natural Language Toolkit), a widely used toolkit for natural language processing (NLP). All the competing models contain eight convolutional layers in both encoders and decoders, with a kernel width of 3. For each convolutional layer, we set the hidden state size to 512 and the embedding size to 256. To alleviate overfitting, we apply dropout ( $p=0.2$ ) to all convolutional and fully connected layers. Similar to Gehring et al. (2017), we use Nesterov’s accelerated gradient method (Sutskever et al. 2013) with gradient clipping $0.1$ (Pascanu et al. 2013), momentum $0.99$ , and initial learning rate $0.2$ . Training terminates when the learning rate $\le 10e-5$ . The beam size is $b=5$ at test time.
We set the threshold $\beta$ to 3, because nearly 90% of sections have length $>=$ 3. We set n (Equation (11)) to 5, since fewer than 5% of reference summaries have an LCS of less than 5. We use the following evaluation metrics:
-
ROUGE scores (F1), including ROUGE-1 (R-1), ROUGE-2 (R-2), and ROUGE-L (R-L) (Lin 2004). ROUGE-1 and ROUGE-2, respectively, measure the overlap of unigrams (single words) and bigrams between the generated summaries and reference summaries. ROUGE-L is based on Longest Common Subsequence (LCS) statistics. ROUGE-2 is the most popular metric for summarization.
-
Repeatedness (Rep) includes N-gram repeatedness, sentence repeatedness, and total repeatedness, which reflect the effectiveness of different methods at repetition reduction (a computational sketch is given at the end of this subsection).
-
– N-gram repeatedness is the percentage of repeated N-grams in a summary:
(12) \begin{equation}Rep_{ngram} = \frac{n_{ngram}}{N_{ngram}}\end{equation}
where $n_{ngram}$ is the number of repeated N-grams and $N_{ngram}$ is the total number of N-grams in a summary.
-
– Sentence repeatedness is the percentage of repeated sentences in a summary:
(13) \begin{equation}Rep_{sent} = \frac{n_{sent}}{N_{sent}}\end{equation}
where $n_{sent}$ is the number of repeated sentences and $N_{sent}$ is the total number of sentences in a summary. For sentence repeatedness, if two sentences contain the same trigram, they are counted as repetitive sentences.
-
– Total repeatedness (Algorithm 1) is a comprehensive score that unifies word-level and sentence-level repeatedness. It is not computed from the N-gram and sentence repeatedness scores.
-
Repeatedness Correlation measures how well the total repeatedness scores of summaries generated by each model correlate with the total repeatedness scores of the reference summaries. The more correlated the generated and reference summaries are, the better the generated summaries. The correlation is evaluated with a set of three metrics: Pearson correlation (r), Spearman rank coefficient ( $\rho$ ), and Kendall’s tau coefficient ( $\tau$ ). Given the total repeatedness scores of reference summaries (ref) and their corresponding generated summaries (gen), $X=score(ref)=(x_1, x_2,..., x_n)$ and $Y=score(gen)=(y_1, y_2,..., y_n)$ , we obtain paired data $(X,Y)=\{(x_1, y_1), (x_2, y_2),..., (x_n, y_n)\}$ , where n is the number of pairs.
-
– For Pearson correlation (r),
(14) \begin{equation}r = \frac{\sum_{i=1}^{n}(x_i - \overline{X})(y_i - \overline{Y})} {\sqrt{\sum_{i=1}^{n}(x_i - \overline{X})^{2}\cdot\sum_{i=1}^{n}(y_i - \overline{Y})^{2}}}\end{equation}
where $\overline{X}$ and $\overline{Y}$ are the means of X and Y.
-
– For Spearman rank coefficient,
(15) \begin{equation}\rho = \frac{\sum_{i=1}^{n}(R(x_i) - \overline{R(X)})(R(y_i) - \overline{R(Y)})} {\sqrt{\sum_{i=1}^{n}(R(x_i) - \overline{R(X)})^{2} \cdot\sum_{i=1}^{n}(R(y_i)-\overline{R(Y)})^{2}}}\end{equation}
where $R(x_i)$ and $R(y_i)$ are the ranks of $x_i$ and $y_i$ , and $\overline{R(X)}$ and $\overline{R(Y)}$ are the mean ranks of X and Y.
-
– For Kendall’s tau coefficient,
(16) \begin{equation}\tau = \frac{n_c - n_d}{n_c + n_d} = \frac{n_c - n_d}{n(n-1)/2}\end{equation}
where $n_c$ is the number of concordant pairs and $n_d$ is the number of discordant pairs. Consider any pair of total repeatedness scores $(x_{i},y_{i})$ and $(x_{j},y_{j})$ with $i<j$ . The pair is concordant if both $x_{i}>x_{j}$ and $y_{i}>y_{j}$ , or both $x_{i}<x_{j}$ and $y_{i}<y_{j}$ . It is discordant if $x_{i}>x_{j}$ and $y_{i}<y_{j}$ , or $x_{i}<x_{j}$ and $y_{i}>y_{j}$ . If $x_{i}=x_{j}$ or $y_{i}=y_{j}$ , the pair is neither concordant nor discordant.
-
Readability (Readable) is a human evaluation, which can be used as a supplement to ROUGE. We instruct human annotators to assess each summary from four independent perspectives:
-
– (1) Informative: How informative is the summary? Is it logically consistent with the source document?
-
– (2) Coherent: How coherent (across sentences) is the summary?
-
– (3) Fluent: How grammatical are the sentences of the summary?
-
– (4) Factual: Are there any factual errors in the summary?
The readability score is judged on the following five-point scale: Very Poor (1.0), Poor (2.0), Barely Acceptable (3.0), Good (4.0), and Very Good (5.0). The score reflects the fluency and readability of the summary.
-
We use readability to complement ROUGE scores, since Yao et al. (2017) showed that the standard ROUGE scores cannot capture grammatical or factual errors. We randomly sample 300 summaries generated by each model and manually check their readability. Each summary is scored by four judges proficient in English. The Cohen’s Kappa coefficient between them is $0.66$ , indicating agreement. Here we use the average annotation score.
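A sketch of how the repeatedness and correlation metrics can be computed is shown below (the counting of “repeated” items is one plausible reading of the definitions above, and the total repeatedness of Algorithm 1 is omitted; the correlation coefficients use scipy):

from collections import Counter
from scipy.stats import pearsonr, spearmanr, kendalltau

def ngram_repeatedness(tokens, n=3):
    """Fraction of n-gram occurrences in a summary that belong to a repeated n-gram."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)

def sentence_repeatedness(sentences, n=3):
    """Fraction of sentences that share a trigram with another sentence of the summary."""
    tri = [set(tuple(s[i:i + n]) for i in range(len(s) - n + 1)) for s in sentences]
    repeated = sum(1 for i, t in enumerate(tri)
                   if any(t & tri[j] for j in range(len(tri)) if j != i))
    return repeated / len(sentences) if sentences else 0.0

def repeatedness_correlation(ref_scores, gen_scores):
    """Pearson r, Spearman rho, and Kendall tau between reference and generated scores."""
    return (pearsonr(ref_scores, gen_scores)[0],
            spearmanr(ref_scores, gen_scores)[0],
            kendalltau(ref_scores, gen_scores)[0])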
3.3. Baselines
Our goal is to evaluate the effectiveness of our repetition reduction technique. We need to select a basic model and implement different repetition reduction methods on top of it. After applying different repetition reduction methods, the basic model should largely reflect the differences in the effectiveness of these methods. The basic model also needs to have high training efficiency (high speed and low memory usage), so that we are not limited by computing resources, while ensuring the quality of generation.
We choose to implement all existing repetition reduction techniques on top of the vanilla CNN seq2seq model, because the vanilla CNN seq2seq model is fast and enjoys the best accuracy among vanilla recurrent seq2seq models such as the plain RNN and LSTM seq2seq models (Bai et al. 2018; Gehring et al. 2017). The vanilla CNN seq2seq model and the vanilla self-attention-based model have similar feature capture capabilities. With long inputs, self-attention-based models such as the Transformer have greater computational complexity (Vaswani et al. 2018). As the inputs of summarization are very long, self-attention-based models always need much more time during training and testing. Besides, self-attention-based models contain more trainable parameters, which require more memory at training and test time.
We did not implement the repetition reduction methods on top of seq2seq models with higher ROUGE scores, because the effectiveness of repetition reduction is not necessarily reflected in ROUGE (See et al. 2017; Paulus et al. 2018; Fan et al. 2018). As shown in Table 6, after reducing repetition, the summary becomes better, but the ROUGE score is not improved. Therefore, our evaluation mainly compares the effectiveness of different repetition reduction techniques in terms of all four metrics above. As is known, ROUGE is not very good at evaluating abstractive summarization, and the room for improvement on the ROUGE scores is very limited. If the repetition reduction methods were applied on top of models with higher ROUGE scores, the differences in ROUGE scores produced by these repetition reduction techniques would be indistinguishable and would complicate the analysis. Hence, to be fair, we construct our baselines in this work by converting seven repetition reduction techniques developed on RNN seq2seq models to their counterparts on the vanilla CNN seq2seq model. The baselines are as follows:
-
CNN is the original convolutional seq2seq model (Gehring et al. 2017).
-
ITA integrates intra-temporal attention (Nallapati et al. 2016) into the CNN seq2seq model, which normalizes attention values using the attention history over time steps.
-
ITDA adds an intra-decoder attention mechanism (Paulus et al. 2018) on top of ITA, which also normalizes attention values using past decoder states. It is transferred to the CNN seq2seq model in Fan et al. (2018).
-
COV adopts the coverage mechanism (See et al. 2017), where repeatedly attending to the same locations is penalized in the form of a coverage loss.
-
COVP adds the coverage penalty (Gehrmann et al. 2018) to the loss function, which increases whenever the decoder repeatedly attends to the same locations of the source document.
-
SCL adds semantic cohesion loss (Çelikyilmaz et al. 2018) to the loss function. Semantic cohesion loss is the cosine similarity between two consecutive sentences.
-
DivCNN uses DPP methods (Micro DPPs and Macro DPPs) to produce the attention distribution (Li et al. 2019a). DPPs consider both quality and diversity, which helps the model attend to different subsequences in the source document.
-
TRI uses the trigram decoder (Paulus et al. 2018) at test time. The generation of repetitive trigrams is banned during beam search.
3.4. Results
Segmentation. As shown in Table 3, we can get segments from the documents and summaries in three ways: sentence-level segment, N-gram segment, and punctuation-based segment.
Table 7 shows the results of ATTF trained using different segmentation methods. The ATTF trained with punctuation-based segments performs best in terms of all evaluation metrics. The ROUGE and repeatedness scores of the three segmentation methods are similar because they all redistribute the attention distribution and avoid attending to the same segments; the difference comes only from the type of segment. As shown in Table 7, the ATTF trained on sentence-level segments achieves a higher readability score than the ATTF trained on N-gram segments, while the ROUGE scores of sentence-level segmentation are lower than those of N-gram segmentation. The former is because N-gram segments may cause grammatical and semantic problems, and the latter is because the model with sentence-level segments may lose important information, as shown in Table 3.
Accuracy. As shown in Table 8, our model (ATTF+SBD) outperforms all the baselines in ROUGE scores, indicating we are able to generate more accurate summaries.
Among models without any special operations at test time, our ATTF model achieves the highest ROUGE scores, showing its effectiveness in improving summary quality. Models with SBD or TRI at test time are more effective than the basic CNN seq2seq model because more information is involved in summary generation as a byproduct of repetition reduction. Compared with its two variants, SBD is a little slower but has higher ROUGE scores, reflecting its advantage of making better choices globally. Therefore, we use SBD as our backtracking decoder in the following experiments. The candidate hypotheses explored up to a point of repetition are less than 30 tokens long. The ROUGE score of SBD is higher than that of TRI on R-1 and R-L, but lower on R-2. The reason is that R-2 and R-L, respectively, evaluate bigram overlap and the longest common subsequence between the reference summary and the generated summary. This is in line with the different techniques in SBD and TRI, the former promoting the diversity of sentences and the latter that of trigrams. SBD has higher ROUGE scores than ATTF because the summaries from SBD do not have the repetition caused by attending to similar sentences in the source. Unlike ATTF, however, SBD cannot learn to attend to different POIs through training: in Table 10, the sentences in SBD are not repetitive but are summarized from the same POI. The summaries may therefore lose important information when only SBD is used, and the readability score of SBD is lower than that of ATTF in Table 9.
For models that tackle repetition both at training and test time, ATTF+SBD outperforms ATTF+TRI. SBD works in synergy with ATTF, and they together process information with section/segment as a unit. ATTF+SBD scores higher ROUGE than the other baselines, demonstrating its power to reduce repetition and generate more accurate summaries. Besides, as shown in Table 6, the quality of a summary cannot be evaluated by ROUGE scores alone. ATTF+SBD obviously produces a better, logically more consistent summary despite a lower ROUGE score. Due to the variable nature of abstractive summarization, ROUGE is not the optimal evaluation metric. Repeatedness and readability score, in our opinion, are important complementary metrics to ROUGE scores.
Repeatedness. To demonstrate the effectiveness of ATTF and SBD in reducing repetition, we compare the repeatedness (Table 9) of generated summaries. Lower repeatedness means the model is more capable of reducing repetition. In Table 9, the Gold row shows the repeatedness scores of reference summaries. ATTF achieves the lowest score among all baselines without any special operations at test time. As shown in Tables 1 and 2 and Figure 5, baseline models suffer from a severe repetition problem because they attend to the same POIs of the source document. DivCNN adjusts the attention distribution in an indirect manner, adding the attention of the subsets (with high quality-diversity scores) selected from the source document into the loss. Thus, DivCNN may still attend to similar but different sentences, resulting in lower ROUGE scores and higher repeatedness. Besides, DivCNN is trained to attend to diverse subsets, which means that the selected subsets are more scattered (as shown in Figure 5) and lead to semantic incoherence. However, ATTF attends to different POIs and generates summaries such as this:
ATTF: manchester city are rivalling manchester united and arsenal for defender dayot pamecano. the 16-year-old joined in the january transfer window only for him to opt to stay in france.
Compared with the Gold standard, ATTF still generates some repetitive sentences due to similar sentences in source such as Example 1. The summary generated by ATTF and its local attention are shown in Table 10 and Figure 6. Also, SBD further reduces the repetition when combined with ATTF.
As shown in Table 9, TRI has the lowest total repeatedness score. It does not generate any repetitive N-grams (N $>$ 2) and sentences because TRI prevents the generation of the same trigrams during testing. But as the Gold row shows, reference summaries do have some natural repetition. Therefore, we evaluate the correlation of repeatedness distribution between generated summaries and reference summaries (Table 11(a)). Our proposed models perform best, which indicates that ATTF and SBD are more capable of producing summaries with a natural level of repeatedness. In addition, as shown in Table 11(b), the correlations between the repeatedness and the human readability judgment are about 0.7, which means that the repeatedness score is useful. The repetition in summaries will affect coherence between sentences and the readability of summaries.
Readability. As shown in Table 9, the models with ATTF achieve the highest readability scores among all baselines, which means the summaries from ATTF are more readable. As shown in Table 9(b), TRI achieves the best scores on repeatedness, but lower readability scores than other models. Readability is a human evaluation metric that considers logical correctness (see Section 3.2). As shown in Tables 9 and 13, the Readable scores of the models with TRI are lower than those of the models with SBD, which indicates the effectiveness of SBD on logical correctness. In particular, after using TRI, the readability of ATTF+TRI becomes lower than that of ATTF. This means that TRI introduces logical incorrectness into the generated summaries: TRI interrupts the process of sentence generation during beam search on the basis of trigrams, which cannot reflect complete grammatical structure and semantic information, and it is therefore likely to generate summaries with more grammatical and factual errors. SBD forbids repetition at the sentence level during testing, which takes complete grammatical structure and semantic information into account. As shown in Tables 2 and 5, SBD weakens the influence of the meddling with sentence generation during beam search and generates more readable summaries. The higher ROUGE scores show that SBD enhances the performance of CNN and ATTF by reducing repetitive, unreadable sentences.
Speed. We compare the speed of our model on CNN/Daily Mail with RNN (See et al. 2017), FastRNN (Chen and Bansal 2018), and Transformer-large (Vaswani et al. 2017), which were run on a K40 GPU. We perform experiments on a GTX-1080ti and scale the speeds reported for the RNN methods, since the GTX-1080ti is twice as fast as the K40 (Gehring et al. 2017).
As shown in Table 12, CNN is faster than the Transformer, which is based on the multi-head self-attention mechanism. In terms of computational complexity, Vaswani et al. (2017) show that the per-layer complexity of self-attention is $O(n^2d)$ and the per-layer complexity of CNN is $O(knd^2)$ , where n is the sequence length, d is the representation dimension, and k is the kernel width of the convolutions. So the difference between the complexity of the Transformer and CNN depends on n, d, and k. In our experiments, we follow Gehring et al. (2017) in the experimental settings for CNN and Vaswani et al. (2017) for the Transformer; both are the standard settings of the vanilla models. As the average length of source documents in our datasets is more than 500 tokens, n is greater than 500 for both CNN and the Transformer. The representation dimension of CNN, $d_{cnn}$ , is 256; the representation dimension of the Transformer, $d_{trans}$ , is 1024; and the kernel width of CNN is 3. Thus, in our experiments, CNN is faster. The training and testing speeds of CNN+ATTF are faster than those of the RNN seq2seq model and the Transformer since the training of CNN is parallel. The gap in training/testing time between ATTF and the Transformer is not very large, but the memory usage of the Transformer is much larger than that of ATTF, because the Transformer has more trainable parameters. The training and testing speeds of FastRNN are faster than those of RNN because FastRNN is not an end-to-end model. FastRNN is a two-stage framework, which first uses an extractor to extract several salient sentences from the source document and then uses an abstractor to summarize each salient sentence; the final summary is the concatenation of these summarized salient sentences. The extractor in FastRNN is a pointer network with a sequence of sentences as input, and its encoder is the same as that of RNN. FastRNN adopts an RNN as the abstractor and trains the extractor and abstractor in parallel, which speeds up the encoder and decoder of the RNN seq2seq model by shortening the input and output. As an end-to-end model, the input and output of our CNN+ATTF model are the sequences of words in the complete source document and summary, which are much longer than the input and output of the extractor and abstractor of FastRNN. The training speed of CNN+ATTF is similar to that of FastRNN as CNN can be trained in parallel. The testing speed of CNN+ATTF is faster than that of FastRNN because FastRNN must extract sentences first and then abstract each sentence at test time.
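Plugging the experimental values into these per-layer complexities makes the gap concrete (a rough, order-of-magnitude comparison under the settings above):
\begin{equation}n^{2}d_{trans} \approx 500^{2} \times 1024 \approx 2.6\times 10^{8}, \qquad knd_{cnn}^{2} \approx 3 \times 500 \times 256^{2} \approx 9.8\times 10^{7}\end{equation}
so a single self-attention layer performs roughly 2.6 times the work of a convolutional layer at this sequence length, and the gap grows quadratically with n.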
The convergence rate of models with ATTF depends on three aspects: the dataset, the basic model, and the experimental settings. Regarding the dataset, ATTF redistributes the attention distribution between source documents and summaries during decoding, dynamically locating the attended segments in the source for each predicted segment of the summary; thus, the convergence rate of models with ATTF depends on the length of the source documents and summaries. ATTF applied on better basic models converges faster, because better basic models learn the alignment between source documents and summaries better. The experimental settings also affect the convergence rate of models with ATTF: at the beginning of training, a large learning rate makes the model converge faster, and once most of the samples have been trained, the model converges rapidly as the learning rate is decreased.
As shown in Tables 8 and 12, SBD is the best of the three backtracking variants. Compared with SBD-b1 and SBD-b2, SBD achieves higher ROUGE scores without losing much speed. ATTF+SBD achieves the best ROUGE scores, and its training and testing times do not increase too much.
Generalization. Table 13 shows the generalization of our abstractive system to the other two datasets, Newsroom and DUC 2002, where our proposed models achieve better scores than the vanilla CNN seq2seq model in terms of ROUGE, readability, and repeatedness. We use the same settings of $\beta=3$ (Section 2.2.2) and $n=5$ (Equation (11)), because the proportions of sections with length greater than 3 and of reference summaries with an LCS greater than 5 are again about 90%. As shown in Table 13, our proposed models generalize well to other news datasets, along with repetition reduction and improved readability.
Normalization. The attention scores of the basic attention mechanism without filtering form a probability distribution. For the filtered attention scores of ATTF, we penalize the attention scores of the tokens that have been attended to in the source document and keep the attention scores of the other tokens the same. In this way, we preserve the differences in attention scores among the tokens that have not been attended to, which helps avoid ignoring source content that still needs to be summarized. After re-normalizing the filtered attention scores, the decoder tends to attend to the tokens of the source document with high filtered attention scores; re-normalization can also prevent the attention scores of tokens that have not been attended to from becoming too small.
As shown in Table 14, the R-2 F1 scores of ATTF and ATTF with re-normalized attention scores (Norm ATTF) are similar over all datasets. ROUGE recall measures how well a generated summary matches its corresponding reference summary by counting the percentage of matched n-grams in the reference. ROUGE precision indicates the percentage of n-grams in the generated summary overlapping with the reference. ATTF is always better than Norm ATTF on R-2 recall, and Norm ATTF is better than ATTF on R-2 precision. As shown in Table 14(b), since the difference of attention scores is not magnified by normalization, the summary generated by ATTF can more comprehensively attend to the information in the source. However, when the attention scores of the tokens that have not been attended to are too small, the decoder may attend to less important information of the source, such as “traffic and highway patrol……” (ATTF row) in Table 14(b). The summary generated by Norm ATTF is likely to miss some important information, such as “it is a regular occurrence.”, due to the magnified differences between filtered attention scores.
Comparison of ATTF and TRI. In our experiments, TRI is the basic CNN seq2seq model with the trigram decoder at test time. For ROUGE scores, as shown in Tables 8 and 13, ATTF gets lower ROUGE scores than TRI on CNN/Daily Mail and higher ROUGE scores on Newsroom and DUC. For repeatedness scores, as shown in Tables 9 and 13, the difference between ATTF and TRI on Newsroom is smaller than that on CNN/Daily Mail. As some of the source documents in CNN/Daily Mail contain similar but different segments (as shown in Figure 6), ATTF may attend to such segments and generate summaries with repetition, and summaries with repetition tend to achieve lower ROUGE scores. The better performance of ATTF on Newsroom and DUC indicates that ATTF is more effective than TRI on datasets whose source documents contain little repetition. As shown in Table 11, the repeatedness correlation scores of ATTF are higher than those of TRI, which indicates that the summaries generated by ATTF are more similar to human-written summaries. Besides, the readability scores of ATTF are better than those of TRI on all datasets, which means that the summaries generated by the attention-based modification are more fluent and readable than those produced by simple trigram blocking.
Significance Test. We use a significance test to verify that the ROUGE results in Table 8 are reliable. We use the t-test (Loukina et al. 2014) as our significance test to measure whether the differences in ROUGE scores between our proposed approach (ATTF+SBD) and each baseline are significant. As shown in Table 15, all p-values are less than 0.05; the smaller the p-value, the higher the significance. Thus, the differences in the results are significant.
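A sketch of this test on per-document ROUGE scores is shown below (assuming paired per-document score lists for the two systems; a paired t-test from scipy is one straightforward choice, and the example scores are hypothetical):

from scipy.stats import ttest_rel

def significance(ours, baseline, alpha=0.05):
    """Paired t-test over per-document ROUGE scores of two systems."""
    t_stat, p_value = ttest_rel(ours, baseline)
    return t_stat, p_value, p_value < alpha

# Hypothetical per-document ROUGE-2 scores, for illustration only.
ours = [0.21, 0.18, 0.25, 0.19, 0.23]
base = [0.18, 0.17, 0.22, 0.18, 0.20]
print(significance(ours, base))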
Overall, the summaries generated by sequence-to-sequence models with attention mechanisms often contain repetition. Through our observations, there are two reasons for repetition in abstractive summarization. One is that traditional attention mechanisms attend to the same location in the source document during decoding. The other is that the attention mechanism attends to repetitive sentences at different locations in the source document. As shown in Figure 5 and Table 10, our proposed ATTF and SBD effectively mitigate these two problems. As ATTF corrects the attention distribution between the inputs of the encoder and decoder to reduce repetition in generated summaries, any seq2seq model with an attention mechanism between encoder and decoder can be improved via ATTF. SBD is used only at test time, which makes it suitable for any model with a decoder. Since RNN-based and transformer-based seq2seq models with attention between encoder and decoder also suffer from repetition in generated summaries, we can reasonably deduce that these models would benefit from our proposed ATTF and SBD as well. The higher ROUGE scores (Table 8) of our model mean that the summaries generated by our model are more similar to their corresponding reference summaries, and the natural level of repeatedness and higher readability scores (Table 9) indicate that our model produces summaries of higher quality. ATTF is applied to the attention mechanism between encoder and decoder, which affects decoding time at training and test time; SBD only affects decoding time at test time. The absolute overhead added by ATTF+SBD is therefore roughly the same across base models. For RNNs and transformers, adding ATTF+SBD would cause less than a six-fold slowdown (as shown in Table 12, the vanilla CNN slows down roughly six-fold after adding ATTF+SBD), since RNNs and transformers spend more training and testing time on encoding than CNN. As a result, our model can improve users’ reading speed and the accuracy of their reading comprehension.
4. Related work
In this section, we discuss neural-based abstractive summarization and some previous work on repetition reduction methods in abstractive summarization.
4.1. Neural-based abstractive summarization
Automatic summarization condenses long documents into short summaries while preserving the important information of the documents (Radev et al. 2002; Allahyari et al. 2017; Shi et al. 2021). There are two general approaches to automatic summarization: extractive summarization and abstractive summarization. Extractive summarization selects sentences from the source articles, which can produce grammatically correct sentences (Bokaei et al. 2016; Verma and Lee 2017; Naserasadi et al. 2019; Zhong et al. 2019). Abstractive summarization is a process of generating a concise and meaningful summary from the input text, possibly with words or sentences not found in the input text. A good summary should be coherent, non-redundant, and readable (Yao et al. 2017). Abstractive summarization is one of the most challenging and interesting problems in the field of NLP (Carenini and Cheung 2008; Pallotta et al. 2009; Sankarasubramaniam et al. 2014; Bing et al. 2015; Rush et al. 2015; Li et al. 2016; Yao et al. 2017; Nguyen et al. 2019).
Recently, neural-based (encoder–decoder) models (Rush et al. 2015; Chopra et al. 2016; Nallapati et al. 2016; See et al. 2017; Paulus et al. 2018; Liu and Lapata 2019; Wang et al. 2019; Lewis et al. 2020; Liu et al. 2021) have made some progress on abstractive summarization. Most of them use RNNs with different attention mechanisms (Rush et al. 2015; Nallapati et al. 2016; See et al. 2017; Paulus et al. 2018). Rush et al. (2015) were the first to apply the neural encoder–decoder architecture to text summarization. See et al. (2017) enhance this model with a pointer-generator network which allows it to copy relevant words from the source text. RNN models are difficult to train because of the vanishing and exploding gradient problems. Another challenge is that the current hidden state in an RNN is a function of previous hidden states, so an RNN cannot be easily parallelized along the time dimension during training and evaluation; hence, training for long sequences becomes very expensive in computation time and memory footprint.
To alleviate the above challenges, convolutional neural network (CNN) models (Gehring et al. 2017; Fan et al. 2018; Liu et al. 2018; Zhang et al. 2019b) have been applied to seq2seq models. Gehring et al. (2017) propose a CNN seq2seq model equipped with gated linear units (Dauphin et al. 2017), residual connections (He et al. 2016), and an attention mechanism. Liu et al. (2018) modify the basic CNN seq2seq model with a summary length input and train a model that produces fluent summaries of a desired length. Fan et al. (2018) present a controllable CNN seq2seq model that allows users to define high-level attributes of generated summaries, such as source style and length. Zhang et al. (2019b) add a hierarchical attention mechanism to the CNN seq2seq model. CNN-based models can be parallelized during training and evaluation, and their computational complexity is linear with respect to the length of the sequences. A CNN model has shorter paths between pairs of input and output tokens, so it can propagate gradient signals more efficiently, enabling much faster training and more stable gradients than an RNN. Bai et al. (2018) showed that CNN is more powerful than RNN for sequence modeling. Therefore, in this work, we choose the vanilla CNN seq2seq model as our base model.
4.2. Repetition reduction for abstractive summarization
Repetition is a persistent problem in neural-based summarization. It has been tackled broadly in two directions in recent years.
One direction involves information selection or sentence selection before generating summaries. Chen and Bansal (2018) propose an extractor–abstractor model, which uses an extractor to select salient sentences or highlights and then employs an abstractor network to rewrite these sentences. Sharma et al. (2019) and Bae et al. (2019) also use extractor–abstractor models with different data preprocessing methods. None of them solves repetition within the seq2seq model itself. Tan et al. (2017) and Li et al. (2018a, 2018b) encode sentences using word vectors and predict words from sentence vectors in sequential order, whereas CNN-based models are naturally parallelized. When transferring such RNN-based models to a CNN model, the kernel size and the number of convolutional layers cannot be easily determined for converting between sentences and word vectors. Therefore, we do not compare our models to those models in this paper.
The other direction is to improve the memory of previously generated words. Suzuki and Nagata (Reference Suzuki and Nagata2017) and Lin et al. (Reference Lin, Sun, Ma and Su2018) deal with word repetition in single-sentence summaries, whereas we primarily deal with sentence-level repetition in multi-sentence summaries, where word repetition is almost absent. Jiang and Bansal (Reference Jiang and Bansal2018) add a new decoder without an attention mechanism, but in CNN-based models the attention mechanism is necessary to connect the encoder and decoder. Therefore, we also do not compare our model with these models in this paper. The following models can be transferred to the CNN seq2seq model and are used as our baselines. See et al. (2017) integrate a coverage mechanism, which keeps track of what has been summarized and helps redistribute the attention scores in an indirect manner to discourage repetition. Tan et al. (Reference Tan, Wan and Xiao2017) use distraction attention (Chen et al. Reference Chen, Zhu, Ling, Wei and Jiang2016), which is identical to the coverage mechanism. Gehrmann et al. (Reference Gehrmann, Deng and Rush2018) add a coverage penalty to the loss function, which increases whenever the decoder directs more than 1.0 of total attention toward a word in the encoder; this penalty indirectly revises the attention distribution and reduces repetition. Çelikyilmaz et al. (Reference Çelikyilmaz, Bosselut, He and Choi2018) use SCL, the cosine similarity between two consecutive sentences, as part of the loss to reduce repetition. Li et al. (Reference Li, Liu, Litvak, Vanetik and Huang2019a) incorporate determinantal point process (DPP) methods into attention adjustment and take the attention distribution over subsets selected from the source document by DPPs as part of the loss. Paulus et al. (Reference Paulus, Xiong and Socher2018) use intra-temporal attention (Nallapati et al. Reference Nallapati, Zhou, dos Santos, Gülçehre and Xiang2016) and intra-decoder attention, which dynamically revise the attention distribution while decoding, and also avoid repetition at test time by directly banning the generation of repeated trigrams in beam search. Fan et al. (Reference Fan, Grangier and Auli2018) borrow this idea and build a CNN-based model.
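As an illustration of the trigram-banning heuristic applied at test time by Paulus et al. (Reference Paulus, Xiong and Socher2018), the following Python sketch checks whether extending a partial summary with a candidate token would repeat an already-generated trigram. The function name, the tokenization, and how a blocked hypothesis is handled in beam search are assumptions of this sketch.

    def repeats_trigram(prefix, candidate):
        """Return True if appending `candidate` to the token list `prefix`
        would reproduce a trigram that already occurs in `prefix`."""
        if len(prefix) < 2:
            return False
        new_trigram = tuple(prefix[-2:] + [candidate])
        seen = {tuple(prefix[i:i + 3]) for i in range(len(prefix) - 2)}
        return new_trigram in seen

    # During beam search, hypotheses that would repeat a trigram are typically
    # discarded or assigned (near-)zero probability.
    prefix = "the cat sat on the mat . the cat sat".split()
    print(repeats_trigram(prefix, "on"))   # True: "cat sat on" already occurred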
Our model deals with the attention in both the encoder and the decoder. Different from previous methods, our attention filter mechanism does not treat the attention history as a single, undivided data structure but divides it into sections (Figure 3). In previous methods, the distribution of accumulated attention scores over the source tokens tends to be flat, which means critical information is washed out during decoding. Our method emphasizes previously attended sections so that important information is retained.
Given our observation that repetitive sentences in the source are another cause of repetition in the summary, and that this cannot be directly resolved by manipulating attention values, we introduce the SBD. Unlike Paulus et al. (Reference Paulus, Xiong and Socher2018), we do not ban repeated trigrams at test time. Instead, our decoder backtracks and regenerates a sentence whenever it is similar to previously generated ones. With these two modules, our model is capable of generating summaries with a natural level of repetition while retaining fluency and consistency.
4.3. Pretrained models for summarization
Pretrained transformer language models have achieved success on summarization tasks.
Some pretrained summarization models apply pretrained contextual encoders, such as BERT (Devlin et al. Reference Devlin, Chang, Lee and Toutanova2019). BERT is a transformer-based masked language model: some tokens of an input sequence are randomly masked, and the model is trained to predict these masked tokens given the corrupted sequence as input. Liu and Lapata (Reference Liu and Lapata2019) introduce a document-level encoder based on BERT, which is able to express the semantics of a document and obtain representations for its sentences. Zhong et al. (Reference Zhong, Liu, Chen, Wang, Qiu and Huang2020) leverage BERT in a Siamese (Bromley et al. Reference Bromley, Bentz, Bottou, Guyon, LeCun, Moore, Säckinger and Shah1993) network structure to construct a new encoder for representing the source document and the reference summary. Zhang et al. (Reference Zhang, Wei and Zhou2019a) propose a novel HIBERT encoder for document encoding and apply it to a summarization model.
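For readers unfamiliar with the masked language modeling objective, the following PyTorch sketch shows the corruption step in simplified form. The masking probability, the label convention, and the mask id are assumptions, and BERT's additional scheme of sometimes keeping or randomly replacing selected tokens is omitted.

    import torch

    def mask_tokens(token_ids, mask_id, mask_prob=0.15):
        """Randomly replace a fraction of tokens with a mask id and return the
        corrupted input plus labels (-100 marks positions that are not scored)."""
        mask = torch.rand(token_ids.shape) < mask_prob
        labels = token_ids.clone()
        labels[~mask] = -100            # only masked positions contribute to the loss
        corrupted = token_ids.clone()
        corrupted[mask] = mask_id       # replace selected tokens with [MASK]
        return corrupted, labels

    # Toy usage with a hypothetical vocabulary where id 103 denotes [MASK].
    ids = torch.randint(1000, 2000, (2, 16))   # 2 sequences of length 16
    corrupted, labels = mask_tokens(ids, mask_id=103)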
Others are pretrained as sequence-to-sequence (seq2seq) models. UniLM (Dong et al. Reference Dong, Yang, Wang, Wei, Liu, Wang, Gao, Zhou and Hon2019) is a multi-layer transformer network that utilizes specific self-attention masks for three language modeling objectives (unidirectional, bidirectional, and seq2seq) to control what context the prediction conditions on; its seq2seq language model attends to bidirectional context for the source document and left-only context for the summary. BART (Lewis et al. Reference Lewis, Liu, Goyal, Ghazvininejad, Mohamed, Levy, Stoyanov and Zettlemoyer2020) corrupts the input with an arbitrary noising function instead of the masked language model, and the corrupted input is then reconstructed by a transformer seq2seq model. ProphetNet (Qi et al. Reference Qi, Yan, Gong, Liu, Duan, Chen, Zhang and Zhou2020) trains a transformer seq2seq model with future n-gram prediction as its self-supervised objective. PEGASUS (Zhang et al. Reference Zhang, Zhao, Saleh and Liu2020) uses the self-supervised Gap Sentences Generation objective to train a transformer seq2seq model; compared with previous pretrained models, PEGASUS masks whole sentences rather than smaller continuous text spans. By fine-tuning these pretrained models or representations on the summarization task, the quality of generated summaries can be improved.
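To make the gap-sentence idea concrete, here is a deliberately simplified Python sketch of sentence-level corruption. The mask token, the gap ratio, the naive period-based sentence splitting, and the random selection are all assumptions of this illustration; PEGASUS itself selects important ("principal") sentences rather than random ones.

    import random

    def gap_sentence_noising(document, mask_token="<MASK_SENT>", gap_ratio=0.3):
        """Replace a fraction of sentences with a mask token; the removed
        sentences (concatenated) become the generation target."""
        sentences = [s.strip() for s in document.split(".") if s.strip()]
        n_gaps = max(1, int(len(sentences) * gap_ratio))
        gap_idx = set(random.sample(range(len(sentences)), n_gaps))
        source = " . ".join(mask_token if i in gap_idx else s
                            for i, s in enumerate(sentences))
        target = " . ".join(sentences[i] for i in sorted(gap_idx))
        return source, target

    src, tgt = gap_sentence_noising(
        "the storm hit the coast . officials issued a warning . residents evacuated ."
    )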
The excellent performance of pretrained summarization models comes from large-scale training data and heavy network structures, which incur a large cost in training time and memory. The goal of our approach, however, is to reduce repetition in abstractive summarization, and comparing summaries generated by vanilla models augmented with different repetition-reduction methods shows the effectiveness of those methods most clearly. Thus, we take the vanilla model as our base model and do not compare our proposed approach with the pretrained models.
5. Conclusion
Abstractive summarization is an important NLP task whose goal is to generate a short summary that expresses the main ideas of the source document. CNNs have achieved great success in abstractive summarization: compared with RNNs, they are more effective and can be trained much faster due to their intrinsic parallel nature and more stable gradients. However, we find that repetition is a persistent problem in CNN seq2seq abstractive summarization.
In this paper, we focus on the repetition problem in abstractive summarization based on the CNN seq2seq model with attention. We analyze two possible causes of repetition: (1) attending to the same location in the source and (2) attending to similar but different sentences in the source. In response, we present two methods that modify the existing CNN seq2seq model: a section-aware attention mechanism (ATTF) and an SBD. The ATTF records previously attended locations in the source document directly and prevents the decoder from attending to these locations again. The SBD prevents the decoder from generating similar sentences more than once via backtracking at test time. The proposed methods produce summaries with a natural level of repetition that are fluent and coherent, which means the summaries generated by our model are more accurate and readable. This helps users quickly extract the main information from large amounts of text, saving reading time and improving reading efficiency. Since other NLG tasks based on seq2seq models with attention are orthogonal to our proposed methods, they can also be enhanced with our proposed models. To assess the effectiveness of our approaches in repetition reduction, we present two evaluation metrics: repeatedness and repeatedness correlation. Repeatedness measures the repetition rate of n-grams and sentences in summaries; repeatedness correlation tests how well the repetition of generated summaries correlates with natural-level repetition. We also argue that ROUGE is not a perfect evaluation metric for abstractive summarization, as standard ROUGE scores cannot capture grammatical or factual errors. Thus, we propose a readability score to complement ROUGE: a human evaluation that measures the fluency and readability of a summary. Our approach outperforms the baselines on all evaluation metrics, including ROUGE, repeatedness, repeatedness correlation, and readability.
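Although the formal definition of repeatedness is not reproduced in this section, the following Python sketch illustrates one simple way an n-gram repetition rate can be computed. The definition below (the fraction of n-grams that occur more than once) is an assumption for illustration only and is not necessarily the exact metric used in our experiments.

    from collections import Counter

    def ngram_repetition_rate(tokens, n=3):
        """Fraction of n-grams in a summary that occur more than once
        (an illustrative measure, not the paper's official definition)."""
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        if not ngrams:
            return 0.0
        counts = Counter(ngrams)
        repeated = sum(c for c in counts.values() if c > 1)
        return repeated / len(ngrams)

    summary = "police arrest suspect . police arrest suspect in the city .".split()
    print(round(ngram_repetition_rate(summary, n=3), 3))   # 0.222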
Competing Interests
Yizhu Liu and Xinyue Chen are students at Shanghai Jiao Tong University. Xusheng Luo is employed by Alibaba Group. Kenny Q. Zhu is employed by Shanghai Jiao Tong University.