SentMask: A Sentence-Aware Mask Attention-Guided Two-Stage Text Summarization Component

IF 5; Zone 2 (Computer Science); JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Intelligent Systems Pub Date : 2023-08-22 DOI:10.1155/2023/1267336

Rui Zhang, Nan Zhang, Jianjun Yu


Citations: 0

Abstract

The text summarization task aims to generate succinct sentences that summarize what an article expresses. Combining extractive and abstractive summarization on top of pretrained language models has been widely adopted, and many existing studies have shown extract-then-abstract algorithms to be effective. However, this method loses semantic information during extraction, so the sentences generated in the abstractive phase are incomplete. In addition, current research on text summarization emphasizes word-level comprehension while paying little attention to sentence-level understanding. To tackle these problems, we propose the SentMask component. Because the semantics of sentences filtered out during extraction are also worth considering, we design a sentence-aware mask attention mechanism for the summary generation process. An extractive approach first selects the most essential sentences to construct the initial summary phrases. This information guides the model in modifying its attention weights, supervising the generative model so that it focuses on the sentences that convey important semantics without ignoring the others. The final summary is constructed from the key information provided. Experimental results show that our model achieves higher ROUGE and BLEU scores than the baseline models on two benchmark datasets.
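
The abstract does not state the formula behind the sentence-aware mask attention, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes an additive bias on the attention logits of tokens that belong to extracted sentences, followed by a softmax, so that selected sentences get more weight while the filtered-out ones keep a nonzero share. The function names, the `boost` parameter, and the toy input are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sentence_aware_attention(scores, token_sentence_ids, selected_sentences, boost=1.0):
    """Bias raw attention logits toward tokens of extracted sentences.

    scores: raw attention logits, one per source token.
    token_sentence_ids: index of the sentence each token belongs to.
    selected_sentences: sentence indices chosen by the extractive stage.
    boost: hypothetical additive bias (an assumption, not from the paper).
    """
    if len(scores) != len(token_sentence_ids):
        raise ValueError("scores and token_sentence_ids must have equal length")
    # Soft mask: raise logits of selected sentences, leave the rest untouched
    # so that unselected sentences still receive some attention.
    biased = [
        s + (boost if sid in selected_sentences else 0.0)
        for s, sid in zip(scores, token_sentence_ids)
    ]
    return softmax(biased)


# Toy input: 6 tokens spread over 3 sentences; the extractive stage picked sentence 1.
scores = [0.0] * 6
token_sentence_ids = [0, 0, 1, 1, 2, 2]
weights = sentence_aware_attention(scores, token_sentence_ids, {1}, boost=1.0)
```

In this toy case the tokens of sentence 1 receive larger weights than the others, yet every token keeps a positive weight, which mirrors the stated goal of focusing on important sentences "while not ignoring others". A hard mask (setting unselected logits to negative infinity) would instead discard those sentences entirely.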


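Source journal

International Journal of Intelligent Systems (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 11.30

Self-citation rate: 14.30%

Articles published: 304

Review time: 9 months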

Journal description: The International Journal of Intelligent Systems serves as a forum for individuals interested in tapping into the vast theories based on intelligent systems construction. With its peer-reviewed format, the journal explores several fascinating editorials written by today's experts in the field. Because new developments are being introduced each day, there's much to be learned: examination, analysis creation, information retrieval, man–computer interactions, and more. The International Journal of Intelligent Systems uses charts and illustrations to demonstrate these ground-breaking issues, and encourages readers to share their thoughts and experiences.