Text Summarization Based on Sentence Selection with Semantic Representation

Chi Zhang, Lei Zhang, Chong-Jun Wang, Junyuan Xie
Published in: 2014 IEEE 26th International Conference on Tools with Artificial Intelligence
Publication date: 2014-11-10
DOI: 10.1109/ICTAI.2014.93 (https://doi.org/10.1109/ICTAI.2014.93)
Citations: 8

Abstract

Text summarization is of great importance in addressing information overload. Salience and coverage are the two most important issues for summaries. Most existing models extract summaries by selecting the top-scoring sentences without using the relationships between sentences, and usually represent sentences based simply on lexical or statistical features. As a result, those models cannot achieve salience or coverage very well. In this paper, we propose a novel summarization model called Sentence Selection with Semantic Representation (SSSR). SSSR ensures both salience and coverage by learning semantic representations for sentences and applying a well-designed selection strategy to select summary sentences. The selection strategy used in SSSR is to select the sentences that can reconstruct the original document with the least distortion by means of linear combination. In addition, we improve our selection strategy by reducing redundant information. We then learn two semantic representations for sentences: (1) a weighted mean of word embeddings and (2) deep coding. Both are semantic and compact, and can capture similarities between sentences. Extensive experiments on the DUC2006 and DUC2007 datasets validate our model.
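
The two ideas named in the abstract, representing a sentence as a weighted mean of word embeddings and selecting sentences whose linear combinations reconstruct the document with least distortion, can be illustrated with a minimal sketch. This is not the authors' SSSR algorithm: the abstract does not give the objective, the word weights, or the optimization procedure, so the greedy projection-based selection below and the names `weighted_mean` and `select_sentences` are assumptions made for illustration only. Deep coding is not covered.

```python
import math

def weighted_mean(word_vectors, weights):
    """Sentence vector as the weighted mean of its word vectors.
    The weighting scheme is an assumption; the abstract does not specify it."""
    total = sum(weights)
    dim = len(word_vectors[0])
    return [sum(w * v[i] for v, w in zip(word_vectors, weights)) / total
            for i in range(dim)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def select_sentences(sentence_vectors, k):
    """Greedily pick up to k sentences (hypothetical strategy).

    Each step adds the sentence that most reduces the total squared error
    of reconstructing every sentence vector as a linear combination of the
    selected ones. The best linear-combination reconstruction is the
    orthogonal projection onto the span of the selected vectors, so it is
    enough to track the residual of each vector outside that span.
    """
    chosen = []
    residuals = [list(v) for v in sentence_vectors]
    for _ in range(k):
        best, best_err, best_dir = None, None, None
        for i, r in enumerate(residuals):
            if i in chosen:
                continue
            norm = math.sqrt(dot(r, r))
            if norm < 1e-12:
                continue  # already reconstructed exactly: redundant, skip it
            direction = [x / norm for x in r]
            # Total distortion left if this sentence were added.
            err = sum(dot(q, q) - dot(q, direction) ** 2 for q in residuals)
            if best_err is None or err < best_err:
                best, best_err, best_dir = i, err, direction
        if best is None:
            break  # every sentence is already reconstructed exactly
        chosen.append(best)
        # Remove the newly covered direction from every residual.
        residuals = [[x - dot(q, best_dir) * y for x, y in zip(q, best_dir)]
                     for q in residuals]
    return chosen

# Toy usage: sentences 0 and 1 point in the same direction, so only one of
# them is selected before the sketch moves on to a different direction.
vectors = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.1]]
summary_indices = select_sentences(vectors, 2)
```

Skipping sentences whose residual is already zero is one simple way to reflect the redundancy reduction mentioned in the abstract; the paper's actual mechanism may differ.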