基于蕴涵的文档摘要内容选择评分方法

Proceedings of the 9th International Symposium on Information and Communication Technology Pub Date : 2018-12-06 DOI:10.1145/3287921.3287976

Dang Hoang Long, Minh-Tien Nguyen, Ngo Xuan Bach, Le-Minh Nguyen, Tu Minh Phuong

{"title":"基于蕴涵的文档摘要内容选择评分方法","authors":"Dang Hoang Long, Minh-Tien Nguyen, Ngo Xuan Bach, Le-Minh Nguyen, Tu Minh Phuong","doi":"10.1145/3287921.3287976","DOIUrl":null,"url":null,"abstract":"This paper introduces a scoring method to improve the quality of content selection in an extractive summarization system. Different from previous models mainly using local information inside sentences such as sentence position or sentence length, our method judges the importance of a sentence based on its own information and the relation between sentences. For the relation between sentences, we utilize textual entailment, a relationship indicating that the meaning of a sentence can be inferred from another one. Unlike previous work on using textual entailment for summarization, we go a step further by looking at aligned words in an entailment sentence pair. Assuming that important words in a salient sentence can be aligned by several words in other sentences, word alignment scores are exploited to compute the entailment score of a sentence. To take advantage of local and neighbor information for facilitating the salient estimation of sentences, we combine entailment scores with sentence position scores. We validate the proposed scoring method with greedy or integer linear programming approaches for extracting summaries. Experiments on three datasets (including DUC 2001 and 2002) in two different domains show that our model obtains competitive ROUGE-scores with state-of-the-art methods for single-document summarization.","PeriodicalId":448008,"journal":{"name":"Proceedings of the 9th International Symposium on Information and Communication Technology","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"An Entailment-based Scoring Method for Content Selection in Document Summarization\",\"authors\":\"Dang Hoang Long, Minh-Tien Nguyen, Ngo Xuan Bach, Le-Minh Nguyen, Tu Minh Phuong\",\"doi\":\"10.1145/3287921.3287976\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper introduces a scoring method to improve the quality of content selection in an extractive summarization system. Different from previous models mainly using local information inside sentences such as sentence position or sentence length, our method judges the importance of a sentence based on its own information and the relation between sentences. For the relation between sentences, we utilize textual entailment, a relationship indicating that the meaning of a sentence can be inferred from another one. Unlike previous work on using textual entailment for summarization, we go a step further by looking at aligned words in an entailment sentence pair. Assuming that important words in a salient sentence can be aligned by several words in other sentences, word alignment scores are exploited to compute the entailment score of a sentence. To take advantage of local and neighbor information for facilitating the salient estimation of sentences, we combine entailment scores with sentence position scores. We validate the proposed scoring method with greedy or integer linear programming approaches for extracting summaries. Experiments on three datasets (including DUC 2001 and 2002) in two different domains show that our model obtains competitive ROUGE-scores with state-of-the-art methods for single-document summarization.\",\"PeriodicalId\":448008,\"journal\":{\"name\":\"Proceedings of the 9th International Symposium on Information and Communication Technology\",\"volume\":\"35 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-12-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 9th International Symposium on Information and Communication Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3287921.3287976\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 9th International Symposium on Information and Communication Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3287921.3287976","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

本文介绍了一种提高抽取摘要系统内容选择质量的评分方法。与以往的模型主要利用句子内部的局部信息(如句子位置或句子长度)来判断句子的重要性不同，我们的方法是根据句子本身的信息和句子之间的关系来判断句子的重要性。对于句子之间的关系，我们使用文本蕴涵，这是一种表明句子的意义可以从另一个句子中推断出来的关系。与之前使用文本蕴涵进行摘要的工作不同，我们进一步研究了蕴涵句子对中的对齐词。假设一个重要句子中的重要单词可以被其他句子中的几个单词对齐，那么利用单词对齐分数来计算一个句子的蕴涵分数。为了利用局部和邻近信息方便句子的显著性估计，我们将蕴涵分数与句子位置分数相结合。我们用贪婪或整数线性规划方法验证了所提出的评分方法用于提取摘要。在两个不同领域的三个数据集(包括DUC 2001和2002)上进行的实验表明，我们的模型使用最先进的单文档摘要方法获得了具有竞争力的rouge分数。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

An Entailment-based Scoring Method for Content Selection in Document Summarization

This paper introduces a scoring method to improve the quality of content selection in an extractive summarization system. Different from previous models mainly using local information inside sentences such as sentence position or sentence length, our method judges the importance of a sentence based on its own information and the relation between sentences. For the relation between sentences, we utilize textual entailment, a relationship indicating that the meaning of a sentence can be inferred from another one. Unlike previous work on using textual entailment for summarization, we go a step further by looking at aligned words in an entailment sentence pair. Assuming that important words in a salient sentence can be aligned by several words in other sentences, word alignment scores are exploited to compute the entailment score of a sentence. To take advantage of local and neighbor information for facilitating the salient estimation of sentences, we combine entailment scores with sentence position scores. We validate the proposed scoring method with greedy or integer linear programming approaches for extracting summaries. Experiments on three datasets (including DUC 2001 and 2002) in two different domains show that our model obtains competitive ROUGE-scores with state-of-the-art methods for single-document summarization.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 9th International Symposium on Information and Communication Technology

自引率

0.00%

发文量