Improving Salience-Based Multi-Document Summarization Performance using a Hybrid Sentence Similarity Measure
Authors: Kamal Sarkar, S. Chowdhury
Journal: AI, Machine Learning and Applications
Volume: 11 8
Publication type: Journal Article
Publication date: 2024-01-27
DOI: 10.5121/csit.2024.140202 (https://doi.org/10.5121/csit.2024.140202)
Citation count: 0
Abstract
Multi-document summarization is the process of creating a single summary from a group of related text documents obtained from many sources. Because these documents may contain redundant information, the efficacy of a multi-document summarization system relies heavily on the sentence similarity measure used to eliminate redundant sentences from the summary. The sentence similarity measure is also crucial for graph-based multi-document summarization, where the presence of an edge between two sentences is decided by how similar they are to one another. To enhance multi-document summarization performance, this study proposes a new method for defining a hybrid sentence similarity measure that combines a lexical similarity measure with a BERT-based semantic similarity measure. Tests conducted on the benchmark datasets demonstrate that the proposed hybrid sentence similarity measure is effective in enhancing multi-document summarization performance.
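As a rough illustration of the idea in the abstract, the sketch below combines a lexical score with an embedding-based score and uses the result to drop redundant sentences. The abstract does not say how the two measures are combined, so the weighted sum, the weight `alpha`, the redundancy `threshold`, and all function names here are assumptions made for illustration only. The embedding vectors are stand-ins for sentence vectors that a BERT-style encoder would produce; no actual BERT model is loaded.

```python
import math
from collections import Counter


def lexical_similarity(a: str, b: str) -> float:
    """Cosine similarity over lowercase word-count vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def cosine(u, v) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = (math.sqrt(sum(x * x for x in u))
            * math.sqrt(sum(y * y for y in v)))
    return dot / norm if norm else 0.0


def hybrid_similarity(a, b, emb_a, emb_b, alpha=0.5) -> float:
    """Assumed combination: weighted sum of lexical and semantic scores."""
    return alpha * lexical_similarity(a, b) + (1 - alpha) * cosine(emb_a, emb_b)


def select_non_redundant(ranked, embeddings, threshold=0.7, alpha=0.5):
    """Greedy redundancy filter.

    `ranked` holds sentences in decreasing order of salience, and
    `embeddings[i]` is the (stand-in) semantic vector for `ranked[i]`.
    A sentence is kept only if its hybrid similarity to every sentence
    kept so far is below `threshold`.
    """
    kept = []
    for i, sentence in enumerate(ranked):
        if all(
            hybrid_similarity(sentence, ranked[j],
                              embeddings[i], embeddings[j], alpha) < threshold
            for j in kept
        ):
            kept.append(i)
    return [ranked[i] for i in kept]


if __name__ == "__main__":
    sentences = [
        "the storm hit the coast",
        "the storm hit the coast today",
        "markets rose sharply",
    ]
    # Toy 2-d vectors standing in for BERT sentence embeddings.
    vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    print(select_non_redundant(sentences, vectors))
```

The same `hybrid_similarity` score could also decide whether an edge is added between two sentence nodes in a graph-based summarizer, for example by adding an edge whenever the score exceeds a chosen cutoff.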