ROUGE-SS: A New ROUGE Variant for the Evaluation of Text Summarization
Authors: Sandeep Kumar, Arun Solanki, NZ Jhanjhi
Journal: Recent Advances in Computer Science and Communications (Journal Article)
Publication date: 2024-06-06
DOI: 10.2174/0126662558304595240528111535
URL: https://doi.org/10.2174/0126662558304595240528111535
Citation count: 0
Abstract
Prior research on abstractive text summarization has relied predominantly
on the ROUGE evaluation metric. ROUGE is effective, but because it focuses on exact word or
phrase matching, it is limited in capturing semantic meaning. This deficiency is particularly
pronounced in abstractive summarization, where the goal is to generate novel summaries by
paraphrasing the source text, and it highlights the need for a more nuanced evaluation metric
capable of capturing semantic similarity.
This study addresses the limitations of existing ROUGE metrics by proposing
a novel variant called ROUGE-SS. Unlike traditional ROUGE metrics, ROUGE-SS extends
beyond exact word matching to consider synonyms and semantic similarity. Leveraging resources
such as the WordNet online dictionary, ROUGE-SS identifies matches between source
text and summaries based on both exact word overlaps and semantic context. Experiments are
conducted to evaluate the performance of ROUGE-SS compared to other ROUGE variants,
particularly in assessing abstractive summarization models. An algorithm for the
synonym-matching feature of ROUGE-SS is also proposed.
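The idea of counting synonyms as matches can be illustrated with a minimal sketch. This is not the authors' ROUGE-SS algorithm, which the abstract does not specify. It is a unigram overlap score in which words from the same synonym group are treated as equal. The names `SYNONYM_GROUPS`, `canonical`, and `synonym_rouge_1` are hypothetical, and the small hand-written synonym table is an assumption standing in for a WordNet lookup.

```python
from collections import Counter

# ASSUMPTION: a tiny hand-written synonym table standing in for WordNet.
# A real setup might build these groups from WordNet synsets instead.
SYNONYM_GROUPS = [
    {"big", "large", "huge"},
    {"car", "automobile"},
    {"quick", "fast", "rapid"},
]


def canonical(word: str, use_synonyms: bool = True) -> str:
    """Map a word to one representative of its synonym group (hypothetical helper)."""
    word = word.lower()
    if use_synonyms:
        for group in SYNONYM_GROUPS:
            if word in group:
                # Any fixed representative works; min() keeps it deterministic.
                return min(group)
    return word


def synonym_rouge_1(candidate: str, reference: str, use_synonyms: bool = True) -> dict:
    """Unigram precision, recall, and F1, optionally counting synonyms as matches."""
    cand = Counter(canonical(w, use_synonyms) for w in candidate.split())
    ref = Counter(canonical(w, use_synonyms) for w in reference.split())

    # Clipped overlap: each token matches at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())

    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    total = precision + recall
    f1 = 0.0 if total == 0 else 2 * precision * recall / total
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    cand = "the automobile is fast"
    ref = "the car is quick"
    print("exact match:  ", synonym_rouge_1(cand, ref, use_synonyms=False))
    print("with synonyms:", synonym_rouge_1(cand, ref, use_synonyms=True))
```

In this toy example, exact matching credits only "the" and "is", while the synonym-aware variant also matches "automobile" with "car" and "fast" with "quick". This is the kind of paraphrase that exact-match ROUGE penalizes.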
The experiments demonstrate the superior performance of ROUGE-SS in evaluating
abstractive text summarization models compared to existing ROUGE variants. ROUGE-SS
yields higher F1 scores and better overall performance, achieving a significant reduction in
training loss and impressive accuracy. The proposed technique is evaluated on datasets
including CNN/Daily Mail, DUC-2004, Gigaword, and Inshorts News. Across these datasets,
ROUGE-SS gives better results than the other ROUGE variants, improving the F1 score by an
average of 8.8%. These findings underscore the effectiveness of ROUGE-SS in capturing
semantic similarity and in providing a more comprehensive evaluation metric for abstractive
summarization.
In conclusion, the introduction of ROUGE-SS represents a significant advancement in the field of abstractive text summarization evaluation. By extending beyond exact
word matching to incorporate synonyms and semantic context, ROUGE-SS offers researchers
a more effective tool for assessing summarization quality. This study highlights the importance
of considering semantic meaning in evaluation metrics and provides a promising direction for
future research on abstractive text summarization.