Rank-Aware Gain-Based Evaluation of Extractive Summarization
Mousumi Akter
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022-10-17
DOI: https://doi.org/10.1145/3511808.3557821
Abstract
ROUGE has long been a popular metric for evaluating text summarization because it avoids time-consuming and costly human evaluation. However, ROUGE is not a fair evaluation metric for the extractive summarization task, as it is based entirely on lexical overlap. Additionally, ROUGE ignores the quality of the ranker that performs the actual sentence/phrase extraction. The main focus of the thesis is to design an nCG (normalized cumulative gain)-based evaluation metric for extractive summarization that is both rank-aware and semantic-aware (called Sem-nCG). One fundamental contribution of the work is that it demonstrates how more reliable semantic-aware ground truths can be generated for evaluating extractive summarization without any additional human intervention. To the best of our knowledge, this work is the first of its kind. Preliminary experimental results demonstrate that the new Sem-nCG metric is indeed semantic-aware and also exhibits higher correlation with human judgment for single-document summarization when a single reference is considered.
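To make the idea of a normalized cumulative gain concrete, here is a minimal sketch of a generic nCG@k computation in Python. It is not the Sem-nCG implementation: the abstract does not say how per-sentence gains are derived, so the `gains` list, the function name `ncg_at_k`, and the example values are all hypothetical. The sketch assumes each document sentence already has a non-negative gain (for instance, some semantic-similarity score against the reference summary) and that the summarizer returns sentence indices in rank order.

```python
from typing import Sequence


def ncg_at_k(gains: Sequence[float], selected: Sequence[int], k: int) -> float:
    """Generic normalized cumulative gain at cutoff k (illustrative only).

    gains    -- hypothetical gain of each document sentence, indexed by position
    selected -- sentence indices chosen by the summarizer, best first
    k        -- number of top-ranked sentences to evaluate
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Cumulative gain collected by the summarizer's top-k sentences.
    cg = sum(gains[i] for i in selected[:k])
    # Ideal cumulative gain: the k highest gains available in the document.
    ideal = sum(sorted(gains, reverse=True)[:k])
    # Guard against a document in which every sentence has zero gain.
    return cg / ideal if ideal > 0 else 0.0


# Hypothetical gains for a five-sentence document.
gains = [0.9, 0.1, 0.7, 0.3, 0.0]

good_ranker = [0, 2, 1]   # picks the two highest-gain sentences first
weak_ranker = [1, 4, 0]   # picks low-gain sentences first

print(ncg_at_k(gains, good_ranker, k=2))
print(ncg_at_k(gains, weak_ranker, k=2))
```

In this sketch, a ranker that puts high-gain sentences in its top k scores close to 1, while one that ranks low-gain sentences first scores close to 0, regardless of word overlap with the reference. Note that plain cumulative gain, unlike discounted variants such as nDCG, does not weight positions within the top k.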