Similarity-Based Query-Focused Multi-document Summarization Using Crowdsourced and Manually-built Lexical-Semantic Resources
Muhidin A. Mohamed, M. Oussalah
2015 IEEE Trustcom/BigDataSE/ISPA, 2015-08-20
DOI: 10.1109/Trustcom.2015.565
Citations: 9
Abstract
In this paper we present an approach to extractive query-focused multi-document summarization that builds on enhanced knowledge-based measures of short-text semantic similarity. We combine the WordNet taxonomy with the Categorial Variation Database (CatVar) and Morphosemantic Links to determine query-sentence similarity and intra-sentence similarities. In addition, we enrich the WordNet-derived similarity with named-entity semantic relatedness inferred from Wikipedia and underpinned by the Normalized Google Distance. We show that our summarizer, which relies primarily on this improved semantic similarity measure to model the relevance, centrality and diversity factors, outperforms the best-performing relevant DUC systems and recent closely related studies in at least one of the investigated ROUGE metrics. The proposed summarizer design is augmented with an anti-redundancy mechanism based on the Maximum Marginal Relevance (MMR) algorithm.
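Two of the building blocks named in the abstract, the Normalized Google Distance and MMR-based sentence selection, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `lam` trade-off parameter and its default value are hypothetical, and the token-overlap `jaccard` function is only a stand-in for the paper's WordNet-, CatVar- and Wikipedia-based similarity measure, which the abstract does not specify in detail. The NGD function follows the standard published formula and assumes the index size `n` is larger than either single-term count.

```python
import math
from typing import Callable, List, Sequence


def normalized_google_distance(fx: float, fy: float, fxy: float, n: float) -> float:
    """Standard NGD from hit counts.

    fx, fy: number of pages containing each term; fxy: pages containing both;
    n: total number of indexed pages (assumed larger than min(fx, fy)).
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")  # terms never (co-)occur: treat as maximally distant
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a placeholder for a semantic similarity measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def mmr_select(
    query: str,
    sentences: Sequence[str],
    k: int,
    lam: float = 0.7,  # hypothetical default: weight of relevance vs. redundancy
    sim: Callable[[str, str], float] = jaccard,
) -> List[int]:
    """Greedy Maximum Marginal Relevance: return indices of up to k sentences."""
    selected: List[int] = []
    candidates = list(range(len(sentences)))

    def score(i: int) -> float:
        relevance = sim(sentences[i], query)
        # Redundancy is the highest similarity to any already selected sentence.
        redundancy = max(
            (sim(sentences[i], sentences[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1.0 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    docs = [
        "solar power reduces emissions",
        "solar power reduces emissions",  # exact duplicate of the first sentence
        "wind farms cut emissions",
    ]
    print(mmr_select("solar power emissions", docs, k=2, lam=0.5))
```

With `lam=0.5`, the duplicate sentence is penalised by its full similarity to the sentence already chosen, so the second pick falls on the less relevant but non-redundant third sentence. Swapping `sim` for a knowledge-based measure would leave the selection loop unchanged.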