Heuristics based automatic text summarization of unstructured text
M. K. Dalal, M. Zaveri
International Conference & Workshop on Emerging Trends in Technology, 2011-02-25
DOI: 10.1145/1980022.1980170 (https://doi.org/10.1145/1980022.1980170)
Citations: 8
Abstract
Automatic text summarization is a specialized text mining task that generates a summary or abstract from one or more input text documents. Researchers in this field have explored various heuristic and semi-supervised learning methods to generate generic as well as user-oriented summaries. This paper examines the effectiveness of well-known summarization heuristics when applied to generating single-document summary extracts of variable length. To evaluate the quality of the summaries, different human judges scored the original text documents and their summaries on soft metrics such as topic coverage, relative coherence, novelty, and information content, and the scores were compared statistically. It was experimentally verified that in 65% of the documents there was less than 10% variance between the scores assigned to the original texts and their summaries.
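The abstract does not say which heuristics the paper evaluates, so the following is only a minimal sketch of what a heuristic, variable-length, single-document extractor might look like. It assumes two commonly cited heuristics (average term frequency of a sentence's content words and a bonus for the opening sentence); the stopword list, the `ratio` parameter, and the function names `split_sentences` and `summarize` are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {
    "a", "an", "the", "of", "and", "or", "to", "in", "is", "are", "was",
    "were", "for", "on", "by", "with", "that", "this", "it", "as", "at", "be",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, ratio=0.3):
    """Return an extract containing roughly `ratio` of the input sentences.

    Sentences are scored by the average document frequency of their
    content words, plus a fixed bonus for the first sentence. The selected
    sentences are emitted in their original order.
    """
    sentences = split_sentences(text)
    if not sentences:
        return ""

    # Lowercase content words per sentence, with stopwords removed.
    tokenized = [
        [w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOPWORDS]
        for s in sentences
    ]
    freq = Counter(w for words in tokenized for w in words)

    scores = []
    for i, words in enumerate(tokenized):
        term_score = sum(freq[w] for w in words) / len(words) if words else 0.0
        position_bonus = 1.0 if i == 0 else 0.0  # hypothetical weight
        scores.append(term_score + position_bonus)

    # Variable summary length: keep at least one sentence.
    k = max(1, round(len(sentences) * ratio))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(top))
```

Calling `summarize(document, ratio=0.5)` keeps about half of the sentences, while a smaller `ratio` yields a shorter extract. Because the output reuses whole source sentences in their original order, it is an extract rather than an abstract, which matches the kind of summary the abstract describes.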