Multidocument Summarization using GloVe Word Embedding and Agglomerative Cluster Methods
R. Rosalina, Rafiqul Huda, Genta Sahuri
2020 IEEE International Conference on Sustainable Engineering and Creative Computing (ICSECC), 2020-12-16
DOI: 10.1109/ICSECC51444.2020.9557393 (https://doi.org/10.1109/ICSECC51444.2020.9557393)
Abstract
This paper explores a method of extractive multi-document summarization that extends single-document summarization methods by using information about each document and, where available, the relations between documents. Multi-document summarization differs from single-document summarization in the problems of fragmentation, density, duplication, and passage selection that must be solved to create efficient summaries. Our approach addresses these issues by combining agglomerative clustering of sentences, GloVe word embeddings, TextRank, and cosine similarity. In addition, this research uses NLTK as the library for filtering out stopwords, numerals, punctuation, multiple whitespace, and short words before vectorizing the sentences with GloVe. The results were evaluated using ROUGE: 41% for unigrams, 17% for bigrams, and 57% for trigrams.
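
The abstract names the components of the approach but does not say how they are connected. The sketch below shows one plausible arrangement and is not the authors' implementation. It rests on several assumptions: sentence vectors are the average of their word vectors, clustering uses average linkage on cosine distance, and the summary takes the top TextRank sentence from each cluster. To stay self-contained it replaces pretrained GloVe vectors with a hypothetical three-dimensional lookup table and NLTK's filters with a tiny hand-written stopword list. All function and variable names are illustrative.

```python
import math
import re

# Hypothetical 3-d vectors standing in for pretrained GloVe embeddings.
TOY_VECTORS = {
    "cats": [0.9, 0.1, 0.0], "dogs": [0.8, 0.2, 0.0],
    "pets": [0.85, 0.15, 0.0], "popular": [0.5, 0.5, 0.1],
    "stocks": [0.0, 0.1, 0.9], "markets": [0.1, 0.0, 0.9],
    "fell": [0.0, 0.3, 0.7],
}
# Tiny illustrative stopword list (assumption; stands in for NLTK's list).
STOPWORDS = {"the", "are", "and", "on"}

def preprocess(sentence, min_len=3):
    """Lowercase, keep alphabetic tokens only, drop stopwords and short words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) >= min_len]

def sentence_vector(tokens, dim=3):
    """Average the vectors of known tokens; zero vector if none are known."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]

def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)

def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering on cosine distance."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(1 - cosine(vectors[i], vectors[j])
                            for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters

def textrank(vectors, damping=0.85, iterations=50):
    """PageRank-style power iteration on the cosine-similarity graph."""
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                sim[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores

def summarize(sentences, n_clusters=2):
    """Pick the top-ranked sentence of each cluster, in document order."""
    vectors = [sentence_vector(preprocess(s)) for s in sentences]
    chosen = []
    for cluster in agglomerative(vectors, n_clusters):
        scores = textrank([vectors[i] for i in cluster])
        top = max(range(len(cluster)), key=lambda k: scores[k])
        chosen.append(cluster[top])
    return [sentences[i] for i in sorted(chosen)]

SENTENCES = [
    "Cats and dogs are popular pets.", "Dogs are pets.", "Cats are popular.",
    "Stocks fell on the markets.", "The markets fell.", "Stocks fell.",
]
summary = summarize(SENTENCES, n_clusters=2)
print(summary)
```

Clustering before ranking is what targets the duplication problem mentioned in the abstract: near-identical sentences from different documents land in the same cluster, and only one of them is selected. With real data, the lookup table would be replaced by vectors loaded from a pretrained GloVe file and the number of clusters would set the summary length.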