Validation of Graph Based K Nearest Neighbor for Summarizing News Articles
T. Jo
2019 International Conference on Green and Human Information Technology (ICGHIT)
Published: 2019-01-01
DOI: 10.1109/ICGHIT.2019.00022 (https://doi.org/10.1109/ICGHIT.2019.00022)
Citations: 1
Abstract
This research proposes a text summarization tool based on a modified version of the K Nearest Neighbor (KNN) algorithm that classifies a graph as summary or non-summary. Three facts motivate the research: a graph is a visual representation of data items, various similarity metrics between graphs have been defined, and text summarization can be viewed as a classification task to which a machine learning algorithm is applicable. The proposed system partitions a text into paragraphs, encodes each paragraph as a graph whose vertices are words and whose edges are semantic relations between words, and applies the modified KNN to classify each graph as summary or non-summary. The proposed approach is empirically validated as the better one in summarizing news articles domain by domain. Implementing text summarization systems therefore requires considering the domain granularity and pre-classifying each full text into a domain.
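The pipeline described in the abstract (partition into paragraphs, encode each as a word graph, classify with KNN) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: edges are drawn between words that co-occur within a small window as a stand-in for semantic relations, graph similarity is taken to be the Jaccard overlap of edge sets, and plain majority-vote KNN replaces the modified version, whose details the abstract does not give. All function names are hypothetical.

```python
import re
from collections import Counter


def paragraph_to_graph(paragraph, window=2):
    """Encode a paragraph as a set of undirected word-pair edges.

    Assumption: words within `window` positions of each other are linked.
    This co-occurrence rule stands in for the semantic relations the
    abstract mentions but does not define.
    """
    words = re.findall(r"[a-z]+", paragraph.lower())
    edges = set()
    for i, w in enumerate(words):
        for v in words[i + 1 : i + 1 + window]:
            if v != w:
                edges.add(frozenset((w, v)))
    return edges


def graph_similarity(g1, g2):
    """Jaccard overlap of two edge sets (an assumed metric)."""
    if not g1 and not g2:
        return 0.0
    return len(g1 & g2) / len(g1 | g2)


def knn_classify(graph, labeled_graphs, k=3):
    """Majority vote among the k labeled graphs most similar to `graph`."""
    ranked = sorted(
        labeled_graphs,
        key=lambda item: graph_similarity(graph, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def summarize(text, labeled_graphs, k=3):
    """Return the paragraphs whose graphs are classified as 'summary'."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        p
        for p in paragraphs
        if knn_classify(paragraph_to_graph(p), labeled_graphs, k) == "summary"
    ]
```

In this sketch, `labeled_graphs` is a list of `(graph, label)` pairs built from paragraphs already marked "summary" or "non-summary". Following the abstract's point about domains, one such list would be kept per news domain, and each article would be routed to its domain's list before summarization.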