Implementation of Textrank Algorithm in Product Review Summarization

2020 4th International Conference on Informatics and Computational Sciences (ICICoS). Pub Date: 2020-11-10. DOI: 10.1109/ICICoS51170.2020.9299005

M. R. Ramadhan, S. Endah, Aprinaldi Jasa Mantau


Citations: 3

Abstract

Internet technology led to the emergence of Web 2.0, which increased the amount of User Generated Content (UGC) on the network. Online product reviews are one form of UGC. The case study in this research is reviews of handphone (mobile phone) products. Reading and comparing a large number of product reviews takes a long time, so a technique is needed that lets online product reviews be read quickly without losing their important information. Text summarization is such a technique: it produces a simplified version of a text. In general, text summarization can be divided into two types, extractive and abstractive. This research used extractive summarization. One important component in obtaining an extractive summary is sentence extraction, and in this study the algorithm used for sentence extraction is TextRank. The purpose of this study was to determine the performance of the TextRank algorithm on handphone product review data by applying it under different data conditions, defined by the presence or absence of stopwords and typos. These data conditions are used to formulate the test scenarios. Testing is done by calculating the ROUGE-1 value, which compares the system's summary with the experts' summaries. Two experts were involved in this study. Expert 1 is a person with expertise in the Indonesian language, and Expert 2 is someone with knowledge and understanding of mobile phones of various types and characteristics. In the test results, Expert 1 gets the best result in scenario 2, where the data contain typos and no stopwords, with an average ROUGE-1 of 42.29%, and Expert 2 gets the best result in scenario 3, where the data contain no typos and do contain stopwords, with an average ROUGE-1 of 46.71%. The results show that the TextRank algorithm is not able to produce a good summary for the handphone product review dataset.
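The pipeline described in the abstract (score sentences with TextRank, keep the top-ranked ones, compare against an expert summary with ROUGE-1) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the word-overlap sentence similarity from the original TextRank formulation, a damping factor of 0.85, a fixed number of iterations, and ROUGE-1 computed as unigram recall; the abstract does not say which ROUGE-1 variant was used. The function names, the stopword set, and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence, stopwords=frozenset()):
    """Lowercase word tokens, optionally dropping stopwords."""
    return [w for w in re.findall(r"\w+", sentence.lower()) if w not in stopwords]


def similarity(a, b):
    """Shared distinct words divided by the sum of log sentence lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:  # both sentences have a single token
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, stopwords=frozenset(), d=0.85, iterations=50):
    """Return one score per sentence from a weighted PageRank-style iteration."""
    tokens = [tokenize(s, stopwords) for s in sentences]
    n = len(sentences)
    # Symmetric weight matrix; a sentence is not linked to itself.
    w = [[similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out = [sum(row) for row in w]  # total edge weight leaving each sentence
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - d) + d * sum(w[j][i] * scores[j] / out[j]
                              for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2, stopwords=frozenset()):
    """Pick the k highest-scoring sentences, kept in their original order."""
    scores = textrank(sentences, stopwords)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def rouge1_recall(system, reference):
    """Unigram recall: overlapping unigrams divided by reference unigrams."""
    sys_counts = Counter(tokenize(system))
    ref_counts = Counter(tokenize(reference))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(c, sys_counts[word]) for word, c in ref_counts.items())
    return overlap / total


# Hypothetical review sentences, for illustration only.
reviews = [
    "The battery of this phone lasts a long time.",
    "The phone battery is good and the camera is good.",
    "Delivery was quick.",
    "The camera of this phone is sharp.",
]
summary = summarize(reviews, k=2, stopwords=frozenset({"the", "of", "is", "a"}))
score = rouge1_recall(" ".join(summary), "The phone battery lasts long and the camera is good.")
```

Passing an empty or non-empty `stopwords` set is one way to reproduce the "with stopwords" and "without stopwords" data conditions; typo correction would be a separate preprocessing step that this sketch does not cover.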


