Online Media as a Price Monitor: Text Analysis using Text Extraction Technique and Jaro-Winkler Similarity Algorithm
Vivine Nurcahyawati, Z. Mustaffa
2020 Emerging Technology in Computing, Communication and Electronics (ETCCE), published 2020-12-21
https://doi.org/10.1109/ETCCE51779.2020.9350898
Online media has become an essential part of everyday life in modern society. Any individual or organization is free to share opinions and feelings on any topic there, including information and news about commodity price fluctuations. Commodity price data from the National Strategic Price Information Center (NSPIC) website is not real-time, so it is insufficient as a basis for monitoring commodity price fluctuations. Meanwhile, the government needs to collect data and information about these fluctuations quickly so that strategic decisions and policies to stabilize prices can be made immediately. This study explores the potential of online media by extracting its text and analyzing that text to surface the commodity price data sought. The commodities used as search keywords were those with the highest consumption levels in Indonesia in 2016. The texts were taken from three online media sources: Twitter, Liputan6.com, and Detik.com. They were analyzed using text extraction techniques and the Jaro-Winkler algorithm to find commodity prices in the text collection, and the results were then compared with commodity prices from the NSPIC website. The experimental data comprised 99,007 items collected over three months. Only 122 items matched the keywords; these were split into 100 training items and 22 testing items. The text analysis shows that Detik.com yielded commodity prices closest to the NSPIC price data, while Twitter yielded the farthest. Accuracy, measured with a confusion matrix, was 75%. Based on this research, online media texts are a viable source for monitoring commodity price fluctuations.
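The abstract names the Jaro-Winkler algorithm but does not show how it is applied. The following is a minimal Python sketch of the standard Jaro-Winkler similarity, not the authors' implementation. The helper `best_keyword`, the 0.9 threshold, and the Indonesian keywords (`beras`, `gula`, `cabai`) are assumptions chosen only for illustration; the prefix weight of 0.1 and the prefix cap of 4 are the conventional defaults.

```python
from typing import Optional, Sequence


def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order are transpositions.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def best_keyword(token: str, keywords: Sequence[str],
                 threshold: float = 0.9) -> Optional[str]:
    """Hypothetical helper: return the closest keyword, or None if below threshold.

    Expects a non-empty keyword list.
    """
    score, keyword = max((jaro_winkler(token.lower(), k.lower()), k) for k in keywords)
    return keyword if score >= threshold else None


# Classic textbook pair: the similarity is about 0.961.
print(round(jaro_winkler("MARTHA", "MARHTA"), 3))
# A misspelled token still maps to the illustrative keyword "beras" (rice).
print(best_keyword("bers", ["beras", "gula", "cabai"]))
```

Because the Winkler adjustment rewards a shared prefix, this measure tolerates misspellings near the end of a word better than those at the start, which suits short commodity names typed informally on social media. Any threshold used in practice would need tuning against labeled data.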