Comparison of LSTM and IndoBERT Methods in Identifying Hoaxes on Twitter

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), Pub Date: 2023-06-02, DOI: 10.29207/resti.v7i3.4830

Muhammad Ikram Kaer Sinapoy, Yuliant Sibaroni, Sri Suryani Prasetyowati

Abstract

The number of social media users has grown significantly in recent years. In January 2022, social media users in Indonesia reached 191 million, an increase of 12.35% over the previous year's 170 million. With this growth, more and more people seek and consume information through social media. Despite its many advantages, the quality of information on social media is lower than in traditional news media, and a great deal of hoax information spreads there. The harm caused by hoaxes has prompted much research on detecting hoax information on social media, especially information that spreads widely on Twitter. Previous studies have used a variety of machine learning and deep learning models to detect hoaxes. Deep learning performs well on a range of text classification tasks, including hoax detection. The aim of this paper is to compare the LSTM and IndoBERT methods for detecting hoaxes using datasets taken from Twitter. Two experiments are conducted, one with LSTM and one with IndoBERT. The reported results are averages obtained with 10-fold cross-validation. The IndoBERT model achieves an average accuracy of 92.07%, and the LSTM model an average accuracy of 87.54%. IndoBERT thus outperforms the LSTM model and gives the best average accuracy in this study.
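
The evaluation protocol mentioned in the abstract, averaging accuracy over 10-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' code: the helper names (`k_fold_indices`, `cross_validated_accuracy`, `train_majority_baseline`), the synthetic tweets and labels, and the majority-class placeholder model are all assumptions made for illustration. In the study, an LSTM or IndoBERT classifier would take the place of the placeholder.

```python
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(texts, labels, train_fn, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average accuracy."""
    folds = k_fold_indices(len(texts), k, seed)
    scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(texts)) if i not in held_out_set]
        # train_fn returns a function mapping one text to a predicted label.
        predict = train_fn([texts[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        correct = sum(predict(texts[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores, mean(scores)


def train_majority_baseline(train_texts, train_labels):
    """Placeholder model (assumption): always predicts the most common label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda text: majority


# Synthetic toy data, not the paper's Twitter dataset.
texts = [f"tweet {i}" for i in range(100)]
labels = ["hoax" if i % 4 == 0 else "not_hoax" for i in range(100)]

fold_scores, average_accuracy = cross_validated_accuracy(
    texts, labels, train_majority_baseline, k=10)
print(len(fold_scores), round(average_accuracy, 4))
```

Each sample is held out exactly once, so the averaged figure reflects performance on data the model did not see during training in that fold. On this toy data the placeholder simply recovers the share of the majority label; the numbers say nothing about the paper's reported accuracies.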
