A Novel Named Entity Recognition approach of Indonesian fake news using part of speech and BERT model on presidential election
Authors: Puji Winar Cahyo, Ulfi Saidata Aesyi, Widodo Agus Setianto, Tatang Sulaiman
Journal: International Journal of Information Management Data Insights, 5(2), Article 100354
Published: 2025-07-14
DOI: 10.1016/j.jjimei.2025.100354
URL: https://www.sciencedirect.com/science/article/pii/S2667096825000369
Citation count: 0
Abstract
Fake news often spreads rapidly and can mislead readers, so such information must be approached with caution. In text-based information, content extraction can be used to determine the meaning and intent of a message. This research therefore develops a novel approach to entity detection in Indonesian-language fake news texts using BiLSTM-CRF, BiGRU, and BERT models. The novelty of the study lies in applying Part-of-Speech (PoS) tagging before words are processed for entity detection. Words tagged as Noun (NN) or Proper Noun (NNP) are transformed into entity labels such as ORG for organizations, PER for people, and LOC for locations, while words tagged as Verb (VB) are converted into the ACT entity to represent actions. With PoS tagging integrated into entity detection, the BiLSTM-CRF model achieved an F1-Score of 81.26%, the BiGRU-based model 79.46%, and the BERT-based model the highest F1-Score of 87.38%. These results indicate that the BERT model combined with PoS tagging performs best of the three and can be used effectively to detect entities in fake news. The entity detection process was then applied to fake news from the 2024 Indonesian presidential and vice-presidential election period. Counting the mentions of each candidate and running mate labeled as PER entities showed that the Prabowo Subianto–Gibran Rakabuming Raka pair appeared in 49 fake news articles, followed by the Ganjar Pranowo–Mahfud MD pair with 14 articles and the Anies Baswedan–Muhaimin Iskandar pair with 13 articles. All identified data were filtered to retain only unique entries.
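The two steps the abstract describes, letting PoS tags decide which tokens may carry an entity label and then counting unique articles that mention each candidate pair as a PER entity, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The gating rule, the function names `gate_labels` and `count_articles_per_pair`, the alias sets in `PAIRS`, and the whitespace-and-case deduplication rule are all assumptions made for illustration; the abstract does not specify how these steps were carried out.

```python
from collections import Counter

# Assumption: a PoS tag restricts which entity labels a token may receive.
# NN/NNP tokens may become ORG, PER, or LOC; VB tokens may become ACT.
ALLOWED_LABELS = {
    "NN": {"ORG", "PER", "LOC"},
    "NNP": {"ORG", "PER", "LOC"},
    "VB": {"ACT"},
}

def gate_labels(pos_tags, predicted_labels):
    """Keep a predicted label only if the token's PoS tag permits it, else 'O'."""
    return [
        label if label in ALLOWED_LABELS.get(pos, set()) else "O"
        for pos, label in zip(pos_tags, predicted_labels)
    ]

# Hypothetical alias sets; a real system would need fuller name matching.
PAIRS = {
    "Prabowo-Gibran": {"prabowo", "gibran"},
    "Ganjar-Mahfud": {"ganjar", "mahfud"},
    "Anies-Muhaimin": {"anies", "muhaimin"},
}

def count_articles_per_pair(articles):
    """Count unique articles in which a pair member appears as a PER entity.

    articles: iterable of (text, [(token, label), ...]) tuples.
    """
    seen = set()
    counts = Counter()
    for text, entities in articles:
        # Assumed deduplication rule: case- and whitespace-insensitive text match.
        key = " ".join(text.lower().split())
        if key in seen:
            continue
        seen.add(key)
        per_tokens = {tok.lower() for tok, label in entities if label == "PER"}
        for pair, aliases in PAIRS.items():
            if per_tokens & aliases:
                counts[pair] += 1  # one count per article, not per mention
    return counts
```

In this sketch a PER prediction on a token whose PoS tag is neither NN nor NNP is discarded, and an article is counted at most once per pair, which matches the abstract's statement that only unique entries were retained. Whether the original pipeline applies PoS tags as a hard filter or as an input feature to the models is not stated in the abstract.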