Analyzing common lexical features of fake news using multi-head attention weights

IF 6 · Zone 3 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)

Internet of Things Pub Date : 2024-10-24 DOI:10.1016/j.iot.2024.101409

Mamoru Mimura, Takayuki Ishimaru

Volume 28, Article 101409. URL: https://www.sciencedirect.com/science/article/pii/S2542660524003500

Citations: 0

Abstract

Numerous machine learning approaches have been developed to identify fake news; however, these methods are predominantly evaluated on single datasets specific to particular fields, and research on versatile models that adapt to a range of domains remains scarce. This study evaluates the adaptability of a fake news detection model across diverse fields using three distinct datasets. The study also uses the multi-head attention mechanism of Bidirectional Encoder Representations from Transformers (BERT) to examine the feature extraction process in the model. In our analysis, we focused on the words that the machine learning model commonly emphasizes in fake news detection. The dataset comprised 27,442 instances of genuine news and 28,359 instances of fabricated news, each explicitly labeled. To examine the focal words, we used multi-head attention, a component of BERT, which assigns greater weight to words that receive more attention. Our investigation aimed to identify which words were assigned higher weights in each article. The findings indicate that a common characteristic of fake news, although it accounts for only a small percentage, is heightened attention to words that influence the credibility of the article. To assess the versatility of the model, we applied a model trained on one dataset to classify the other datasets. The results show a notable decline in accuracy, attributable to the distinctive characteristics of the training data. These observations suggest that the common features of fake news that can be extracted using the fine-tuned BERT model are limited.
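The idea of ranking words by the attention weight they receive can be illustrated with a minimal sketch. The abstract does not say how the weights were aggregated, so the choices here are assumptions made for illustration only: the score for each token is the attention paid to it from the first (`[CLS]`) position, averaged over heads. The function names, the tokens, and the toy weights are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# attn[h][i][j]: weight that head h assigns from query position i to key position j.
Attention = List[List[List[float]]]


def cls_attention_per_token(attn: Attention) -> List[float]:
    """Average over heads the attention that position 0 ([CLS]) pays to each token.

    Assumption: using the [CLS] row and a plain mean over heads is one
    possible aggregation, not necessarily the one used in the study.
    """
    num_heads = len(attn)
    seq_len = len(attn[0][0])
    return [
        sum(attn[h][0][j] for h in range(num_heads)) / num_heads
        for j in range(seq_len)
    ]


def top_tokens(
    tokens: Sequence[str],
    attn: Attention,
    k: int = 3,
    skip: Tuple[str, ...] = ("[CLS]", "[SEP]", "[PAD]"),
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring tokens, ignoring special tokens."""
    scores = cls_attention_per_token(attn)
    ranked = sorted(
        ((tok, score) for tok, score in zip(tokens, scores) if tok not in skip),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Toy input: 2 heads over 5 tokens. Only row 0 matters for this sketch,
# so the remaining rows are filled with uniform weights.
tokens = ["[CLS]", "sources", "reportedly", "confirm", "[SEP]"]
uniform = [0.2] * 5
attn = [
    [[0.1, 0.2, 0.5, 0.1, 0.1]] + [uniform] * 4,  # head 0
    [[0.1, 0.1, 0.6, 0.1, 0.1]] + [uniform] * 4,  # head 1
]

top = top_tokens(tokens, attn)
for token, score in top:
    print(f"{token}\t{score:.2f}")
```

With a real fine-tuned model, the nested list would be replaced by the attention tensor of one layer for one article; which layer to inspect is a further choice that this sketch leaves open.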


Source journal

Internet of Things Multiple-

CiteScore: 3.60

Self-citation rate: 5.10%

Articles published: 115

Review time: 37 days

About the journal: Internet of Things; Engineering Cyber Physical Human Systems is a comprehensive journal that encourages cross-collaboration between researchers, engineers, and practitioners in the field of IoT and Cyber Physical Human Systems. The journal offers a unique platform for exchanging scientific information across the entire breadth of technology, science, and societal applications of the IoT. The journal places a high priority on timely publication and provides a home for high-quality work. IoT is also interested in publishing topical Special Issues on any aspect of IoT.