Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism

Nur Azmina Mohamad Zamani, Norhaslinda Kamaruddin
Journal of Information Systems Engineering and Business Intelligence, published 2023-11-01
DOI: 10.20473/jisebi.9.2.147-160 (https://doi.org/10.20473/jisebi.9.2.147-160)
Citations: 0

Abstract

Background: Growing interest in cryptocurrencies has led to opinions on cryptocurrency-related topics being shared widely in news and on social media. The large volume of sentiment data released at high frequency makes processing and analyzing it more challenging. In addition, existing sentiment models in the cryptocurrency domain focus primarily on English, with minimal work on the Malay language, which further complicates the problem.
Objective: This study examines the performance of a sentiment regression model in forecasting sentiment scores for Malay news and tweets.
Methods: Malay news headlines and tweets on Bitcoin and Ethereum are used as input. The proposed sentiment regression implementation applies a hybrid of the Generalized Autoregressive Pretraining for Language Understanding (XLNet) language model and a Bidirectional Gated Recurrent Unit (Bi-GRU) deep learning model. The effectiveness of the proposed model is also investigated with a multi-head self-attention mechanism. A comparative analysis against Bidirectional Encoder Representations from Transformers (BERT) is then carried out.
Results: The experimental results demonstrate that the number of attention heads is vital to improving the performance of the XLNet-GRU sentiment model. The adjusted R² values improve slightly, by 0.03, with an average MAE of 0.163 (Malay news) and 0.174 (Malay tweets). In addition, average RMSE values of 0.267 and 0.255 were obtained for Malay news and tweets, respectively, which shows that the proposed XLNet-GRU sentiment model outperforms the BERT sentiment model, with lower prediction errors.
Conclusion: The proposed model contributes to predicting sentiment on cryptocurrency. This study also introduces two carefully curated Malay corpora, CryptoSentiNews-Malay and CryptoSentiTweets-Malay, extracted from news and tweets, respectively. Further work will enhance the Malay news and tweet corpora on cryptocurrency-related issues while implementing the proposed XLNet Bi-GRU deep learning model, for greater financial insight.
Keywords: Cryptocurrency, Deep learning model, Malay text, Sentiment analysis, Sentiment regression model
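
The three error measures named in the Results (MAE, RMSE, and adjusted R²) can be computed as in the following minimal sketch. It is not the authors' evaluation code: the function names, the sample scores, and the `n_predictors` value are illustrative assumptions, and the abstract does not state how the number of predictors was chosen for the adjusted R².

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between gold and predicted sentiment scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large errors more than MAE does."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(y_true, y_pred, n_predictors):
    """R^2 corrected for sample size n and predictor count p:
    1 - (1 - R^2) * (n - 1) / (n - p - 1).
    """
    n = len(y_true)
    return 1.0 - (1.0 - r2(y_true, y_pred)) * (n - 1) / (n - n_predictors - 1)


if __name__ == "__main__":
    # Made-up scores for illustration only; not data from the paper.
    gold = [0.5, -0.2, 0.1, 0.8, -0.6]
    predicted = [0.4, -0.1, 0.3, 0.6, -0.5]
    print("MAE:", mae(gold, predicted))
    print("RMSE:", rmse(gold, predicted))
    # n_predictors=1 is an assumption made only so the formula can be evaluated.
    print("Adjusted R2:", adjusted_r2(gold, predicted, n_predictors=1))
```

Lower MAE and RMSE indicate smaller prediction errors, while a higher adjusted R² indicates that the model explains more of the variance in the gold scores, which is the sense in which the abstract compares the XLNet-GRU and BERT models.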