The construction of an accurate Arabic sentiment analysis system based on resources alteration and approaches comparison

Impact factor: 12.3 | JCR: Q1 | Category: Computer Science, Information Systems
Ibtissam Touahri
Journal: Applied Computing and Informatics
DOI: 10.1108/aci-12-2021-0338 (https://doi.org/10.1108/aci-12-2021-0338)
Publication date: 2022-06-29
Publication type: Journal Article
Citations: 2

Abstract

The construction of an accurate Arabic sentiment analysis system based on resources alteration and approaches comparison
Purpose: This paper proposes a multi-facet sentiment analysis system.

Design/methodology/approach: This paper uses multidomain resources to build a sentiment analysis system. Features based on a manual lexicon are extracted from the resources and fed into a machine learning classifier, and their performance is then compared. Because the manual lexicon is time-consuming to construct, it is replaced with a custom bag of words (BOW). To help the system run faster and make the model interpretable, the feature set is reduced using existing and custom approaches such as term occurrence, information gain, principal component analysis, semantic clustering, and POS tagging filters.

Findings: The proposed system, characterized by automated lexicon extraction and feature size optimization, proved its efficiency on multidomain and benchmark datasets, reaching 93.59% accuracy, which makes it competitive with state-of-the-art systems.

Originality/value: The construction of a custom BOW, and the optimization of features based on existing and custom feature selection and clustering approaches.
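The abstract names term occurrence and information gain among its feature-reduction approaches. The following is a minimal sketch of how those two filters could be applied to a bag of words, using only the Python standard library. It is not the paper's implementation: the function names, the whitespace tokenizer, the thresholds, and the four toy reviews are all assumptions made for illustration, and a real Arabic pipeline would normalize and stem the text first.

```python
import math
from collections import Counter


def tokenize(text):
    """Split on whitespace (an assumption; no Arabic normalization or stemming)."""
    return text.split()


def build_bow(docs, min_occurrence=2):
    """Term-occurrence filter: keep terms found in at least `min_occurrence` documents."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return sorted(term for term, n in doc_freq.items() if n >= min_occurrence)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(term, docs, labels):
    """Information gain of the binary feature 'term occurs in the document'."""
    present = [y for d, y in zip(docs, labels) if term in set(tokenize(d))]
    absent = [y for d, y in zip(docs, labels) if term not in set(tokenize(d))]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def select_features(docs, labels, min_occurrence=2, top_k=2):
    """Apply the occurrence filter, then keep the `top_k` terms by information gain."""
    vocab = build_bow(docs, min_occurrence)
    ranked = sorted(vocab, key=lambda t: information_gain(t, docs, labels), reverse=True)
    return ranked[:top_k]


def vectorize(doc, features):
    """Count how often each selected feature occurs in one document."""
    counts = Counter(tokenize(doc))
    return [counts[f] for f in features]


# Toy reviews invented for this sketch (two positive, then two negative).
docs = [
    "خدمة ممتازة وطعام رائع",
    "طعام رائع وسعر جيد",
    "خدمة سيئة وطعام بارد",
    "سعر مرتفع وخدمة سيئة",
]
labels = ["pos", "pos", "neg", "neg"]

selected = select_features(docs, labels)
vectors = [vectorize(d, selected) for d in docs]
```

The resulting `vectors` could then be passed to any classifier. Ranking by information gain keeps the terms whose presence best separates the classes, which reduces the feature count and leaves a short, inspectable vocabulary.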
Source journal
Applied Computing and Informatics (Computer Science: Information Systems)
CiteScore: 12.20
Self-citation rate: 0.00%
Articles published: 0
Review time: 39 weeks
About the journal: Applied Computing and Informatics aims to be timely in disseminating leading-edge knowledge to researchers, practitioners and academics whose interest is in the latest developments in applied computing and information systems concepts, strategies, practices, tools and technologies. In particular, the journal encourages research studies that have significant contributions to make to the continuous development and improvement of IT practices in the Kingdom of Saudi Arabia and other countries. By doing so, the journal attempts to bridge the gap between the academic and industrial communities, and therefore welcomes theoretically grounded, methodologically sound research studies that address various IT-related problems and innovations of an applied nature. The journal will serve as a forum for practitioners, researchers, managers and IT policy makers to share their knowledge and experience in the design, development, implementation, management and evaluation of various IT applications. Contributions may deal with, but are not limited to:
• Internet and E-Commerce Architecture, Infrastructure, Models, Deployment Strategies and Methodologies.
• E-Business and E-Government Adoption.
• Mobile Commerce and its Applications.
• Applied Telecommunication Networks.
• Software Engineering Approaches, Methodologies, Techniques, and Tools.
• Applied Data Mining and Warehousing.
• Information Strategic Planning and Resource Management.
• Applied Wireless Computing.
• Enterprise Resource Planning Systems.
• IT Education.
• Societal, Cultural, and Ethical Issues of IT.
• Policy, Legal and Global Issues of IT.
• Enterprise Database Technology.