AEGA:基于ANOVA和扩展遗传算法的增强特征选择,用于在线客户评论分析。

IF 2.5 3区 计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Gyananjaya Tripathy, Aakanksha Sharaff
{"title":"AEGA:基于ANOVA和扩展遗传算法的增强特征选择,用于在线客户评论分析。","authors":"Gyananjaya Tripathy,&nbsp;Aakanksha Sharaff","doi":"10.1007/s11227-023-05179-2","DOIUrl":null,"url":null,"abstract":"<p><p>Sentiment analysis involves extricating and interpreting people's views, feelings, beliefs, etc., about diverse actualities such as services, goods, and topics. People intend to investigate the users' opinions on the online platform to achieve better performance. Regardless, the high-dimensional feature set in an online review study affects the interpretation of classification. Several studies have implemented different feature selection techniques; however, getting a high accuracy with a very minimal number of features is yet to be accomplished. This paper develops an effective hybrid approach based on an enhanced genetic algorithm (GA) and analysis of variance (ANOVA) to achieve this purpose. To beat the local minima convergence problem, this paper uses a unique two-phase crossover and impressive selection approach, gaining high exploration and fast convergence of the model. The use of ANOVA drastically reduces the feature size to minimize the computational burden of the model. Experiments are performed to estimate the algorithm performance using different conventional classifiers and algorithms like GA, Particle Swarm Optimization (PSO), Recursive Feature Elimination (RFE), Random Forest, ExtraTree, AdaBoost, GradientBoost, and XGBoost. The proposed novel approach gives impressive results using the Amazon Review dataset with an accuracy of 78.60 %, F1 score of 79.38 %, and an average precision of 0.87, and the Restaurant Customer Review dataset with an accuracy of 77.70 %, F1 score of 78.24 %, and average precision of 0.89 as compared to other existing algorithms. The result shows that the proposed model outperforms other algorithms with nearly 45 and 42% fewer features for the Amazon Review and Restaurant Customer Review datasets.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":" ","pages":"1-30"},"PeriodicalIF":2.5000,"publicationDate":"2023-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10031171/pdf/","citationCount":"0","resultStr":"{\"title\":\"AEGA: enhanced feature selection based on ANOVA and extended genetic algorithm for online customer review analysis.\",\"authors\":\"Gyananjaya Tripathy,&nbsp;Aakanksha Sharaff\",\"doi\":\"10.1007/s11227-023-05179-2\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Sentiment analysis involves extricating and interpreting people's views, feelings, beliefs, etc., about diverse actualities such as services, goods, and topics. People intend to investigate the users' opinions on the online platform to achieve better performance. Regardless, the high-dimensional feature set in an online review study affects the interpretation of classification. Several studies have implemented different feature selection techniques; however, getting a high accuracy with a very minimal number of features is yet to be accomplished. This paper develops an effective hybrid approach based on an enhanced genetic algorithm (GA) and analysis of variance (ANOVA) to achieve this purpose. To beat the local minima convergence problem, this paper uses a unique two-phase crossover and impressive selection approach, gaining high exploration and fast convergence of the model. The use of ANOVA drastically reduces the feature size to minimize the computational burden of the model. Experiments are performed to estimate the algorithm performance using different conventional classifiers and algorithms like GA, Particle Swarm Optimization (PSO), Recursive Feature Elimination (RFE), Random Forest, ExtraTree, AdaBoost, GradientBoost, and XGBoost. The proposed novel approach gives impressive results using the Amazon Review dataset with an accuracy of 78.60 %, F1 score of 79.38 %, and an average precision of 0.87, and the Restaurant Customer Review dataset with an accuracy of 77.70 %, F1 score of 78.24 %, and average precision of 0.89 as compared to other existing algorithms. The result shows that the proposed model outperforms other algorithms with nearly 45 and 42% fewer features for the Amazon Review and Restaurant Customer Review datasets.</p>\",\"PeriodicalId\":50034,\"journal\":{\"name\":\"Journal of Supercomputing\",\"volume\":\" \",\"pages\":\"1-30\"},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2023-03-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10031171/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Supercomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11227-023-05179-2\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Supercomputing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11227-023-05179-2","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

摘要

情感分析包括提取和解释人们对服务、商品和主题等不同现实的看法、感受、信仰等。人们打算调查用户在网络平台上的意见,以获得更好的表现。无论如何,在线综述研究中的高维特征集会影响分类的解释。一些研究已经实现了不同的特征选择技术;然而,用极少量的特征获得高精度还没有实现。为了实现这一目的,本文开发了一种基于增强遗传算法(GA)和方差分析(ANOVA)的有效混合方法。为了克服局部极小收敛问题,本文采用了一种独特的两阶段交叉和令人印象深刻的选择方法,获得了模型的高探索性和快速收敛性。ANOVA的使用大大减少了特征大小,以最小化模型的计算负担。使用不同的传统分类器和算法,如GA、粒子群优化(PSO)、递归特征消除(RFE)、随机森林、ExtraTree、AdaBoost、GradientBoost和XGBoost,进行了算法性能评估实验。与其他现有算法相比,所提出的新方法使用准确率为78.60%、F1得分为79.38%、平均精度为0.87的Amazon Review数据集和准确率为77.70%、F1得分78.24%、平均精度0.89的Restaurant Customer Review数据集得出了令人印象深刻的结果。结果表明,所提出的模型在Amazon Review和Restaurant Customer Review数据集的特征减少了近45%和42%,优于其他算法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
AEGA: enhanced feature selection based on ANOVA and extended genetic algorithm for online customer review analysis.

Sentiment analysis involves extricating and interpreting people's views, feelings, beliefs, etc., about diverse actualities such as services, goods, and topics. People intend to investigate the users' opinions on the online platform to achieve better performance. Regardless, the high-dimensional feature set in an online review study affects the interpretation of classification. Several studies have implemented different feature selection techniques; however, getting a high accuracy with a very minimal number of features is yet to be accomplished. This paper develops an effective hybrid approach based on an enhanced genetic algorithm (GA) and analysis of variance (ANOVA) to achieve this purpose. To beat the local minima convergence problem, this paper uses a unique two-phase crossover and impressive selection approach, gaining high exploration and fast convergence of the model. The use of ANOVA drastically reduces the feature size to minimize the computational burden of the model. Experiments are performed to estimate the algorithm performance using different conventional classifiers and algorithms like GA, Particle Swarm Optimization (PSO), Recursive Feature Elimination (RFE), Random Forest, ExtraTree, AdaBoost, GradientBoost, and XGBoost. The proposed novel approach gives impressive results using the Amazon Review dataset with an accuracy of 78.60 %, F1 score of 79.38 %, and an average precision of 0.87, and the Restaurant Customer Review dataset with an accuracy of 77.70 %, F1 score of 78.24 %, and average precision of 0.89 as compared to other existing algorithms. The result shows that the proposed model outperforms other algorithms with nearly 45 and 42% fewer features for the Amazon Review and Restaurant Customer Review datasets.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Journal of Supercomputing
Journal of Supercomputing 工程技术-工程:电子与电气
CiteScore
6.30
自引率
12.10%
发文量
734
审稿时长
13 months
期刊介绍: The Journal of Supercomputing publishes papers on the technology, architecture and systems, algorithms, languages and programs, performance measures and methods, and applications of all aspects of Supercomputing. Tutorial and survey papers are intended for workers and students in the fields associated with and employing advanced computer systems. The journal also publishes letters to the editor, especially in areas relating to policy, succinct statements of paradoxes, intuitively puzzling results, partial results and real needs. Published theoretical and practical papers are advanced, in-depth treatments describing new developments and new ideas. Each includes an introduction summarizing prior, directly pertinent work that is useful for the reader to understand, in order to appreciate the advances being described.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信