Ensemble feature selection via CoCoSo method extended to interval-valued intuitionistic fuzzy environment

IF 4.4; Zone 2, Mathematics; Q1, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
K. Janani, S.S. Mohanrasu, Ardak Kashkynbayev, R. Rakkiyappan
Journal: Mathematics and Computers in Simulation, Volume 229, Pages 50-77
Publication type: Journal Article
Published: 2024-09-26
DOI: 10.1016/j.matcom.2024.09.023
URL: https://www.sciencedirect.com/science/article/pii/S0378475424003781
Citations: 0

Abstract

Feature selection is a crucial step in preparing and refining data. Retaining only the most informative and discriminative features brings several benefits: faster training, reduced risk of overfitting, improved model generalization, and enhanced interpretability. Ensemble feature selection has been shown to improve the stability and generalization performance of models and is particularly valuable for high-dimensional datasets and complex machine learning tasks, where it contributes to more accurate and robust predictive models. This article presents an ensemble feature selection technique built on a new multi-criteria decision-making (MCDM) model that incorporates both rank aggregation principles and a filter-based algorithm. The proposed MCDM model combines the Combined Compromise Solution (CoCoSo) method with the Archimedean operator in an interval-valued intuitionistic fuzzy environment, addressing the vagueness and imprecision present in datasets. The feature selection model is customizable: users define the number of features to select, and a sigmoidal function with a tuning parameter performs the fuzzification. Entropy weights assigned in the interval-valued intuitionistic fuzzy set (IVIFS) environment give each column a priority. The method is evaluated on real-world datasets, compared with existing approaches, and validated with statistical tests such as the Friedman test and the post-hoc Conover test, which underline the significance of its results relative to current methodologies. From the results obtained, we infer that the structured approach to ensemble feature selection, using a specific case of the Archimedean operator, achieved superior performance across the datasets. This more general methodology enhances the robustness and effectiveness of feature selection by leveraging the strengths of the Archimedean operator, resulting in improved data analysis and model accuracy.
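
The abstract describes the pipeline only at a high level. As a rough illustration of the rank-aggregation idea, the sketch below implements the classic crisp CoCoSo ranking with crisp Shannon-entropy column weights. It is not the interval-valued intuitionistic fuzzy extension, the sigmoidal fuzzification, or the Archimedean operator that the paper proposes. The function names (`entropy_weights`, `cocoso_scores`, `select_top_k`), the balance parameter `lam`, and the score matrix are all hypothetical; rows are assumed to be features, and columns are assumed to be positive "higher is better" scores from different filter methods.

```python
import math


def entropy_weights(matrix):
    """Crisp Shannon-entropy weights, one per column (assumes positive scores)."""
    m, n = len(matrix), len(matrix[0])
    diversity = []
    for j in range(n):
        col = [row[j] for row in matrix]
        total = sum(col)
        probs = [v / total for v in col]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        diversity.append(1.0 - entropy)  # more spread in a column -> more weight
    norm = sum(diversity)
    return [d / norm for d in diversity]


def cocoso_scores(matrix, weights, lam=0.5):
    """Classic crisp CoCoSo score per row; every column is treated as a benefit."""
    m, n = len(matrix), len(matrix[0])
    # Min-max normalise each column to [0, 1].
    norm = [[0.0] * n for _ in range(m)]
    for j in range(n):
        col = [row[j] for row in matrix]
        lo, hi = min(col), max(col)
        for i in range(m):
            norm[i][j] = (matrix[i][j] - lo) / (hi - lo) if hi > lo else 1.0
    # Weighted-sum (S) and weighted-power (P) comparability measures.
    S = [sum(w * r for w, r in zip(weights, row)) for row in norm]
    P = [sum(r ** w for w, r in zip(weights, row)) for row in norm]
    if min(S) == 0 or min(P) == 0:
        raise ValueError("a row is worst on every column; the ratio strategy is undefined")
    total = sum(s + p for s, p in zip(S, P))
    denom_c = lam * max(S) + (1 - lam) * max(P)
    scores = []
    for s, p in zip(S, P):
        ka = (s + p) / total                        # arithmetic-mean strategy
        kb = s / min(S) + p / min(P)                # relative-to-worst strategy
        kc = (lam * s + (1 - lam) * p) / denom_c    # balanced compromise strategy
        scores.append((ka * kb * kc) ** (1 / 3) + (ka + kb + kc) / 3)
    return scores


def select_top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical scores: 4 features (rows) rated by 3 filter methods (columns).
filter_scores = [
    [0.80, 0.75, 0.90],
    [0.60, 0.70, 0.40],
    [0.30, 0.20, 0.50],
    [0.50, 0.10, 0.20],
]
weights = entropy_weights(filter_scores)
print(select_top_k(cocoso_scores(filter_scores, weights), k=2))
```

The user-chosen `k` mirrors the abstract's statement that users define how many features to keep. In the paper's setting the crisp scores would first be fuzzified into interval-valued intuitionistic fuzzy numbers, a step this sketch does not attempt.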


Source journal
Mathematics and Computers in Simulation (Mathematics; Computer Science, Interdisciplinary Applications)
CiteScore: 8.90
Self-citation rate: 4.30%
Articles published: 335
Review time: 54 days
About the journal: The journal aims to provide an international forum for the dissemination of up-to-date information in the fields of mathematics and computers, in particular (but not exclusively) as they apply to the dynamics of systems, their simulation, and scientific computation in general. Published material ranges from short, concise research papers to more general tutorial articles. Mathematics and Computers in Simulation, published monthly, is the official organ of IMACS, the International Association for Mathematics and Computers in Simulation (formerly AICA). This association, founded in 1955 and legally incorporated in 1956, is a member of FIACC (the Five International Associations Coordinating Committee), together with IFAC, IFIP, IFORS and IMEKO. Topics covered by the journal include mathematical tools in:
• The foundations of systems modelling
• Numerical analysis and the development of algorithms for simulation
They also include considerations about computer hardware for simulation and about special software and compilers. The journal also publishes articles concerned with specific applications of modelling and simulation in science and engineering, with relevant applied mathematics, the general philosophy of systems simulation, and their impact on disciplinary and interdisciplinary research. The journal includes a Book Review section and a "News on IMACS" section that contains a calendar of future conferences and events and other information about the Association.