Expert Systems, 42(4) | Pub Date: 2025-02-23 | DOI: 10.1111/exsy.70022
Wei Dai, Wanqiu Kong, Tao Shang, Jianhong Feng, Jiaji Wu, Tan Qu

Guideline for Novel Fine-Grained Sentiment Annotation and Data Curation: A Case Study

Abstract: Driven by the rise of the internet, recent years have witnessed the gradual manifestation of the commercial value of online reviews. In the movie industry, sentiment analysis serves as the foundation for mining user preferences among diverse and multi-layered audiences, providing insight into the market value of movies. As a representative task, aspect-based sentiment analysis (ABSA) aims to analyse and extract fine-grained sentiment elements and their relations with respect to the aspects under discussion. Relevant studies, particularly in deep learning research, face challenges due to insufficient annotated data. To alleviate this problem, we propose a guideline for fine-grained sentiment annotation that defines aspect categories, describes the method for annotating aspect sentiment triplets, whether simple or complex, and designs a scheme to represent hierarchical labels. Based on this, an ABSA dataset tailored to the movie domain is curated by annotating 1100 Chinese short reviews acquired from Douban. The applicability of both the annotation guideline and the curated data is evaluated through inter-annotator consistency and self-consistency checks, as well as a domain adaptation assessment on e-commerce and healthcare cases. The predictive performance of machine learning models on this dataset sheds light on possible applications of more fine-grained sentiment analysis in the movie domain, for example, identifying the aspects through which to stimulate viewership and influence public opinion, thereby providing substantial support for a movie's box office performance. Finally, we extended our fine-grained sentiment annotation guideline to the e-commerce and healthcare domains. Through empirical experimentation, we demonstrated the universality of this guideline across diverse domains.
Expert Systems, 42(4) | Pub Date: 2025-02-23 | DOI: 10.1111/exsy.70015
Yifei Wang, Ze Han, Xiangzheng Deng

Water-Energy-Carbon Nexus Within the Urban Eco-Transformation of the Beijing-Tianjin-Hebei Region

Abstract: Driven by rapid urbanisation, the Beijing-Tianjin-Hebei (BTH) region has experienced a dramatic increase in resource consumption and environmental strain. Investigating the relationships among water, energy and carbon can help balance efficient resource utilisation, environmental conservation and economic growth, while promoting sustainable urban development. This study develops an analytical framework for the water-energy-carbon nexus within the urban eco-transformation. Specifically, it first illustrates a conceptual model for the interaction among water use, energy consumption and carbon emissions, and then examines water-energy-carbon dynamics in the urbanisation and ecological transition of the BTH region. Furthermore, an empirical analysis was conducted with Beijing as the case study area to explore the water-energy-carbon nexus and its decoupling from socio-economic development. Results show that rapid urbanisation has significantly increased population and economic scale, exerting substantial pressure on water resources, energy supply and the environment. The study reveals a significant positive interaction between water consumption, electricity consumption and carbon emissions in Beijing, with an inverted U-shaped parabolic relationship between GDP and population. Beijing is expected to decouple economic growth from carbon emissions after 2030 and from water consumption after 2037, reducing resource consumption and carbon emissions while sustaining economic growth. To achieve sustainable development, it is recommended that the Beijing-Tianjin-Hebei region accelerate industrial transformation, enhance water resource efficiency, develop clean energy and improve power system efficiency. This paper provides a theoretical foundation and practical insights for decision-making and facilitates ecological urbanisation in the Beijing-Tianjin-Hebei region.
Expert Systems, 42(3) | Pub Date: 2025-02-16 | DOI: 10.1111/exsy.70021
Enrique Bermejo, Antonio David Villegas, Javier Irurita, Sergio Damas, Oscar Cordón

Interpretable Machine Learning for Age-at-Death Estimation From the Pubic Symphysis

Abstract: Age-at-death estimation is an arduous task in human identification, based on characteristics such as appearance, morphology or ossification patterns in skeletal remains. This process is performed manually, although several recent studies attempt to automate it. One of the most recent approaches involves interpretable machine learning methods, which yield simple and easily understandable models. The ultimate goal is not to fully automate the task but to obtain an accurate model that supports forensic anthropologists in the age-at-death estimation process. We propose a semi-automatic method for age-at-death estimation based on nine pubic symphysis traits identified from Todd's pioneering method. Genetic programming is used to learn simple mathematical expressions through a symbolic regression process, while also performing feature selection. Our method follows a component-scoring approach in which the values of the different traits are evaluated by the expert and aggregated by the corresponding mathematical expression to directly estimate the numeric age-at-death value. Oversampling methods are considered to deal with the strongly imbalanced nature of the problem. State-of-the-art performance is achieved thanks to an interpretable model structure that allows us both to validate existing knowledge and to extract new insights in the discipline.
Expert Systems, 42(3) | Pub Date: 2025-02-13 | DOI: 10.1111/exsy.70009
Kerenalli Sudarshana, Yendapalli Vamsidhar

UAM-Net: Robust Deepfake Detection Through Hybrid Attention Into Scalable Convolutional Network
Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/exsy.70009

Abstract: Recent advancements in computer vision have made the detection of data manipulation a significantly more challenging task. Deepfakes are advanced manipulation methods for generating highly convincing synthetic media in which an individual's visuals are digitally forged. Safeguarding the authenticity and integrity of digital content against such forgeries and developing robust detection methods is therefore essential. Identifying manipulated regions and channels within deepfake images is especially critical in countering these forgeries, and introducing attention features into the classification pipeline enhances the detection of the subtle manipulations typical of deepfake content. This study presents a novel feature selection approach, a Unified Attention Mechanism integrated into convolutional networks, termed 'UAM-Net'. The UAM-Net framework concurrently integrates spatial and channel attention features into data-driven scalable convolutional features. UAM-Net was trained and evaluated on the DeepFake Detection Challenge Preview (DFDC-P) dataset and then cross-validated on a combined FaceForensics++ and CelebA-DF dataset. UAM-Net achieved outstanding results, including an accuracy of 98.07%, precision of 97.91%, recall of 98.23%, F1 score of 98.07% and an AUC-ROC of 99.82%. The model maintained strong performance on the combined dataset, achieving 89.7% accuracy, 85.4% precision, 95.8% recall, 90.3% F1 score and 96.8% AUC-ROC. UAM-Net also demonstrated robustness to degraded input quality, with 96.98% accuracy and 97% AUC-ROC on the spatially compressed DFDC-P dataset. The model is thus expected to adapt to real-world conditions, as evidenced by a 97% AUC-ROC on randomly blurred datasets.
{"title":"Generative AI for Finance: Applications, Case Studies and Challenges","authors":"Siva Sai, Keya Arunakar, Vinay Chamola, Amir Hussain, Pranav Bisht, Sanjeev Kumar","doi":"10.1111/exsy.70018","DOIUrl":"https://doi.org/10.1111/exsy.70018","url":null,"abstract":"<p>Generative AI (GAI), which has become increasingly popular nowadays, can be considered a brilliant computational machine that can not only assist with simple searching and organising tasks but also possesses the capability to propose new ideas, make decisions on its own and derive better conclusions from complex inputs. Finance comprises various difficult and time-consuming tasks that require significant human effort and are highly prone to errors, such as creating and managing financial documents and reports. Hence, incorporating GAI to simplify processes and make them hassle-free will be consequential. Integrating GAI with finance can open new doors of possibility. With its capacity to enhance decision-making and provide more effective personalised insights, it has the power to optimise financial procedures. In this paper, we address the research gap of the lack of a detailed study exploring the possibilities and advancements of the integration of GAI with finance. We discuss applications that include providing financial consultations to customers, making predictions about the stock market, identifying and addressing fraudulent activities, evaluating risks, and organising unstructured data. We explore real-world examples of GAI, including Finance generative pre-trained transformer (GPT), Bloomberg GPT, and so forth. We look closer at how finance professionals work with AI-integrated systems and tools and how this affects the overall process. We address the challenges presented by comprehensibility, bias, resource demands, and security issues while at the same time emphasising solutions such as GPTs specialised in financial contexts. To the best of our knowledge, this is the first comprehensive paper dealing with GAI for finance.</p>","PeriodicalId":51053,"journal":{"name":"Expert Systems","volume":"42 3","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/exsy.70018","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143404726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Expert Systems, 42(3) | Pub Date: 2025-02-13 | DOI: 10.1111/exsy.70005
Rula A. Hamid, Idrees A. Zahid, A. S. Albahri, O. S. Albahri, A. H. Alamoodi, Laith Alzubaidi, Iman Mohamad Sharaf, Shahad Sabbar Joudar, YuanTong Gu, Z. T. Al-qaysi

Fuzzy Decision-Making Framework for Evaluating Hybrid Detection Models of Trauma Patients

Abstract: This study introduces a new multi-criteria decision-making (MCDM) framework to evaluate trauma injury detection models in intensive care units (ICUs). The research addresses the challenges associated with diverse machine learning (ML) models, inconsistencies, conflicting priorities and the importance of metrics. The developed methodology consists of three phases: dataset identification and pre-processing, hybrid model development, and an evaluation/benchmarking framework. Through meticulous pre-processing, the dataset is tailored to focus on adult trauma patients. Forty hybrid models were developed by combining eight ML algorithms with four filter-based feature-selection methods and principal component analysis (PCA) as a dimensionality reduction method, and these models were evaluated using seven metrics. The weight coefficients for these metrics are determined using the 2-tuple Linguistic Fermatean Fuzzy-Weighted Zero-Inconsistency (2TLF-FWZIC) method. The Vlsekriterijumska Optimizacija I Kompromisno Resenje (VIKOR) approach is applied to rank the developed models. According to 2TLF-FWZIC, classification accuracy (CA) and precision obtained the highest importance weights of 0.2439 and 0.1805, respectively, while F1, training time and test time obtained the lowest weights of 0.1055, 0.0886 and 0.1111, respectively. The benchmarking results revealed the following top-performing models: the Gini index with logistic regression (GI-LR), the Gini index with a decision tree (GI_DT), and information gain with a decision tree (IG_DT), with VIKOR Q scores of 0.016435, 0.023804 and 0.042077, respectively. The proposed MCDM framework is assessed and examined using systematic ranking, sensitivity analysis, validation of the best-selected model on two unseen trauma datasets, and model explainability using the SHapley Additive exPlanations (SHAP) method. We benchmarked the proposed methodology against three other benchmark studies and achieved a score of 100% across six key areas. The proposed methodology provides several insights into the empirical synthesis of this study and contributes to advancing medical informatics by enhancing the understanding and selection of trauma injury detection models for ICUs.
Expert Systems, 42(3) | Pub Date: 2025-02-11 | DOI: 10.1111/exsy.70012
Samuel Suárez-Marcote, Laura Morán-Fernández, Verónica Bolón-Canedo

Optimising Resource Use Through Low-Precision Feature Selection: A Performance Analysis of Logarithmic Division and Stochastic Rounding

Abstract: The growth in the number of wearable devices has increased the amount of data produced daily. Simultaneously, the limitations of such devices have led to a growing interest in implementing machine learning algorithms with low-precision computation. We propose green and efficient modifications of state-of-the-art feature selection methods based on information theory and fixed-point representation. We tested two potential improvements: stochastic rounding to prevent information loss, and logarithmic division to improve computational and energy efficiency. Experiments with several datasets showed results comparable to baseline methods, with minimal information loss in both the feature selection and the subsequent classification steps. Our low-precision approach proved viable even for complex datasets such as microarrays, making it suitable for energy-efficient Internet-of-Things (IoT) devices. While further investigation into stochastic rounding did not yield significant improvements, the use of logarithmic division for probability approximation showed promising results without compromising classification performance. Our findings offer valuable insights into resource-efficient feature selection that contribute to IoT device performance and sustainability.
Expert Systems, 42(3) | Pub Date: 2025-02-09 | DOI: 10.1111/exsy.13840
Jelke Wibbeke, Sebastian Rohjans, Andreas Rauh

Quantification of Data Imbalance
Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/exsy.13840

Abstract: In this article, we propose a novel approach to quantify the imbalance in data, addressing a significant gap in the field of regression analysis. Real-world datasets often exhibit an inherent imbalance in their data distribution, which adversely affects learning algorithms such as those used in neural networks. This results in less accurate learning of rare occurrences and a model bias towards more frequent cases, posing challenges in scenarios where rare events are crucial, like energy load prediction. While many solutions exist for classification problems with imbalanced data, regression problems lack adequate research. To address this, we introduce a method to quantify data imbalance by defining it as the disparity between the probability distribution of the data and a relevance-associated distribution. Our approach includes various metrics that can handle multivariate data, allowing for the identification of imbalanced samples and the abstract quantification of imbalance through the mean imbalance ratio. This method facilitates the comparison of regression datasets based on their imbalance, providing insights into dataset quality and evaluating data resampling techniques. We validate our approach using synthetic data and compare it to established metrics such as the Kullback–Leibler divergence and the Wasserstein metric. Furthermore, analysis of real datasets shows a moderate correlation between sample rarity and the approximation error of neural networks, extreme gradient boosting trees and random forests, indicating that underrepresented samples are linked to higher approximation errors.
Expert Systems, 42(3) | Pub Date: 2025-02-09 | DOI: 10.1111/exsy.70014
Adeel Munawar, Mongkut Piantanakulchai

Machine Learning-Driven Passenger Demand Forecasting for Autonomous Taxi Transportation Systems in Smart Cities

Abstract: Autonomous Taxis (ATs) have seen remarkable global proliferation in recent years owing to the widespread adoption of, and advancements in, Artificial Intelligence (AI) across various domains. ATs play a crucial role in Intelligent Transportation Systems (ITS) in smart cities. However, the effectiveness of ITS relies heavily on accurately forecasting passenger demand for ATs, which poses a significant challenge. Precise prediction of passenger demand is essential for minimising waiting times and unnecessary cruising of ATs in metropolitan areas, which helps conserve energy. To address this issue, this study proposes an adaptive Bayesian Regularisation Backpropagation Neural Network (BRBNN) augmented with a machine learning (ML) model to predict passenger demand in different regions of metropolitan cities, specifically for ATs. The study conducted extensive simulations using a real-world dataset collected from 4781 taxis in Bangkok, Thailand. Using MATLAB R2022b, the proposed model was compared with various state-of-the-art methods and existing research. The results indicate that the proposed model outperforms existing methods on performance metrics such as Root Mean Square Error (RMSE) and R-squared (R²) for passenger demand forecasting. These findings validate the effectiveness of the prediction model and its ability to accurately forecast passenger demand for ATs, thereby contributing to the advancement of efficient transportation systems in smart cities.
Expert Systems, 42(3) | Pub Date: 2025-02-09 | DOI: 10.1111/exsy.70002
Joaquim Arlandis, Rafael Llobet, J. Ramón Navarro Cerdán, Laura Arnal, François Signol, Juan-Carlos Perez-Cortes

Feature Identification Using Hypotheses of Relevance and a 2D-Cascade of SEQENS Ensembles

Abstract: SEQENS is an ensemble method aimed at feature identification that has demonstrated strong performance in identifying relevant genes in high-dimensional spaces across different synthetic tasks. In this paper, we first introduce the differences between the concepts of feature importance, feature selection (FS) and feature identification. Following this, we present a framework based on SEQENS covering the following contributions: (1) computing the hypergeometric p-value of the features in a SEQENS output ranking so that a threshold can be established between relevant and non-relevant features; (2) extending SEQENS by introducing the use of preselected features as hypotheses of relevance in the sequential FS, which may help to attract other features that exhibit weak correlation with the target on their own but gain relevance when combined with the preselected ones; and (3) designing an automated process based on a 2D-cascade of SEQENS ensembles to obtain a purged feature set (PFS), that is, one containing as many relevant features, and as few non-relevant ones, as possible. The presented framework, named pc–SEQENS, integrates the former techniques so that the PFS is used as a hypothesis of relevance in a SEQENS ensemble. Performance is analysed on a gene expression identification task using the E-MTAB-3732 public database and synthetic targets. pc–SEQENS is compared with other state-of-the-art methods, including SEQENS itself, to check the effect of using hypotheses of relevance. On average, the proposed framework better identifies the relevant genes, especially at unfavourable sample-to-dimension ratios, and exhibits stronger stability.