A Novel Ensemble Machine Learning Approach for Interpretable Modeling, Feature Extraction and Selection With Applications to Medical and Biomedical Signals and Data
IF 1.5 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING
Authors: Bo Sun, Hua-Liang Wei
Journal: Concurrency and Computation-Practice & Experience, 38(8)
Publication date: 2026-04-15
DOI: 10.1002/cpe.70697
URL: https://onlinelibrary.wiley.com/doi/10.1002/cpe.70697
Citations: 0
Abstract
Feature extraction and selection are crucial in biomedical data analysis to address high dimensionality, reduce computational complexity, and enhance model interpretability. However, traditional methods often focus on individual feature importance, overlooking complex inter-feature relationships, especially when processing and modeling dynamic and time-series data. In this study, we propose a novel framework that integrates Feature Co-occurrence Networks (FCN) with global importance scoring via the PageRank algorithm, which is built on a parametric Nonlinear AutoRegressive with eXogenous inputs (NARX) model structure to better capture temporal dependencies in sequential data. The proposed NARX-FCN-PageRank approach combines the strengths of multiple feature selection strategies while leveraging network analysis to identify stable and representative feature subsets. Extensive evaluations across diverse biomedical datasets, including both static and dynamic scenarios, demonstrate that our method effectively reduces feature dimensionality without compromising predictive performance. Moreover, the network visualizations provide valuable insights into the interdependencies and centrality of selected features, supporting model interpretability and enhancing trustworthiness. The NARX-FCN-PageRank framework thus offers a versatile and interpretable solution for feature selection in biomedical data analysis, with the potential to facilitate more efficient and reliable modeling in clinical and medical research applications.
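The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of scoring features with PageRank on a co-occurrence network, not the authors' NARX-FCN-PageRank code. It assumes that several feature selection runs each return a subset of candidate NARX regressors, that an edge weight counts how often two regressors are selected together, and that a standard weighted PageRank with damping 0.85 is used. The regressor names, the example subsets, and the functions `cooccurrence_weights` and `pagerank` are all hypothetical.

```python
from itertools import combinations


def cooccurrence_weights(subsets):
    """Count how often each pair of features is selected together."""
    weights = {}
    for subset in subsets:
        for a, b in combinations(sorted(set(subset)), 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights


def pagerank(nodes, weights, damping=0.85, tol=1e-10, max_iter=200):
    """Weighted PageRank by power iteration on an undirected graph."""
    # Build a symmetric adjacency map from the pair counts.
    adj = {v: {} for v in nodes}
    for (a, b), w in weights.items():
        adj[a][b] = w
        adj[b][a] = w
    strength = {v: sum(adj[v].values()) for v in nodes}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by isolated nodes is redistributed uniformly.
        dangling = sum(rank[v] for v in nodes if strength[v] == 0)
        new_rank = {}
        for v in nodes:
            incoming = sum(rank[u] * w / strength[u] for u, w in adj[v].items())
            new_rank[v] = (1 - damping) / n + damping * (incoming + dangling / n)
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical output of four feature selection runs over lagged
# NARX regressors (output lags y(t-k), input lags u(t-k)).
subsets = [
    ["y(t-1)", "u(t-1)", "u(t-2)"],
    ["y(t-1)", "u(t-1)"],
    ["y(t-1)", "u(t-1)", "y(t-2)"],
    ["y(t-1)", "u(t-3)"],
]
nodes = sorted({f for s in subsets for f in s})
ranks = pagerank(nodes, cooccurrence_weights(subsets))
ranking = sorted(nodes, key=lambda f: ranks[f], reverse=True)
for feature in ranking:
    print(f"{feature}: {ranks[feature]:.3f}")
```

In this toy setting, regressors that are repeatedly selected together with many other regressors receive higher scores, so keeping only the top of `ranking` gives a reduced feature subset. How the paper builds the network, sets edge weights, or chooses the cut-off may differ.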
About the journal:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.