Mucahit Cevik, Sabrina Angco, Elham Heydarigharaei, Hadi Jahanshahi, Nicholas Prayogo
Active Learning for Multi-way Sensitivity Analysis with Application to Disease Screening Modeling
Journal of Healthcare Informatics Research, published 2022-09-01. DOI: https://doi.org/10.1007/s41666-022-00117-y
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9309115/pdf/41666_2022_Article_117.pdf
Abstract
Sensitivity analysis is an important aspect of model development because it can be used to assess the level of confidence associated with the outcomes of a study. In many practical problems, sensitivity analysis involves evaluating a large number of parameter combinations, which may require extensive time and resources. However, this computational burden can be avoided by identifying smaller subsets of parameter combinations that can later be used to generate the desired outcomes for the other parameter combinations. In this study, we investigate machine learning-based approaches for speeding up sensitivity analysis. Furthermore, we apply feature selection methods to identify the relative importance of quantitative model parameters in terms of their predictive ability on the outcomes. Finally, we highlight the effectiveness of active learning strategies in improving sensitivity analysis by reducing the total number of quantitative model runs required to construct a high-performance prediction model. Our experiments on two datasets, obtained from the sensitivity analyses performed for two disease screening modeling studies, indicate that ensemble methods such as Random Forests and XGBoost consistently outperform other machine learning algorithms in the associated prediction task. In addition, we note that active learning can lead to significant speed-ups in sensitivity analysis by enabling the selection of more useful parameter combinations (i.e., instances) for training prediction models.
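The active learning idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `simulate` is a hypothetical stand-in for one expensive quantitative model run, the two-parameter grid is invented for illustration, and the surrogate is a bootstrap committee of k-nearest-neighbor regressors (chosen so the example needs only the Python standard library) rather than the Random Forests or XGBoost models the study evaluates. The query rule shown, picking the combination on which the committee disagrees most, is one common active learning strategy and is assumed here.

```python
import random
import statistics


def simulate(params):
    """Hypothetical stand-in for one expensive quantitative model run."""
    x, y = params
    return x * x + 0.5 * y


def knn_predict(train, query, k=3):
    """Mean outcome of the k training points closest to `query`."""
    def sq_dist(item):
        (x, y), _ = item
        return (x - query[0]) ** 2 + (y - query[1]) ** 2

    nearest = sorted(train, key=sq_dist)[:k]
    return statistics.fmean(outcome for _, outcome in nearest)


def active_learning(pool, n_initial=5, n_queries=15, n_members=10, seed=0):
    """Label `n_initial` random combinations, then query by committee disagreement."""
    rng = random.Random(seed)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    labeled = [(p, simulate(p)) for p in unlabeled[:n_initial]]
    unlabeled = unlabeled[n_initial:]

    for _ in range(n_queries):
        # Committee: one bootstrap resample of the labeled set per member.
        committee = [[rng.choice(labeled) for _ in labeled] for _ in range(n_members)]

        def disagreement(candidate):
            preds = [knn_predict(member, candidate) for member in committee]
            return statistics.pvariance(preds)

        # Run the model only for the combination the committee is least sure about.
        pick = max(unlabeled, key=disagreement)
        unlabeled.remove(pick)
        labeled.append((pick, simulate(pick)))

    return labeled, unlabeled


# A 10 x 10 grid of two hypothetical parameters: 100 combinations in total.
pool = [(i / 9, j / 9) for i in range(10) for j in range(10)]
labeled, unlabeled = active_learning(pool)

# Outcomes for the remaining combinations are predicted instead of simulated.
predicted = {p: knn_predict(labeled, p) for p in unlabeled}
print(f"model runs: {len(labeled)}, predicted: {len(predicted)}")
```

In this sketch only the queried combinations incur a model run; every other combination in the pool receives a surrogate prediction, which is the source of the speed-up the abstract describes.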
About the journal:
Journal of Healthcare Informatics Research serves as a publication venue for innovative technical contributions highlighting analytics, systems, and human factors research in healthcare informatics. Journal of Healthcare Informatics Research is concerned with the application of computer science principles, information science principles, information technology, and communication technology to address problems in healthcare and everyday wellness. Journal of Healthcare Informatics Research highlights the most cutting-edge technical contributions in computing-oriented healthcare informatics. The journal covers three major tracks: (1) analytics, which focuses on data analytics, knowledge discovery, and predictive modeling; (2) systems, which focuses on building healthcare informatics systems (e.g., architecture, framework, design, engineering, and application); (3) human factors, which focuses on understanding users or context, interface design, health behavior, and user studies of healthcare informatics applications.
Topics include but are not limited to:
· healthcare software architecture, framework, design, and engineering
· electronic health records
· medical data mining
· predictive modeling
· medical information retrieval
· medical natural language processing
· healthcare information systems
· smart health and connected health
· social media analytics
· mobile healthcare
· medical signal processing
· human factors in healthcare
· usability studies in healthcare
· user-interface design for medical devices and healthcare software
· health service delivery
· health games
· security and privacy in healthcare
· medical recommender system
· healthcare workflow management
· disease profiling and personalized treatment
· visualization of medical data
· intelligent medical devices and sensors
· RFID solutions for healthcare
· healthcare decision analytics and support systems
· epidemiological surveillance systems and intervention modeling
· consumer and clinician health information needs, seeking, sharing, and use
· semantic Web, linked data, and ontology
· collaboration technologies for healthcare
· assistive and adaptive ubiquitous computing technologies
· statistics and quality of medical data
· healthcare delivery in developing countries
· health systems modeling and simulation
· computer-aided diagnosis