Active Learning for Multi-way Sensitivity Analysis with Application to Disease Screening Modeling.

IF 5.9 Q1 Computer Science

Journal of Healthcare Informatics Research Pub Date : 2022-09-01 DOI:10.1007/s41666-022-00117-y

Mucahit Cevik, Sabrina Angco, Elham Heydarigharaei, Hadi Jahanshahi, Nicholas Prayogo

{"title":"Active Learning for Multi-way Sensitivity Analysis with Application to Disease Screening Modeling.","authors":"Mucahit Cevik, Sabrina Angco, Elham Heydarigharaei, Hadi Jahanshahi, Nicholas Prayogo","doi":"10.1007/s41666-022-00117-y","DOIUrl":null,"url":null,"abstract":"<p><p>Sensitivity analysis is an important aspect of model development as it can be used to assess the level of confidence that is associated with the outcomes of a study. In many practical problems, sensitivity analysis involves evaluating a large number of parameter combinations which may require an extensive amount of time and resources. However, such a computational burden can be avoided by identifying smaller subsets of parameter combinations that can be later used to generate the desired outcomes for other parameter combinations. In this study, we investigate machine learning-based approaches for speeding up the sensitivity analysis. Furthermore, we apply feature selection methods to identify the relative importance of quantitative model parameters in terms of their predictive ability on the outcomes. Finally, we highlight the effectiveness of active learning strategies in improving the sensitivity analysis processes by reducing the total number of quantitative model runs required to construct a high-performance prediction model. Our experiments on two datasets obtained from the sensitivity analysis performed for two disease screening modeling studies indicate that ensemble methods such as Random Forests and XGBoost consistently outperform other machine learning algorithms in the prediction task of the associated sensitivity analysis. In addition, we note that active learning can lead to significant speed-ups in sensitivity analysis by enabling the selection of more useful parameter combinations (i.e., instances) to be used for prediction models.</p>","PeriodicalId":36444,"journal":{"name":"Journal of Healthcare Informatics Research","volume":"6 3","pages":"317-343"},"PeriodicalIF":5.9000,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9309115/pdf/41666_2022_Article_117.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Healthcare Informatics Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s41666-022-00117-y","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Computer Science","Score":null,"Total":0}

引用次数: 0

Abstract

Sensitivity analysis is an important aspect of model development as it can be used to assess the level of confidence that is associated with the outcomes of a study. In many practical problems, sensitivity analysis involves evaluating a large number of parameter combinations which may require an extensive amount of time and resources. However, such a computational burden can be avoided by identifying smaller subsets of parameter combinations that can be later used to generate the desired outcomes for other parameter combinations. In this study, we investigate machine learning-based approaches for speeding up the sensitivity analysis. Furthermore, we apply feature selection methods to identify the relative importance of quantitative model parameters in terms of their predictive ability on the outcomes. Finally, we highlight the effectiveness of active learning strategies in improving the sensitivity analysis processes by reducing the total number of quantitative model runs required to construct a high-performance prediction model. Our experiments on two datasets obtained from the sensitivity analysis performed for two disease screening modeling studies indicate that ensemble methods such as Random Forests and XGBoost consistently outperform other machine learning algorithms in the prediction task of the associated sensitivity analysis. In addition, we note that active learning can lead to significant speed-ups in sensitivity analysis by enabling the selection of more useful parameter combinations (i.e., instances) to be used for prediction models.

查看原文本刊更多论文

主动学习多途径敏感性分析在疾病筛选建模中的应用。

敏感性分析是模型开发的一个重要方面，因为它可以用来评估与研究结果相关的置信度水平。在许多实际问题中，敏感性分析涉及评估大量的参数组合，这可能需要大量的时间和资源。然而，这样的计算负担可以通过识别较小的参数组合子集来避免，这些子集可以稍后用于为其他参数组合生成所需的结果。在这项研究中，我们研究了基于机器学习的方法来加速灵敏度分析。此外，我们应用特征选择方法来识别定量模型参数对结果的预测能力的相对重要性。最后，我们强调了主动学习策略通过减少构建高性能预测模型所需的定量模型运行总数来改善敏感性分析过程的有效性。我们在两项疾病筛选建模研究的敏感性分析中获得的两个数据集上进行的实验表明，随机森林和XGBoost等集成方法在相关敏感性分析的预测任务中始终优于其他机器学习算法。此外，我们注意到主动学习可以通过选择更有用的参数组合(即实例)用于预测模型，从而导致灵敏度分析的显着加速。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Journal of Healthcare Informatics Research Computer Science-Computer Science Applications

CiteScore

13.60

自引率

1.70%

发文量

期刊介绍： Journal of Healthcare Informatics Research serves as a publication venue for the innovative technical contributions highlighting analytics, systems, and human factors research in healthcare informatics.Journal of Healthcare Informatics Research is concerned with the application of computer science principles, information science principles, information technology, and communication technology to address problems in healthcare, and everyday wellness. Journal of Healthcare Informatics Research highlights the most cutting-edge technical contributions in computing-oriented healthcare informatics. The journal covers three major tracks: (1) analytics—focuses on data analytics, knowledge discovery, predictive modeling; (2) systems—focuses on building healthcare informatics systems (e.g., architecture, framework, design, engineering, and application); (3) human factors—focuses on understanding users or context, interface design, health behavior, and user studies of healthcare informatics applications. Topics include but are not limited to: · healthcare software architecture, framework, design, and engineering;· electronic health records· medical data mining· predictive modeling· medical information retrieval· medical natural language processing· healthcare information systems· smart health and connected health· social media analytics· mobile healthcare· medical signal processing· human factors in healthcare· usability studies in healthcare· user-interface design for medical devices and healthcare software· health service delivery· health games· security and privacy in healthcare· medical recommender system· healthcare workflow management· disease profiling and personalized treatment· visualization of medical data· intelligent medical devices and sensors· RFID solutions for healthcare· healthcare decision analytics and support systems· epidemiological surveillance systems and intervention modeling· consumer and clinician health information needs, seeking, sharing, and use· semantic Web, linked data, and ontology· collaboration technologies for healthcare· assistive and adaptive ubiquitous computing technologies· statistics and quality of medical data· healthcare delivery in developing countries· health systems modeling and simulation· computer-aided diagnosis