Active Learning for Multi-way Sensitivity Analysis with Application to Disease Screening Modeling.

Impact factor: 5.9, JCR Q1 (Computer Science)
Mucahit Cevik, Sabrina Angco, Elham Heydarigharaei, Hadi Jahanshahi, Nicholas Prayogo
Journal of Healthcare Informatics Research. Journal Article. Published 2022-09-01.
DOI: 10.1007/s41666-022-00117-y (https://doi.org/10.1007/s41666-022-00117-y)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9309115/pdf/41666_2022_Article_117.pdf
Citation count: 0

Abstract

Sensitivity analysis is an important aspect of model development as it can be used to assess the level of confidence that is associated with the outcomes of a study. In many practical problems, sensitivity analysis involves evaluating a large number of parameter combinations which may require an extensive amount of time and resources. However, such a computational burden can be avoided by identifying smaller subsets of parameter combinations that can be later used to generate the desired outcomes for other parameter combinations. In this study, we investigate machine learning-based approaches for speeding up the sensitivity analysis. Furthermore, we apply feature selection methods to identify the relative importance of quantitative model parameters in terms of their predictive ability on the outcomes. Finally, we highlight the effectiveness of active learning strategies in improving the sensitivity analysis processes by reducing the total number of quantitative model runs required to construct a high-performance prediction model. Our experiments on two datasets obtained from the sensitivity analysis performed for two disease screening modeling studies indicate that ensemble methods such as Random Forests and XGBoost consistently outperform other machine learning algorithms in the prediction task of the associated sensitivity analysis. In addition, we note that active learning can lead to significant speed-ups in sensitivity analysis by enabling the selection of more useful parameter combinations (i.e., instances) to be used for prediction models.
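
The active learning idea summarized in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not state which query strategy was used, so this sketch assumes query-by-committee, and it swaps the Random Forest and XGBoost models for a small bootstrap committee of k-nearest-neighbor regressors so that it runs on the standard library alone. The `simulator` function, the two-parameter grid, and all names and sizes are hypothetical stand-ins for an expensive screening-model run.

```python
import random
import statistics


def simulator(x):
    """Hypothetical stand-in for one expensive quantitative model run."""
    a, b = x
    return a * a + 0.5 * b


def knn_predict(train, x, k=3):
    """Average the outcomes of the k labeled points closest to x."""
    def sq_dist(point):
        return sum((u - v) ** 2 for u, v in zip(point[0], x))
    nearest = sorted(train, key=sq_dist)[:k]
    return statistics.fmean(y for _, y in nearest)


def committee_predictions(train, x, rng, n_members=10):
    """Predict x with kNN models built on bootstrap resamples of train."""
    preds = []
    for _ in range(n_members):
        sample = [rng.choice(train) for _ in train]
        preds.append(knn_predict(sample, x))
    return preds


def active_learning(pool, n_initial=5, n_queries=15, seed=0):
    """Pool-based active learning with an assumed query-by-committee rule."""
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    # Run the simulator on a small random starting set.
    labeled = [(x, simulator(x)) for x in pool[:n_initial]]
    unlabeled = pool[n_initial:]
    for _ in range(n_queries):
        # Score each candidate by how much the committee disagrees on it.
        scores = [
            statistics.pvariance(committee_predictions(labeled, x, rng))
            for x in unlabeled
        ]
        best = max(range(len(unlabeled)), key=scores.__getitem__)
        # Only the selected parameter combination is actually simulated.
        x = unlabeled.pop(best)
        labeled.append((x, simulator(x)))
    return labeled, unlabeled


# A 10 x 10 grid of hypothetical parameter combinations.
grid = [(i / 9, j / 9) for i in range(10) for j in range(10)]
labeled, unlabeled = active_learning(grid)
print(len(labeled), "simulator runs;", len(unlabeled), "outcomes left to predict")
```

In this toy setting only 20 of the 100 parameter combinations are simulated, and the remaining 80 outcomes would be filled in by the fitted predictor. Whether that trade-off is acceptable depends on the prediction error, which the sketch does not measure.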

Source journal
Journal of Healthcare Informatics Research (Computer Science: Computer Science Applications)
CiteScore: 13.60
Self-citation rate: 1.70%
Articles published: 12
About the journal: Journal of Healthcare Informatics Research serves as a publication venue for innovative technical contributions highlighting analytics, systems, and human factors research in healthcare informatics. The journal is concerned with the application of computer science principles, information science principles, information technology, and communication technology to address problems in healthcare and everyday wellness. It highlights the most cutting-edge technical contributions in computing-oriented healthcare informatics. The journal covers three major tracks: (1) analytics: data analytics, knowledge discovery, and predictive modeling; (2) systems: building healthcare informatics systems (e.g., architecture, framework, design, engineering, and application); (3) human factors: understanding users or context, interface design, health behavior, and user studies of healthcare informatics applications.
Topics include but are not limited to:
· healthcare software architecture, framework, design, and engineering
· electronic health records
· medical data mining
· predictive modeling
· medical information retrieval
· medical natural language processing
· healthcare information systems
· smart health and connected health
· social media analytics
· mobile healthcare
· medical signal processing
· human factors in healthcare
· usability studies in healthcare
· user-interface design for medical devices and healthcare software
· health service delivery
· health games
· security and privacy in healthcare
· medical recommender systems
· healthcare workflow management
· disease profiling and personalized treatment
· visualization of medical data
· intelligent medical devices and sensors
· RFID solutions for healthcare
· healthcare decision analytics and support systems
· epidemiological surveillance systems and intervention modeling
· consumer and clinician health information needs, seeking, sharing, and use
· semantic Web, linked data, and ontology
· collaboration technologies for healthcare
· assistive and adaptive ubiquitous computing technologies
· statistics and quality of medical data
· healthcare delivery in developing countries
· health systems modeling and simulation
· computer-aided diagnosis