Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing最新文献

筛选
英文 中文
Scalar-Function Causal Discovery for Generating Causal Hypotheses with Observational Wearable Device Data 利用可穿戴设备观测数据生成因果假设的标量函数因果发现
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0016
V. Rogovchenko, Austin Sibu, Yang Ni
{"title":"Scalar-Function Causal Discovery for Generating Causal Hypotheses with Observational Wearable Device Data","authors":"V. Rogovchenko, Austin Sibu, Yang Ni","doi":"10.1142/9789811286421_0016","DOIUrl":"https://doi.org/10.1142/9789811286421_0016","url":null,"abstract":"Digital health technologies such as wearable devices have transformed health data analytics, providing continuous, high-resolution functional data on various health metrics, thereby opening new avenues for innovative research. In this work, we introduce a new approach for generating causal hypotheses for a pair of a continuous functional variable (e.g., physical activities recorded over time) and a binary scalar variable (e.g., mobility condition indicator). Our method goes beyond traditional association-focused approaches and has the potential to reveal the underlying causal mechanism. We theoretically show that the proposed scalar-function causal model is identifiable with observational data alone. Our identifiability theory justifies the use of a simple yet principled algorithm to discern the causal relationship by comparing the likelihood functions of competing causal hypotheses. The robustness and applicability of our method are demonstrated through simulation studies and a real-world application using wearable device data from the National Health and Nutrition Examination Survey.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Quantifying Health Outcome Disparity in Invasive Methicillin-Resistant Staphylococcus aureus Infection using Fairness Algorithms on Real-World Data. 在真实世界数据上使用公平算法量化侵袭性耐甲氧西林金黄色葡萄球菌感染的健康结果差异。
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0032
Inyoung Jun, Sara Ser, Scott A. Cohen, Jie Xu, Robert J. Lucero, Jiang Bian, M. Prosperi
{"title":"Quantifying Health Outcome Disparity in Invasive Methicillin-Resistant Staphylococcus aureus Infection using Fairness Algorithms on Real-World Data.","authors":"Inyoung Jun, Sara Ser, Scott A. Cohen, Jie Xu, Robert J. Lucero, Jiang Bian, M. Prosperi","doi":"10.1142/9789811286421_0032","DOIUrl":"https://doi.org/10.1142/9789811286421_0032","url":null,"abstract":"This study quantifies health outcome disparities in invasive Methicillin-Resistant Staphylococcus aureus (MRSA) infections by leveraging a novel artificial intelligence (AI) fairness algorithm, the Fairness-Aware Causal paThs (FACTS) decomposition, and applying it to real-world electronic health record (EHR) data. We spatiotemporally linked 9 years of EHRs from a large healthcare provider in Florida, USA, with contextual social determinants of health (SDoH). We first created a causal structure graph connecting SDoH with individual clinical measurements before/upon diagnosis of invasive MRSA infection, treatments, side effects, and outcomes; then, we applied FACTS to quantify outcome potential disparities of different causal pathways including SDoH, clinical and demographic variables. We found moderate disparity with respect to demographics and SDoH, and all the top ranked pathways that led to outcome disparities in age, gender, race, and income, included comorbidity. Prior kidney impairment, vancomycin use, and timing were associated with racial disparity, while income, rurality, and available healthcare facilities contributed to gender disparity. From an intervention standpoint, our results highlight the necessity of devising policies that consider both clinical factors and SDoH. In conclusion, this work demonstrates a practical utility of fairness AI methods in public health settings.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
nSEA: n-Node Subnetwork Enumeration Algorithm Identifies Lower Grade Glioma Subtypes with Altered Subnetworks and Distinct Prognostics. nSEA:n节点子网络枚举算法可识别具有改变的子网络和不同预后的低级别胶质瘤亚型。
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0040
Zhihan Zhang, Christiana Wang, Ziyin Zhao, Ziyue Yi, Arda Durmaz, Jennifer S. Yu, G. Bebek
{"title":"nSEA: n-Node Subnetwork Enumeration Algorithm Identifies Lower Grade Glioma Subtypes with Altered Subnetworks and Distinct Prognostics.","authors":"Zhihan Zhang, Christiana Wang, Ziyin Zhao, Ziyue Yi, Arda Durmaz, Jennifer S. Yu, G. Bebek","doi":"10.1142/9789811286421_0040","DOIUrl":"https://doi.org/10.1142/9789811286421_0040","url":null,"abstract":"Advances in molecular characterization have reshaped our understanding of low-grade glioma (LGG) subtypes, emphasizing the need for comprehensive classification beyond histology. Lever-aging this, we present a novel approach, network-based Subnetwork Enumeration, and Analysis (nSEA), to identify distinct LGG patient groups based on dysregulated molecular pathways. Using gene expression profiles from 516 patients and a protein-protein interaction network we generated 25 million sub-networks. Through our unsupervised bottom-up approach, we selected 92 subnetworks that categorized LGG patients into five groups. Notably, a new LGG patient group with a lack of mutations in EGFR, NF1, and PTEN emerged as a previously unidentified patient subgroup with unique clinical features and subnetwork states. Validation of the patient groups on an independent dataset demonstrated the robustness of our approach and revealed consistent survival traits across different patient populations. This study offers a comprehensive molecular classification of LGG, providing insights beyond traditional genetic markers. By integrating network analysis with patient clustering, we unveil a previously overlooked patient subgroup with potential implications for prognosis and treatment strategies. Our approach sheds light on the synergistic nature of driver genes and highlights the biological relevance of the identified subnetworks. With broad implications for glioma research, our findings pave the way for further investigations into the mechanistic underpinnings of LGG subtypes and their clinical relevance.Availability: Source code and supplementary data are available at https://github.com/bebeklab/nSEA.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
SynTwin: A graph-based approach for predicting clinical outcomes using digital twins derived from synthetic patients. SynTwin:一种基于图谱的方法,利用从合成患者中提取的数字双胞胎预测临床结果。
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0008
Jason H. Moore, Xi Li, Jui-Hsuan Chang, Nicholas P. Tatonetti, Dan Theodorescu, Yong Chen, F. Asselbergs, Mythreye Venkatesan, Zhiping Wang
{"title":"SynTwin: A graph-based approach for predicting clinical outcomes using digital twins derived from synthetic patients.","authors":"Jason H. Moore, Xi Li, Jui-Hsuan Chang, Nicholas P. Tatonetti, Dan Theodorescu, Yong Chen, F. Asselbergs, Mythreye Venkatesan, Zhiping Wang","doi":"10.1142/9789811286421_0008","DOIUrl":"https://doi.org/10.1142/9789811286421_0008","url":null,"abstract":"The concept of a digital twin came from the engineering, industrial, and manufacturing domains to create virtual objects or machines that could inform the design and development of real objects. This idea is appealing for precision medicine where digital twins of patients could help inform healthcare decisions. We have developed a methodology for generating and using digital twins for clinical outcome prediction. We introduce a new approach that combines synthetic data and network science to create digital twins (i.e. SynTwin) for precision medicine. First, our approach starts by estimating the distance between all subjects based on their available features. Second, the distances are used to construct a network with subjects as nodes and edges defining distance less than the percolation threshold. Third, communities or cliques of subjects are defined. Fourth, a large population of synthetic patients are generated using a synthetic data generation algorithm that models the correlation structure of the data to generate new patients. Fifth, digital twins are selected from the synthetic patient population that are within a given distance defining a subject community in the network. Finally, we compare and contrast community-based prediction of clinical endpoints using real subjects, digital twins, or both within and outside of the community. Key to this approach are the digital twins defined using patient similarity that represent hypothetical unobserved patients with patterns similar to nearby real patients as defined by network distance and community structure. We apply our SynTwin approach to predicting mortality in a population-based cancer registry (n=87,674) from the Surveillance, Epidemiology, and End Results (SEER) program from the National Cancer Institute (USA). Our results demonstrate that nearest network neighbor prediction of mortality in this study is significantly improved with digital twins (AUROC=0.864, 95% CI=0.857-0.872) over just using real data alone (AUROC=0.791, 95% CI=0.781-0.800). These results suggest a network-based digital twin strategy using synthetic patients may add value to precision medicine efforts.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Cluster Analysis reveals Socioeconomic Disparities among Elective Spine Surgery Patients. 聚类分析揭示了脊柱外科择期手术患者的社会经济差异。
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0028
Alena Orlenko, P. Freda, Attri Ghosh, Hyunjun Choi, Nicholas Matsumoto, T. Bright, Corey T. Walker, Tayo Obafemi-Ajayi, Jason H. Moore
{"title":"Cluster Analysis reveals Socioeconomic Disparities among Elective Spine Surgery Patients.","authors":"Alena Orlenko, P. Freda, Attri Ghosh, Hyunjun Choi, Nicholas Matsumoto, T. Bright, Corey T. Walker, Tayo Obafemi-Ajayi, Jason H. Moore","doi":"10.1142/9789811286421_0028","DOIUrl":"https://doi.org/10.1142/9789811286421_0028","url":null,"abstract":"This work demonstrates the use of cluster analysis in detecting fair and unbiased novel discoveries. Given a sample population of elective spinal fusion patients, we identify two overarching subgroups driven by insurance type. The Medicare group, associated with lower socioeconomic status, exhibited an over-representation of negative risk factors. The findings provide a compelling depiction of the interwoven socioeconomic and racial disparities present within the healthcare system, highlighting their consequential effects on health inequalities. The results are intended to guide design of fair and precise machine learning models based on intentional integration of population stratification.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Conversational Agent for Early Detection of Neurotoxic Effects of Medications through Automated Intensive Observation. 通过自动强化观察及早发现药物神经毒性效应的对话式代理。
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0003
Serguei V. S. Pakhomov, Jacob Solinsky, Martin Michalowski, Veronika Bachanova
{"title":"A Conversational Agent for Early Detection of Neurotoxic Effects of Medications through Automated Intensive Observation.","authors":"Serguei V. S. Pakhomov, Jacob Solinsky, Martin Michalowski, Veronika Bachanova","doi":"10.1142/9789811286421_0003","DOIUrl":"https://doi.org/10.1142/9789811286421_0003","url":null,"abstract":"We present a fully automated AI-based system for intensive monitoring of cognitive symptoms of neurotoxicity that frequently appear as a result of immunotherapy of hematologic malignancies. Early manifestations of these symptoms are evident in the patient's speech in the form of mild aphasia and confusion and can be detected and effectively treated prior to onset of more serious and potentially life-threatening impairment. We have developed the Automated Neural Nursing Assistant (ANNA) system designed to conduct a brief cognitive assessment several times per day over the telephone for 5-14 days following infusion of the immunotherapy medication. ANNA uses a conversational agent based on a large language model to elicit spontaneous speech in a semi-structured dialogue, followed by a series of brief language-based neurocognitive tests. In this paper we share ANNA's design and implementation, results of a pilot functional evaluation study, and discuss technical and logistic challenges facing the introduction of this type of technology in clinical practice. A large-scale clinical evaluation of ANNA will be conducted in an observational study of patients undergoing immunotherapy at the University of Minnesota Masonic Cancer Center starting in the Fall 2023.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Leveraging 3D Echocardiograms to Evaluate AI Model Performance in Predicting Cardiac Function on Out-of-Distribution Data. 利用三维超声心动图评估人工智能模型在分布外数据上预测心功能的性能。
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0004
Grant Duffy, Kai Christensen, David Ouyang
{"title":"Leveraging 3D Echocardiograms to Evaluate AI Model Performance in Predicting Cardiac Function on Out-of-Distribution Data.","authors":"Grant Duffy, Kai Christensen, David Ouyang","doi":"10.1142/9789811286421_0004","DOIUrl":"https://doi.org/10.1142/9789811286421_0004","url":null,"abstract":"Advancements in medical imaging and artificial intelligence (AI) have revolutionized the field of cardiac diagnostics, providing accurate and efficient tools for assessing cardiac function. AI diagnostics claims to improve upon the human-to-human variation that is known to be significant. However, when put in practice, for cardiac ultrasound, AI models are being run on images acquired by human sonographers whose quality and consistency may vary. With more variation than other medical imaging modalities, variation in image acquisition may lead to out-of-distribution (OOD) data and unpredictable performance of the AI tools. Recent advances in ultrasound technology has allowed the acquisition of both 3D as well as 2D data, however 3D has more limited temporal and spatial resolution and is still not routinely acquired. Because the training datasets used when developing AI algorithms are mostly developed using 2D images, it is difficult to determine the impact of human variation on the performance of AI tools in the real world. The objective of this project is to leverage 3D echos to simulate realistic human variation of image acquisition and better understand the OOD performance of a previously validated AI model. In doing so, we develop tools for interpreting 3D echo data and quantifiably recreating common variation in image acquisition between sonographers. We also developed a technique for finding good standard 2D views in 3D echo volumes. We found the performance of the AI model we evaluated to be as expected when the view is good, but variations in acquisition position degraded AI model performance. Performance on far from ideal views was poor, but still better than random, suggesting that there is some information being used that permeates the whole volume, not just a quality view. Additionally, we found that variations in foreshortening didn't result in the same errors that a human would make.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
LARGE LANGUAGE MODELS (LLMS) AND CHATGPT FOR BIOMEDICINE. 用于生物医学的大型语言模型(LLMS)和聊天软件。
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0048
Cecilia Arighi, Steven E. Brenner, Zhiyong Lu
{"title":"LARGE LANGUAGE MODELS (LLMS) AND CHATGPT FOR BIOMEDICINE.","authors":"Cecilia Arighi, Steven E. Brenner, Zhiyong Lu","doi":"10.1142/9789811286421_0048","DOIUrl":"https://doi.org/10.1142/9789811286421_0048","url":null,"abstract":"Large Language Models (LLMs) are a type of artificial intelligence that has been revolutionizing various fields, including biomedicine. They have the capability to process and analyze large amounts of data, understand natural language, and generate new content, making them highly desirable in many biomedical applications and beyond. In this workshop, we aim to introduce the attendees to an in-depth understanding of the rise of LLMs in biomedicine, and how they are being used to drive innovation and improve outcomes in the field, along with associated challenges and pitfalls.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
LA-GEM: imputation of gene expression with incorporation of Local Ancestry LA-GEM:结合当地血统的基因表达估算
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-17 DOI: 10.1142/9789811286421_0027
Mrinal Mishra, Layan Nahlawi, Yizhen Zhong, T. De, Guang Yang, Cristina Alarcon, M. Perera
{"title":"LA-GEM: imputation of gene expression with incorporation of Local Ancestry","authors":"Mrinal Mishra, Layan Nahlawi, Yizhen Zhong, T. De, Guang Yang, Cristina Alarcon, M. Perera","doi":"10.1142/9789811286421_0027","DOIUrl":"https://doi.org/10.1142/9789811286421_0027","url":null,"abstract":"Gene imputation and TWAS have become a staple in the genomics medicine discovery space; helping to identify genes whose regulation effects may contribute to disease susceptibility. However, the cohorts on which these methods are built are overwhelmingly of European Ancestry. This means that the unique regulatory variation that exist in non-European populations, specifically African Ancestry populations, may not be included in the current models. Moreover, African Americans are an admixed population, with a mix of European and African segments within their genome. No gene imputation model thus far has incorporated the effect of local ancestry (LA) on gene expression imputation. As such, we created LA-GEM which was trained and tested on a cohort of 60 African American hepatocyte primary cultures. Uniquely, LA-GEM include local ancestry inference in its prediction of gene expression. We compared the performance of LA-GEM to PrediXcan trained the same dataset (with no inclusion of local ancestry) We were able to reliably predict the expression of 2559 genes (1326 in LA-GEM and 1236 in PrediXcan). Of these, 546 genes were unique to LA-GEM, including the CYP3A5 gene which is critical to drug metabolism. We conducted TWAS analysis on two African American clinical cohorts with pharmacogenomics phenotypic information to identity novel gene associations. In our IWPC warfarin cohort, we identified 17 transcriptome-wide significant hits. No gene reached are prespecified significance level in the clopidogrel cohort. We did see suggestive association with RAS3A to P2RY12 Reactivity Units (PRU), a clinical measure of response to anti-platelet therapy. This method demonstrated the need for the incorporation of LA into study in admixed populations.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Zoish: A Novel Feature Selection Approach Leveraging Shapley Additive Values for Machine Learning Applications in Healthcare Zoish:利用夏普利加法值为医疗保健领域的机器学习应用提供新颖的特征选择方法
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Pub Date : 2023-12-15 DOI: 10.1142/9789811286421_0007
Hossein Javedani Sadaei, Salvatore Loguercio, Mahdi Shafiei Neyestanak, Ali Torkamani, Daria Prilutsky
{"title":"Zoish: A Novel Feature Selection Approach Leveraging Shapley Additive Values for Machine Learning Applications in Healthcare","authors":"Hossein Javedani Sadaei, Salvatore Loguercio, Mahdi Shafiei Neyestanak, Ali Torkamani, Daria Prilutsky","doi":"10.1142/9789811286421_0007","DOIUrl":"https://doi.org/10.1142/9789811286421_0007","url":null,"abstract":"In the intricate landscape of healthcare analytics, effective feature selection is a prerequisite for generating robust predictive models, especially given the common challenges of sample sizes and potential biases. Zoish uniquely addresses these issues by employing Shapley additive values—an idea rooted in cooperative game theory—to enable both transparent and automated feature selection. Unlike existing tools, Zoish is versatile, designed to seamlessly integrate with an array of machine learning libraries including scikit-learn, XGBoost, CatBoost, and imbalanced-learn. The distinct advantage of Zoish lies in its dual algorithmic approach for calculating Shapley values, allowing it to efficiently manage both large and small datasets. This adaptability renders it exceptionally suitable for a wide spectrum of healthcare-related tasks. The tool also places a strong emphasis on interpretability, providing comprehensive visualizations for analyzed features. Its customizable settings offer users fine-grained control over feature selection, thus optimizing for specific predictive objectives. This manuscript elucidates the mathematical framework underpinning Zoish and how it uniquely combines local and global feature selection into a single, streamlined process. To validate Zoish’s efficiency and adaptability, we present case studies in breast cancer prediction and Montreal Cognitive Assessment (MoCA) prediction in Parkinson’s disease, along with evaluations on 300 synthetic datasets. These applications underscore Zoish’s unparalleled performance in diverse healthcare contexts and against its counterparts.","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139176701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信