Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics: latest articles

Identifying patient-specific root causes of disease
E. Strobl, T. Lasko
{"title":"Identifying patient-specific root causes of disease","authors":"E. Strobl, T. Lasko","doi":"10.1145/3535508.3545553","DOIUrl":"https://doi.org/10.1145/3535508.3545553","url":null,"abstract":"Complex diseases are caused by a multitude of factors that may differ between patients. As a result, hypothesis tests comparing all patients to all healthy controls can detect many significant variables with inconsequential effect sizes. A few highly predictive root causes may nevertheless generate disease within each patient. In this paper, we define patient-specific root causes as variables subject to exogenous \"shocks\" which go on to perturb an otherwise healthy system and induce disease. In other words, the variables are associated with the exogenous errors of a structural equation model (SEM), and these errors predict a downstream diagnostic label. We quantify predictivity using sample-specific Shapley values. This derivation allows us to develop a fast algorithm called Root Causal Inference for identifying patient-specific root causes by extracting the error terms of a linear SEM and then computing the Shapley value associated with each error. Experiments highlight considerable improvements in accuracy because the method uncovers root causes that may have large effect sizes at the individual level but clinically insignificant effect sizes at the group level. 
An R implementation is available at github.com/ericstrobl/RCI.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124924315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
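The abstract above describes extracting the error terms of a linear SEM and then computing a Shapley value for each error. The following is a minimal sketch of that idea, not the authors' RCI implementation (which is in R). It assumes the coefficient matrix `B` is already known, uses a plain least-squares score in place of a diagnostic classifier, and relies on the fact that for a linear predictor on independent features the Shapley value of feature i is its coefficient times the feature's deviation from its mean. All names and numbers here are hypothetical.

```python
import numpy as np

def extract_errors(X, B):
    """Recover exogenous errors of a linear SEM x = B x + e, one sample per row of X."""
    identity = np.eye(B.shape[0])
    return X @ (identity - B).T

def linear_shapley(E, coef):
    """Per-sample Shapley values of a linear predictor on independent features:
    phi[n, i] = coef[i] * (E[n, i] - mean of column i)."""
    return coef * (E - E.mean(axis=0))

rng = np.random.default_rng(0)
n, p = 500, 3

# Hypothetical acyclic chain X0 -> X1 -> X2 (strictly lower-triangular B).
B = np.array([[0.0, 0.0, 0.0],
              [0.8, 0.0, 0.0],
              [0.0, 0.5, 0.0]])

# Simulate data: x = B x + e  implies  x = (I - B)^{-1} e.
E_true = rng.laplace(size=(n, p))
X = E_true @ np.linalg.inv(np.eye(p) - B).T

# Step 1: extract the error terms from the observed variables.
E_hat = extract_errors(X, B)

# Step 2: fit a linear score on the errors (assumption: a continuous severity
# score stands in for the diagnostic label to keep the sketch dependency-free).
y = E_true @ np.array([0.0, 2.0, 0.5]) + 0.1 * rng.normal(size=n)
design = np.column_stack([E_hat, np.ones(n)])
weights = np.linalg.lstsq(design, y, rcond=None)[0]
coef, intercept = weights[:-1], weights[-1]
pred = E_hat @ coef + intercept

# Step 3: sample-specific Shapley values; the largest one marks the candidate
# root cause for each patient.
phi = linear_shapley(E_hat, coef)
root_cause = phi.argmax(axis=1)
```

The Shapley values of each row sum to that patient's prediction minus the average prediction, which is the efficiency property that makes them usable as a per-patient attribution.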
Transparent single-cell set classification with kernel mean embeddings
Siyuan Shan, Vishal Baskaran, Haidong Yi, Jolene S Ranek, N. Stanley, Junier B. Oliva
{"title":"Transparent single-cell set classification with kernel mean embeddings","authors":"Siyuan Shan, Vishal Baskaran, Haidong Yi, Jolene S Ranek, N. Stanley, Junier B. Oliva","doi":"10.1145/3535508.3545538","DOIUrl":"https://doi.org/10.1145/3535508.3545538","url":null,"abstract":"Modern single-cell flow and mass cytometry technologies measure the expression of several proteins of the individual cells within a blood or tissue sample. Each profiled biological sample is thus represented by a set of hundreds of thousands of multidimensional cell feature vectors, which incurs a high computational cost to predict each biological sample's associated phenotype with machine learning models. Such a large set cardinality also limits the interpretability of machine learning models due to the difficulty in tracking how each individual cell influences the ultimate prediction. We propose using Kernel Mean Embedding to encode the cellular landscape of each profiled biological sample. Although our foremost goal is to make a more transparent model, we find that our method achieves comparable or better accuracies than the state-of-the-art gating-free methods through a simple linear classifier. As a result, our model contains few parameters but still performs similarly to deep learning models with millions of parameters. In contrast with deep learning approaches, the linearity and sub-selection step of our model make it easy to interpret classification results. 
Analysis further shows that our method admits rich biological interpretability for linking cellular heterogeneity to clinical phenotype.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129096484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
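The abstract above encodes each sample, a variable-size set of cell feature vectors, as a kernel mean embedding that a linear classifier can consume. The sketch below illustrates one common way to compute such an embedding, using random Fourier features to approximate an RBF kernel. Whether the paper uses exactly this construction is an assumption; the function names, `gamma`, and the feature count are hypothetical.

```python
import numpy as np

def make_rff(dim, n_features, gamma, rng):
    """Draw random Fourier feature parameters approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2)."""
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(dim, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return W, b

def kernel_mean_embedding(cells, W, b):
    """Map a (n_cells, dim) set of cells to one fixed-length vector:
    the average of the random Fourier features of its cells."""
    features = np.sqrt(2.0 / W.shape[1]) * np.cos(cells @ W + b)
    return features.mean(axis=0)

rng = np.random.default_rng(0)
W, b = make_rff(dim=5, n_features=256, gamma=0.5, rng=rng)

# Two hypothetical samples with different numbers of cells.
sample_a = rng.normal(size=(1000, 5))
sample_b = rng.normal(loc=0.5, size=(300, 5))

emb_a = kernel_mean_embedding(sample_a, W, b)
emb_b = kernel_mean_embedding(sample_b, W, b)

# The embedding ignores cell order, as a set representation should.
emb_a_shuffled = kernel_mean_embedding(rng.permutation(sample_a), W, b)
```

Because every sample maps to a vector of the same length regardless of its cell count, the embeddings can be stacked into a matrix and passed to any linear classifier.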
Self-explaining neural network with concept-based explanations for ICU mortality prediction
Sayantan Kumar, Sean C. Yu, T. Kannampallil, Zachary B. Abrams, A. Michelson, Philip R. O. Payne
{"title":"Self-explaining neural network with concept-based explanations for ICU mortality prediction","authors":"Sayantan Kumar, Sean C. Yu, T. Kannampallil, Zachary B. Abrams, A. Michelson, Philip R. O. Payne","doi":"10.1145/3535508.3545547","DOIUrl":"https://doi.org/10.1145/3535508.3545547","url":null,"abstract":"Complex deep learning models show high performance in various clinical prediction tasks but their inherent complexity makes it more challenging to explain model predictions for clinicians and healthcare providers. Existing research on explainability of deep learning models in healthcare has two major limitations: using post-hoc explanations and using raw clinical variables as units of explanation, both of which are often difficult for humans to interpret. In this work, we designed a self-explaining deep learning framework using expert-knowledge-driven clinical concepts or intermediate features as units of explanation. The self-explaining nature of our proposed model comes from generating both explanations and predictions within the same architectural framework via joint training. We tested our proposed approach on a publicly available Electronic Health Records (EHR) dataset for predicting patient mortality in the ICU. In order to analyze the performance-interpretability trade-off, we compared our proposed model with a baseline having the same set-up but without the explanation components. 
Experimental results suggest that adding explainability components to a deep learning framework does not impact prediction performance and the explanations generated by the model can provide insights to the clinicians to understand the possible reasons behind patient mortality.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"150 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127269667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
SurvTRACE: transformers for survival analysis with competing events
Zifeng Wang, Jimeng Sun
{"title":"SurvTRACE: transformers for survival analysis with competing events","authors":"Zifeng Wang, Jimeng Sun","doi":"10.1145/3535508.3545521","DOIUrl":"https://doi.org/10.1145/3535508.3545521","url":null,"abstract":"In medicine, survival analysis studies the time duration to events of interest such as mortality. One major challenge is how to deal with multiple competing events (e.g., multiple disease diagnoses). In this work, we propose a transformer-based model, namely SurvTRACE, that makes no assumption about the underlying survival distribution and is capable of handling competing events. We account for the implicit confounders in the observational setting in multi-event scenarios, which cause selection bias as the predicted survival probability is influenced by irrelevant factors. To sufficiently utilize the survival data to train transformers from scratch, multiple auxiliary tasks are designed for multi-task learning. The model hence learns a strong shared representation from all these tasks, which in turn supports better survival analysis. We further demonstrate how to inspect the covariate relevance and importance through interpretable attention mechanisms of SurvTRACE, which shows great potential for enhancing clinical trial design and new treatment development. Experiments on METABRIC, SUPPORT, and SEER data with 470k patients validate the all-around superiority of our method. 
Software is available at https://github.com/RyanWangZf/SurvTRACE.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133081075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
Attention-based aspect reasoning for knowledge base question answering on clinical notes
Ping Wang, Tian Shi, Khushbu Agarwal, Sutanay Choudhury, C. Reddy
{"title":"Attention-based aspect reasoning for knowledge base question answering on clinical notes","authors":"Ping Wang, Tian Shi, Khushbu Agarwal, Sutanay Choudhury, C. Reddy","doi":"10.1145/3535508.3545518","DOIUrl":"https://doi.org/10.1145/3535508.3545518","url":null,"abstract":"Question Answering (QA) in clinical notes has gained a lot of attention in the past few years. Existing machine reading comprehension approaches in the clinical domain can only handle questions about a single block of clinical text and fail to retrieve information about multiple patients and their clinical notes. To handle more complex questions, we aim to create a knowledge base from clinical notes that links different patients and clinical notes, and to perform knowledge base question answering (KBQA). Based on the expert annotations available in the n2c2 dataset, we first created the ClinicalKBQA dataset that includes around 9K QA pairs and covers questions about seven medical topics using more than 300 question templates. Then, we investigated an attention-based aspect reasoning (AAR) method for KBQA and analyzed the impact of different aspects of answers (e.g., entity, type, path, and context) for prediction. The AAR method achieves better performance due to the well-designed encoder and attention mechanism. From our experiments, we find that the type and path aspects enable the model to identify answers satisfying the general conditions and produce lower precision and higher recall. 
On the other hand, the aspects, entity and context, limit the answers by node-specific information and lead to higher precision and lower recall.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"195 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116146725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Antibiotic resistance prediction and biomarker discovery in Neisseria gonorrhoeae
R. Goyal, Rashmi Chowdhary
{"title":"Antibiotic resistance prediction and biomarker discovery in Neisseria gonorrhoeae","authors":"R. Goyal, Rashmi Chowdhary","doi":"10.1145/3535508.3545097","DOIUrl":"https://doi.org/10.1145/3535508.3545097","url":null,"abstract":"Antibiotic resistance is a global problem projected to kill 10 million each year by 2050. The CDC lists Neisseria gonorrhoeae among the most urgent threats in this area as there exists a severe lack of efficient resistance detection techniques and only a handful of resistance-causing mutations have been identified thus far [2]. Currently, testing for antibiotic resistance in N. gonorrhoeae samples depends on culturing a sample in a lab environment. Sensitivity and specificity may reach 85--95% and 100% respectively, but only under optimal conditions and for urogenital specimens [3]. In this study, eight machine learning models - multi-layer perceptron, support vector machine, random forest classifier, K-nearest neighbors, eXtreme gradient boosting, Gaussian Naive Bayes, stochastic gradient descent, and logistic regression - were trained on three datasets containing data regarding resistance against azithromycin, ciprofloxacin and cefixime, which are three drugs of choice against N. gonorrhoeae. Each dataset had 3000+ samples and their corresponding resistance values; each sample consisted of a binary series representing the presence/absence of certain unitigs within that sample's genome. The technique differs from the standard research in this field, which has almost exclusively used whole-genome sequences. Once the models were trained, their accuracies, sensitivities and specificities were compared and analyzed. Maximum balanced accuracies of 97.6%, 95.9% and 100% were achieved on azithromycin, ciprofloxacin and cefixime training data respectively, exhibiting an improvement over previous work [4]. As a point of comparison between various models, performance on azithromycin resistance is represented in Fig 1. 
The balanced accuracy of GNB, at 68%, is too low to register on the scale. Subsequently, Fisher's exact test was used to test for the existence of biomarkers, i.e. unitigs that had a statistically significant correlation with antibiotic resistance. The feature importances of the top models from the first step were used to create a ranking of these genetic signatures, representing a novel method of unitig organization. Out of 584,362 unitigs, 191, 3304 and 1 were identified as statistically significant for azithromycin, ciprofloxacin and cefixime respectively. The majority of these genetic regions encode for proteins - some of which are likely novel discoveries - such as DsbA oxidoreductase, FtsJ methyltransferase, and Pilin glycosyltransferase. These biomarkers present useful leads for the development of point-of-care tests for antibiotic resistance in N. gonorrhoeae, while the ML models can predict resistance through direct genotype sequencing of patient samples [1].","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130337921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
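The abstract above reports balanced accuracy for each classifier and uses Fisher's exact test to flag unitigs associated with resistance. The sketch below shows both calculations on a single 2x2 table using only the standard library. The counts are made up for illustration and are not taken from the study; the two-sided p-value follows the common convention of summing the probabilities of all tables no more likely than the observed one.

```python
from math import comb

def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity, robust to class imbalance."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows: unitig present / absent. Columns: resistant / susceptible.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical confusion-matrix counts for one classifier.
score = balanced_accuracy(tp=90, fn=10, tn=50, fp=50)

# Hypothetical presence/absence counts for one unitig.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

When many unitigs are tested, as in the abstract, the resulting p-values would normally be compared against a threshold adjusted for multiple testing; the abstract does not say which correction was applied.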
A general kernel boosting framework integrating pathways for predictive modeling based on genomic data
Li Zeng, Zhaolong Yu, Yiliang Zhang, Hongyu Zhao
{"title":"A general kernel boosting framework integrating pathways for predictive modeling based on genomic data","authors":"Li Zeng, Zhaolong Yu, Yiliang Zhang, Hongyu Zhao","doi":"10.1145/3535508.3545526","DOIUrl":"https://doi.org/10.1145/3535508.3545526","url":null,"abstract":"In this article, we extend a general framework, Pathway-based Kernel Boosting (PKB), which incorporates clinical information and prior knowledge about pathways for prediction of binary, continuous and survival outcomes. We introduce appropriate loss functions and optimization procedures for different outcome types. Our prediction algorithm incorporates pathway knowledge by constructing kernel function spaces from the pathways and using them as base learners in the boosting procedure. Through extensive simulations and case studies in drug response and cancer survival datasets, we demonstrate that PKB can substantially outperform other competing methods, better identify biological pathways related to drug response and patient survival, and provide novel insights into cancer pathogenesis and treatment response.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127321049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0