KDD : proceedings. International Conference on Knowledge Discovery & Data Mining: Latest Articles

SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing.
KDD : proceedings. International Conference on Knowledge Discovery & Data Mining Pub Date : 2024-08-01 Epub Date: 2024-08-24 DOI: 10.1145/3637528.3671586
Changchang Yin, Pin-Yu Chen, Bingsheng Yao, Dakuo Wang, Jeffrey Caterino, Ping Zhang
{"title":"SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing.","authors":"Changchang Yin, Pin-Yu Chen, Bingsheng Yao, Dakuo Wang, Jeffrey Caterino, Ping Zhang","doi":"10.1145/3637528.3671586","DOIUrl":"https://doi.org/10.1145/3637528.3671586","url":null,"abstract":"<p><p>Sepsis is the leading cause of in-hospital mortality in the USA. Early sepsis onset prediction and diagnosis could significantly improve the survival of sepsis patients. Existing predictive models are usually trained on high-quality data with few missing information, while missing values widely exist in real-world clinical scenarios (especially in the first hours of admissions to the hospital), which causes a significant decrease in accuracy and an increase in uncertainty for the predictive models. The common method to handle missing values is imputation, which replaces the unavailable variables with estimates from the observed data. The uncertainty of imputation results can be propagated to the sepsis prediction outputs, which have not been studied in existing works on either sepsis prediction or uncertainty quantification. In this study, we first define such propagated uncertainty as the variance of prediction output and then introduce uncertainty propagation methods to quantify the propagated uncertainty. Moreover, for the potential high-risk patients with low confidence due to limited observations, we propose a robust active sensing algorithm to increase confidence by actively recommending clinicians to observe the most informative variables. We validate the proposed models in both publicly available data (i.e., MIMIC-III and AmsterdamUMCdb) and proprietary data in The Ohio State University Wexner Medical Center (OSUWMC). The experimental results show that the propagated uncertainty is dominant at the beginning of admissions to hospitals and the proposed algorithm outperforms state-of-the-art active sensing methods. 
Finally, we implement a SepsisLab system for early sepsis prediction and active sensing based on our pre-trained models. Clinicians and potential sepsis patients can benefit from the system in early prediction and diagnosis of sepsis.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. International Conference on Knowledge Discovery & Data Mining","volume":"2024 ","pages":"6158-6168"},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11470769/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
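The SepsisLab abstract defines propagated uncertainty as the variance of the prediction output when missing inputs are imputed, and describes recommending the most informative variables to observe. The following is a minimal sketch of that idea only, not the authors' implementation. The toy logistic model, the Gaussian imputation distributions, the Monte Carlo estimator, and the greedy selection rule (approximating an observation by fixing the variable at its imputed mean) are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random
import statistics


def predict(x, w, b):
    """Toy logistic risk model (hypothetical stand-in for a trained predictor)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def propagated_variance(observed, imputed, w, b, n_samples=2000, seed=0):
    """Monte Carlo estimate of the variance of the prediction output.

    observed: dict mapping feature index -> observed value
    imputed:  dict mapping feature index -> (mean, std) of an assumed
              Gaussian imputation distribution for a missing feature
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        x = [0.0] * len(w)
        for i, value in observed.items():
            x[i] = value
        for i, (mu, sd) in imputed.items():
            # Draw one plausible value for each missing feature.
            x[i] = rng.gauss(mu, sd)
        outputs.append(predict(x, w, b))
    return statistics.pvariance(outputs)


def recommend_variable(observed, imputed, w, b):
    """Greedy choice (assumption): pick the missing variable whose observation,
    approximated by fixing it at its imputed mean, leaves the least variance."""
    best, best_var = None, float("inf")
    for i, (mu, _) in imputed.items():
        trial_observed = dict(observed)
        trial_observed[i] = mu
        remaining = {j: p for j, p in imputed.items() if j != i}
        var = propagated_variance(trial_observed, remaining, w, b)
        if var < best_var:
            best, best_var = i, var
    return best, best_var
```

Calling `recommend_variable({0: 1.0}, {1: (0.0, 1.0), 2: (0.0, 1.0)}, [0.5, 2.0, 0.1], 0.0)` would return the index of the missing feature whose uncertainty contributes most to the output variance, together with the variance that remains once it is fixed.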
TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data.
KDD : proceedings. International Conference on Knowledge Discovery & Data Mining Pub Date : 2024-08-01 Epub Date: 2024-08-24 DOI: 10.1145/3637528.3671594
Ziyang Zhang, Hejie Cui, Ran Xu, Yuzhang Xie, Joyce C Ho, Carl Yang
{"title":"TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data.","authors":"Ziyang Zhang, Hejie Cui, Ran Xu, Yuzhang Xie, Joyce C Ho, Carl Yang","doi":"10.1145/3637528.3671594","DOIUrl":"10.1145/3637528.3671594","url":null,"abstract":"<p><p>The growing availability of well-organized Electronic Health Records (EHR) data has enabled the development of various machine learning models towards disease risk prediction. However, existing risk prediction methods overlook the heterogeneity of complex diseases, failing to model the potential disease subtypes regarding their corresponding patient visits and clinical concept subgroups. In this work, we introduce <b>TACCO</b>, a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data. Specifically, we develop a novel self-supervised co-clustering framework that can be guided by the risk prediction task of specific diseases. Furthermore, we enhance the hypergraph model of EHR data with textual embeddings and enforce the alignment between the clusters of clinical concepts and patient visits through a contrastive objective. Comprehensive experiments conducted on the public MIMIC-III dataset and Emory internal CRADLE dataset over the downstream clinical tasks of phenotype classification and cardiovascular risk prediction demonstrate an average 31.25% performance improvement compared to traditional ML baselines and a 5.26% improvement on top of the vanilla hypergraph model without our co-clustering mechanism. In-depth model analysis, clustering results analysis, and clinical case studies further validate the improved utilities and insightful interpretations delivered by <b>TACCO</b>. Code is available at https://github.com/PericlesHat/TACCO.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. 
International Conference on Knowledge Discovery & Data Mining","volume":"2024 ","pages":"6324-6334"},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11868038/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143544972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Distributed Harmonization: Federated Clustered Batch Effect Adjustment and Generalization.
KDD : proceedings. International Conference on Knowledge Discovery & Data Mining Pub Date : 2024-01-01 Epub Date: 2024-08-24 DOI: 10.1145/3637528.3671590
Bao Hoang, Yijiang Pang, Siqi Liang, Liang Zhan, Paul M Thompson, Jiayu Zhou
{"title":"Distributed Harmonization: Federated Clustered Batch Effect Adjustment and Generalization.","authors":"Bao Hoang, Yijiang Pang, Siqi Liang, Liang Zhan, Paul M Thompson, Jiayu Zhou","doi":"10.1145/3637528.3671590","DOIUrl":"10.1145/3637528.3671590","url":null,"abstract":"<p><p>Independent and identically distributed (<i>i.i.d.</i>) data is essential to many data analysis and modeling techniques. In the medical domain, collecting data from multiple sites or institutions is a common strategy that guarantees sufficient clinical diversity, determined by the decentralized nature of medical data. However, data from various sites are easily biased by the local environment or facilities, thereby violating the <i>i.i.d.</i> rule. A common strategy is to harmonize the site bias while retaining important biological information. The COMBAT is among the most popular harmonization approaches and has recently been extended to handle distributed sites. However, when faced with situations involving newly joined sites in training or evaluating data from unknown/unseen sites, COMBAT lacks compatibility and requires retraining with data from all the sites. The retraining leads to significant computational and logistic overhead that is usually prohibitive. In this work, we develop a novel <i>Cluster ComBat</i> harmonization algorithm, which leverages cluster patterns of the data in different sites and greatly advances the usability of COMBAT harmonization. We use extensive simulation and real medical imaging data from ADNI to demonstrate the superiority of the proposed approach. Our codes are provided in https://github.com/illidanlab/distributed-cluster-harmonization.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. 
International Conference on Knowledge Discovery & Data Mining","volume":"2024 ","pages":"5105-5115"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11529347/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142570543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MolSearch: Search-based Multi-objective Molecular Generation and Property Optimization.
Mengying Sun, Huijun Wang, Jing Xing, Bin Chen, Han Meng, Jiayu Zhou
{"title":"MolSearch: Search-based Multi-objective Molecular Generation and Property Optimization.","authors":"Mengying Sun, Huijun Wang, Jing Xing, Bin Chen, Han Meng, Jiayu Zhou","doi":"10.1145/3534678.3542676","DOIUrl":"https://doi.org/10.1145/3534678.3542676","url":null,"abstract":"<p><p>Leveraging computational methods to generate small molecules with desired properties has been an active research area in the drug discovery field. Towards real-world applications, however, efficient generation of molecules that satisfy <b>multiple</b> property requirements simultaneously remains a key challenge. In this paper, we tackle this challenge using a search-based approach and propose a simple yet effective framework called MolSearch for multi-objective molecular generation (optimization). We show that given proper design and sufficient information, search-based methods can achieve performance comparable or even better than deep learning methods while being computationally efficient. Such efficiency enables massive exploration of chemical space given constrained computational resources. In particular, MolSearch starts with existing molecules and uses a two-stage search strategy to gradually modify them into new ones, based on transformation rules derived systematically and exhaustively from large compound libraries. We evaluate MolSearch in multiple benchmark generation settings and demonstrate its effectiveness and efficiency.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. 
International Conference on Knowledge Discovery & Data Mining","volume":"2022 ","pages":"4724-4732"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10097503/pdf/nihms-1888099.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9580527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Deconfounding Actor-Critic Network with Policy Adaptation for Dynamic Treatment Regimes.
KDD : proceedings. International Conference on Knowledge Discovery & Data Mining Pub Date : 2022-08-01 Epub Date: 2022-08-13 DOI: 10.1145/3534678.3539413
Changchang Yin, Ruoqi Liu, Jeffrey Caterino, Ping Zhang
{"title":"Deconfounding Actor-Critic Network with Policy Adaptation for Dynamic Treatment Regimes.","authors":"Changchang Yin, Ruoqi Liu, Jeffrey Caterino, Ping Zhang","doi":"10.1145/3534678.3539413","DOIUrl":"https://doi.org/10.1145/3534678.3539413","url":null,"abstract":"<p><p>Despite intense efforts in basic and clinical research, an individualized ventilation strategy for critically ill patients remains a major challenge. Recently, dynamic treatment regime (DTR) with reinforcement learning (RL) on electronic health records (EHR) has attracted interest from both the healthcare industry and machine learning research community. However, most learned DTR policies might be biased due to the existence of confounders. Although some treatment actions non-survivors received may be helpful, if confounders cause the mortality, the training of RL models guided by long-term outcomes (e.g., 90-day mortality) would punish those treatment actions causing the learned DTR policies to be suboptimal. In this study, we develop a new deconfounding actor-critic network (DAC) to learn optimal DTR policies for patients. To alleviate confounding issues, we incorporate a patient resampling module and a confounding balance module into our actor-critic framework. To avoid punishing the effective treatment actions non-survivors received, we design a short-term reward to capture patients' immediate health state changes. Combining short-term with long-term rewards could further improve the model performance. Moreover, we introduce a policy adaptation method to successfully transfer the learned model to new-source small-scale datasets. The experimental results on one semi-synthetic and two different real-world datasets show the proposed model outperforms the state-of-the-art models. The proposed model provides individualized treatment decisions for mechanical ventilation that could improve patient outcomes.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. 
International Conference on Knowledge Discovery & Data Mining","volume":" ","pages":"2316-2326"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9466407/pdf/nihms-1830314.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40354004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Predicting Age-Related Macular Degeneration Progression with Contrastive Attention and Time-Aware LSTM.
Changchang Yin, Sayoko E Moroi, Ping Zhang
{"title":"Predicting Age-Related Macular Degeneration Progression with Contrastive Attention and Time-Aware LSTM.","authors":"Changchang Yin, Sayoko E Moroi, Ping Zhang","doi":"10.1145/3534678.3539163","DOIUrl":"https://doi.org/10.1145/3534678.3539163","url":null,"abstract":"<p><p>Age-related macular degeneration (AMD) is the leading cause of irreversible blindness in developed countries. Identifying patients at high risk of progression to late AMD, the sight-threatening stage, is critical for clinical actions, including medical interventions and timely monitoring. Recently, deep-learning-based models have been developed and achieved superior performance for late AMD prediction. However, most existing methods are limited to the color fundus photography (CFP) from the last ophthalmic visit and do not include the longitudinal CFP history and AMD progression during the previous years' visits. Patients in different AMD subphenotypes might have various speeds of progression in different stages of AMD disease. Capturing the progression information during the previous years' visits might be useful for the prediction of AMD progression. In this work, we propose a <b>C</b>ontrastive-<b>A</b>ttention-based <b>T</b>ime-aware <b>L</b>ong <b>S</b>hort-<b>T</b>erm <b>M</b>emory network (<b>CAT-LSTM</b>) to predict AMD progression. First, we adopt a convolutional neural network (CNN) model with a contrastive attention module (CA) to extract abnormal features from CFPs. Then we utilize a time-aware LSTM (T-LSTM) to model the patients' history and consider the AMD progression information. The combination of disease progression, genotype information, demographics, and CFP features are sent to T-LSTM. Moreover, we leverage an auto-encoder to represent temporal CFP sequences as fixed-size vectors and adopt k-means to cluster them into subphenotypes. 
We evaluate the proposed model based on real-world datasets, and the results show that the proposed model could achieve 0.925 on area under the receiver operating characteristic (AUROC) for 5-year late-AMD prediction and outperforms the state-of-the-art methods by more than 3%, which demonstrates the effectiveness of the proposed CAT-LSTM. After analyzing patient representation learned by an auto-encoder, we identify 3 novel subphenotypes of AMD patients with different characteristics and progression rates to late AMD, paving the way for improved personalization of AMD management. The code of CAT-LSTM can be found at GitHub.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. International Conference on Knowledge Discovery & Data Mining","volume":"2022 ","pages":"4402-4412"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9505703/pdf/nihms-1830315.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9234125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MoCL: Data-driven Molecular Fingerprint via Knowledge-aware Contrastive Learning from Molecular Graph.
KDD : proceedings. International Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-01 Epub Date: 2021-08-14 DOI: 10.1145/3447548.3467186
Mengying Sun, Jing Xing, Huijun Wang, Bin Chen, Jiayu Zhou
{"title":"MoCL: Data-driven Molecular Fingerprint via Knowledge-aware Contrastive Learning from Molecular Graph.","authors":"Mengying Sun, Jing Xing, Huijun Wang, Bin Chen, Jiayu Zhou","doi":"10.1145/3447548.3467186","DOIUrl":"10.1145/3447548.3467186","url":null,"abstract":"<p><p>Recent years have seen a rapid growth of utilizing graph neural networks (GNNs) in the biomedical domain for tackling drug-related problems. However, like any other deep architectures, GNNs are data hungry. While requiring labels in real world is often expensive, pretraining GNNs in an unsupervised manner has been actively explored. Among them, graph contrastive learning, by maximizing the mutual information between paired graph augmentations, has been shown to be effective on various downstream tasks. However, the current graph contrastive learning framework has two limitations. First, the augmentations are designed for general graphs and thus may not be suitable or powerful enough for certain domains. Second, the contrastive scheme only learns representations that are invariant to local perturbations and thus does not consider the global structure of the dataset, which may also be useful for downstream tasks. In this paper, we study graph contrastive learning designed specifically for the biomedical domain, where molecular graphs are present. We propose a novel framework called MoCL, which utilizes domain knowledge at both local- and global-level to assist representation learning. The local-level domain knowledge guides the augmentation process such that variation is introduced without changing graph semantics. The global-level knowledge encodes the similarity information between graphs in the entire dataset and helps to learn representations with richer semantics. The entire model is learned through a double contrast objective. 
We evaluate MoCL on various molecular datasets under both linear and semi-supervised settings and results show that MoCL achieves state-of-the-art performance.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. International Conference on Knowledge Discovery & Data Mining","volume":"2021 ","pages":"3585-3594"},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9105980/pdf/nihms-1798075.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10249436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Federated Adversarial Debiasing for Fair and Transferable Representations.
KDD : proceedings. International Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-01 Epub Date: 2021-08-14 DOI: 10.1145/3447548.3467281
Junyuan Hong, Zhuangdi Zhu, Shuyang Yu, Zhangyang Wang, Hiroko Dodge, Jiayu Zhou
{"title":"Federated Adversarial Debiasing for Fair and Transferable Representations.","authors":"Junyuan Hong, Zhuangdi Zhu, Shuyang Yu, Zhangyang Wang, Hiroko Dodge, Jiayu Zhou","doi":"10.1145/3447548.3467281","DOIUrl":"10.1145/3447548.3467281","url":null,"abstract":"<p><p>Federated learning is a distributed learning framework that is communication efficient and provides protection over participating users' raw training data. One outstanding challenge of federate learning comes from the users' heterogeneity, and learning from such data may yield biased and unfair models for minority groups. While adversarial learning is commonly used in centralized learning for mitigating bias, there are significant barriers when extending it to the federated framework. In this work, we study these barriers and address them by proposing a novel approach Federated Adversarial DEbiasing (FADE). FADE does not require users' sensitive group information for debiasing and offers users the freedom to opt-out from the adversarial component when privacy or computational costs become a concern. We show that ideally, FADE can attain the same global optimality as the one by the centralized algorithm. We then analyze when its convergence may fail in practice and propose a simple yet effective method to address the problem. Finally, we demonstrate the effectiveness of the proposed framework through extensive empirical studies, including the problem settings of unsupervised domain adaptation and fair learning. Our codes and pre-trained models are available at: https://github.com/illidanlab/FADE.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. 
International Conference on Knowledge Discovery & Data Mining","volume":"2021 ","pages":"617-627"},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9105979/pdf/nihms-1798074.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10249439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LogPar: Logistic PARAFAC2 Factorization for Temporal Binary Data with Missing Values.
Kejing Yin, Ardavan Afshar, Joyce C Ho, William K Cheung, Chao Zhang, Jimeng Sun
{"title":"LogPar: Logistic PARAFAC2 Factorization for Temporal Binary Data with Missing Values.","authors":"Kejing Yin, Ardavan Afshar, Joyce C Ho, William K Cheung, Chao Zhang, Jimeng Sun","doi":"10.1145/3394486.3403213","DOIUrl":"https://doi.org/10.1145/3394486.3403213","url":null,"abstract":"<p><p>Binary data with one-class missing values are ubiquitous in real-world applications. They can be represented by irregular tensors with varying sizes in one dimension, where value one means presence of a feature while zero means unknown (i.e., either presence or absence of a feature). Learning accurate low-rank approximations from such binary irregular tensors is a challenging task. However, none of the existing models developed for factorizing irregular tensors take the missing values into account, and they assume Gaussian distributions, resulting in a distribution mismatch when applied to binary data. In this paper, we propose Logistic PARAFAC2 (LogPar) by modeling the binary irregular tensor with Bernoulli distribution parameterized by an underlying real-valued tensor. Then we approximate the underlying tensor with a positive-unlabeled learning loss function to account for the missing values. We also incorporate uniqueness and temporal smoothness regularization to enhance the interpretability. Extensive experiments using large-scale real-world datasets show that LogPar outperforms all baselines in both irregular tensor completion and downstream predictive tasks. For the irregular tensor completion, LogPar achieves up to 26% relative improvement compared to the best baseline. Besides, LogPar obtains relative improvement of 13.2% for heart failure prediction and 14% for mortality prediction on average compared to the state-of-the-art PARAFAC2 models.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. 
International Conference on Knowledge Discovery & Data Mining","volume":"2020 ","pages":"1625-1635"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3394486.3403213","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39079152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
MetaPred: Meta-Learning for Clinical Risk Prediction with Limited Patient Electronic Health Records.
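The LogPar abstract describes modeling binary entries with a Bernoulli distribution parameterized by an underlying real-valued tensor. As a rough illustration of that one ingredient, the sketch below computes the entrywise Bernoulli negative log-likelihood of a binary matrix given real-valued logits. It is an assumption-laden simplification: it treats every zero as a true negative, which is exactly what the abstract says the paper avoids by using a positive-unlabeled loss, and it omits the PARAFAC2 factorization and the regularizers. The function names are hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def bernoulli_nll(x, theta):
    """Negative log-likelihood of binary matrix x under Bernoulli(sigmoid(theta)).

    For one entry: -[x*log(s) + (1-x)*log(1-s)] with s = sigmoid(theta),
    which simplifies to softplus(theta) - x*theta.
    """
    total = 0.0
    for x_row, theta_row in zip(x, theta):
        for x_ij, theta_ij in zip(x_row, theta_row):
            total += softplus(theta_ij) - x_ij * theta_ij
    return total
```

With logits of zero every entry contributes log 2, and logits whose signs agree with the observed entries give a lower loss than logits whose signs disagree.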
Xi Sheryl Zhang, Fengyi Tang, Hiroko H Dodge, Jiayu Zhou, Fei Wang
{"title":"MetaPred: Meta-Learning for Clinical Risk Prediction with Limited Patient Electronic Health Records.","authors":"Xi Sheryl Zhang, Fengyi Tang, Hiroko H Dodge, Jiayu Zhou, Fei Wang","doi":"10.1145/3292500.3330779","DOIUrl":"https://doi.org/10.1145/3292500.3330779","url":null,"abstract":"<p><p>In recent years, large amounts of health data, such as patient Electronic Health Records (EHR), are becoming readily available. This provides an unprecedented opportunity for knowledge discovery and data mining algorithms to dig insights from them, which can, later on, be helpful to the improvement of the quality of care delivery. Predictive modeling of clinical risks, including in-hospital mortality, hospital readmission, chronic disease onset, condition exacerbation, etc., from patient EHR, is one of the health data analytic problems that attract lots of the interests. The reason is not only because the problem is important in clinical settings, but also is challenging when working with EHR such as sparsity, irregularity, temporality, etc. Different from applications in other domains such as computer vision and natural language processing, the data samples in medicine (patients) are relatively limited, which creates lots of troubles for building effective predictive models, especially for complicated ones such as deep learning. In this paper, we propose MetaPred, a meta-learning framework for clinical risk prediction from longitudinal patient EHR. In particular, in order to predict the target risk with limited data samples, we train a meta-learner from a set of related risk prediction tasks which learns how a good predictor is trained. The meta-learned can then be directly used in target risk prediction, and the limited available samples in the target domain can be used for further fine-tuning the model performance. The effectiveness of MetaPred is tested on a real patient EHR repository from Oregon Health & Science University. 
We are able to demonstrate that with Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) as base predictors, MetaPred can achieve much better performance for predicting target risk with low resources comparing with the predictor trained on the limited samples available for this risk alone.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. International Conference on Knowledge Discovery & Data Mining","volume":"2019 ","pages":"2487-2495"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3292500.3330779","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38879115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 87