ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. ACM Conference on Bioinformatics, Computational Biology and Biomedicine: latest articles

Causality-based Subject and Task Fingerprints using fMRI Time-series Data.

Pub Date: 2024-12-01; Epub Date: 2024-12-16; DOI: 10.1145/3698587.3701342

Dachuan Song, Li Shen, Duy Duong-Tran, Xuan Wang

Abstract: Recently, there has been a revived interest in systems neuroscience causation models due to their unique capability to unravel complex relationships in multi-scale brain networks. In this paper, our goal is to verify the feasibility and effectiveness of using a causality-based approach for fMRI fingerprinting. Specifically, we propose an innovative method that uses the causal dynamics of brain activity to identify the unique cognitive patterns of individuals (i.e., subject fingerprints) and fMRI tasks (i.e., task fingerprints). The key novelty of our approach stems from the development of a two-timescale linear state-space model to extract 'spatio-temporal' (aka causal) signatures from an individual's fMRI time series data. To the best of our knowledge, we pioneer and subsequently quantify, in this paper, the concept of the 'causal fingerprint.' Our method differs from other fingerprint studies in that we quantify fingerprints from a cause-and-effect perspective; these fingerprints are then combined with a modal decomposition and projection method to perform subject identification and with a graph neural network (GNN) model to perform task identification. Experimental results and comparisons with non-causality-based methods demonstrate the effectiveness of the proposed methods. We visualize the obtained causal signatures and discuss their biological relevance in light of the existing understanding of brain functionalities. Collectively, our work paves the way for further studies on causal fingerprints with potential applications in both healthy controls and neurodegenerative diseases.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11786950/pdf/
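
The abstract above builds on a linear state-space model. As a rough illustration only, the sketch below fits an ordinary single-timescale model x[t+1] = A x[t] to a simulated two-region signal by least squares. It is not the two-timescale model of the paper; the matrix `A_TRUE`, the initial state, and both helper functions are hypothetical choices made for this example. Asymmetric off-diagonal entries of the fitted matrix are what a directed ("cause-and-effect") reading would look at.

```python
def simulate(A, x0, steps):
    """Roll the noise-free linear model x[t+1] = A x[t] forward."""
    states = [list(x0)]
    for _ in range(steps):
        x = states[-1]
        states.append([sum(a * v for a, v in zip(row, x)) for row in A])
    return states


def fit_transition_2d(states):
    """Least-squares estimate of A for a 2-D state: A = S G^-1."""
    S = [[0.0, 0.0], [0.0, 0.0]]  # running sum of x[t+1] x[t]^T
    G = [[0.0, 0.0], [0.0, 0.0]]  # running sum of x[t] x[t]^T
    for cur, nxt in zip(states, states[1:]):
        for i in range(2):
            for j in range(2):
                S[i][j] += nxt[i] * cur[j]
                G[i][j] += cur[i] * cur[j]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    G_inv = [[G[1][1] / det, -G[0][1] / det],
             [-G[1][0] / det, G[0][0] / det]]
    return [[sum(S[i][k] * G_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical ground truth: region 0 drives region 1, but not the reverse.
A_TRUE = [[0.9, 0.0], [0.3, 0.8]]
A_HAT = fit_transition_2d(simulate(A_TRUE, [1.0, 0.0], steps=20))
```

In this toy setting `A_HAT[1][0]` recovers the coupling from region 0 to region 1, while `A_HAT[0][1]` stays near zero, so the fitted matrix is directional in a way a symmetric correlation matrix is not.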

Citations: 0

New Spatial Phenotypes from Imaging Uncover Survival Differences for Breast Cancer Patients.

Pub Date: 2024-11-01; Epub Date: 2024-12-16; DOI: 10.1145/3698587.3701333

Mahmudul Hasan, Ariadna Kim Silva, Shahira Abousamra, Shao-Jun Tang, Prateek Prasanna, Joel Saltz, Kevin Gardner, Chao Chen, Alisa Yurovsky

Abstract: Imaging technologies have revolutionized the study of the tumor microenvironment (TME) by leveraging spatial analysis, which enables the exploration of tissue organization and cellular communication, as well as aiding cancer diagnosis and prognosis. However, while many advanced spatial analysis methods have been recently published, they are tied to specific imaging technologies. An opportunity exists to develop a technology-agnostic methodology that captures complex spatial patterns in the TME as phenotypes to use in downstream tasks. In this paper, we present a novel variation of the spatial G-function and a comprehensive imaging-technology-agnostic framework that identifies rich spatial phenotypes that can be used in survival analysis and classification tasks. Applying our methodology to breast cancer, we uncover spatial phenotypes with significance to survival across racial groups and molecular subtypes of breast cancer. We find other phenotypes that are significant to the survival of specific patient categories (such as African American patients). We also demonstrate that our phenotypes reflect specific biological contexts. These results highlight the relevance of our proposed spatial analysis and phenotype discovery pipeline and demonstrate the benefits of the systematic exploration of spatial phenotypes for more personalized diagnosis and treatments.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12228512/pdf/
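
For readers unfamiliar with the spatial G-function mentioned in the abstract above, here is a minimal sketch of the classical empirical version: the fraction of points whose nearest neighbor lies within distance r. It applies no edge correction and is not the paper's variation, which the abstract does not specify; the function names and the `CELLS` coordinates are hypothetical.

```python
import math


def nearest_neighbor_distances(points):
    """Distance from each point to its closest other point."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def g_function(points, r):
    """Empirical G(r): share of points whose nearest neighbor lies within r."""
    distances = nearest_neighbor_distances(points)
    return sum(d <= r for d in distances) / len(distances)


# Hypothetical cell centroids: two close neighbors and one isolated cell.
CELLS = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
```

Evaluating `g_function(CELLS, r)` over a grid of radii yields a curve that rises quickly for clustered cells and slowly for dispersed ones, which is the kind of summary a spatial phenotype could be built from.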

Citations: 0

CAPTURE: A Clustered Adaptive Patchwork Technique for Unified Registration Enhancement in Biological Imaging.

Pub Date: 2024-01-01; Epub Date: 2024-12-16; DOI: 10.1145/3698587.3701369

Sahand Hamzehei, Gianna Raimondi, Mostafa Karami, Linnaea Ostroff, Sheida Nabavi

Abstract: Image registration is important in biological image analysis; however, it is often challenged by distortions and non-linear transformations. In this paper, we present a novel patch-wise image registration method to address these issues. Our method begins with global registration to correct linear transformations, followed by a detailed examination of geometrical distortions. After that, each image is adaptively divided into patches to isolate and correct non-linear distortions, and the patches are then reconstructed and combined using Otsu thresholding. We evaluated our method against state-of-the-art techniques using mutual information (MI), phase congruency-based (PCB), and gradient-based metrics (GBM) across four real biology datasets. Our results demonstrate superior feature alignment and image coherence, especially in serial-stack registrations. While the proposed method has longer processing times compared to linear registration methods, its enhanced accuracy and reliability in handling non-uniform distortion make it beneficial for precision-demanding applications. We have created a public GitHub repository containing the code used in our research, available at https://github.com/NabaviLab/CAPTURE.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12123223/pdf/
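
The abstract above relies on Otsu thresholding when combining patches. The sketch below shows only the thresholding step itself, on a flat list of integer gray levels: it picks the level that maximizes the between-class variance. How CAPTURE applies the threshold is not described in the abstract, so nothing here should be read as the authors' implementation (see their repository for that); the function name and the sample pixels are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that best splits pixels into <= t and > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_bg = 0      # number of pixels at or below the candidate level
    sum_bg = 0         # sum of their gray levels
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue   # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break      # no foreground pixels left
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of total**2.
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Hypothetical bimodal patch: dark background and bright structure.
PIXELS = [10] * 50 + [200] * 50
THRESHOLD = otsu_threshold(PIXELS)
FOREGROUND = [p for p in PIXELS if p > THRESHOLD]
```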

Citations: 0

Multi-Group Tensor Canonical Correlation Analysis.

Pub Date: 2023-09-01; Epub Date: 2023-10-04; DOI: 10.1145/3584371.3612962

Zhuoping Zhou, Boning Tong, Davoud Ataee Tarzanagh, Bojian Hou, Andrew J Saykin, Qi Long, Li Shen

Abstract: Tensor Canonical Correlation Analysis (TCCA) is a commonly used statistical method for examining linear associations between two tensor datasets. However, existing TCCA models fail to adequately address the heterogeneity present in real-world tensor data, such as brain imaging data collected from diverse groups characterized by factors like sex and race. Consequently, these models may yield biased outcomes. To overcome this limitation, we propose a novel approach called Multi-Group TCCA (MG-TCCA), which enables the joint analysis of multiple subgroups. By incorporating a dual sparsity structure and a block coordinate ascent algorithm, our MG-TCCA method effectively addresses heterogeneity and leverages information across different groups to identify consistent signals. This novel approach facilitates the quantification of shared and individual structures, reduces data dimensionality, and enables visual exploration. To empirically validate our approach, we conduct a study focused on investigating correlations between two brain positron emission tomography (PET) modalities (AV-45 and FDG) within an Alzheimer's disease (AD) cohort. Our results demonstrate that MG-TCCA surpasses traditional TCCA in identifying sex-specific cross-modality imaging correlations. This heightened performance of MG-TCCA provides valuable insights for the characterization of multimodal imaging biomarkers in AD.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10593155/pdf/

Citations: 0

Supervised Pretraining through Contrastive Categorical Positive Samplings to Improve COVID-19 Mortality Prediction.

Pub Date: 2022-08-01; Epub Date: 2022-08-07; DOI: 10.1145/3535508.3545541

Tingyi Wanyan, Mingquan Lin, Eyal Klang, Kartikeya M Menon, Faris F Gulamali, Ariful Azad, Yiye Zhang, Ying Ding, Zhangyang Wang, Fei Wang, Benjamin Glicksberg, Yifan Peng

Abstract: Clinical EHR data are naturally heterogeneous and contain abundant sub-phenotypes. Such diversity creates challenges for outcome prediction using a machine learning model since it leads to high intra-class variance. To address this issue, we propose a supervised pre-training model with a unique embedded k-nearest-neighbor positive sampling strategy. We demonstrate the enhanced performance of this framework theoretically and show that it yields highly competitive experimental results in predicting patient mortality in real-world COVID-19 EHR data with a total of over 7,000 patients admitted to a large, urban health system. Our method achieves an AUROC of 0.872, outperforming the alternative pre-training models and traditional machine learning methods. Additionally, our method performs much better when the training data size is small (345 training instances).

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9365529/pdf/nihms-1827823.pdf
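
The headline metric in the abstract above is AUROC. As a reminder of what that number measures, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are made-up toy values, not data from the study, and the quadratic loop is for clarity rather than speed.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = died, 0 = survived; scores are predicted risks.
LABELS = [0, 0, 1, 1]
SCORES = [0.1, 0.4, 0.35, 0.8]
```

Here three of the four positive/negative pairs are ordered correctly, so `auroc(LABELS, SCORES)` is 0.75.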

Citations: 3

Segmenting Thoracic Cavities with Neoplastic Lesions: A Head-to-head Benchmark with Fully Convolutional Neural Networks.

Pub Date: 2021-08-01; DOI: 10.1145/3459930.3469564

Zhao Li, Rongbin Li, Kendall J Kiser, Luca Giancardo, W Jim Zheng

Abstract: Automatic segmentation of thoracic cavity structures in computed tomography (CT) is a key step for applications ranging from radiotherapy planning to imaging biomarker discovery with radiomics approaches. State-of-the-art segmentation can be provided by fully convolutional neural networks such as the U-Net or V-Net. However, there is a very limited body of work on a comparative analysis of the performance of these architectures for chest CTs with significant neoplastic disease. In this work, we compared four different types of fully convolutional architectures using the same pre-processing and post-processing pipelines. These methods were evaluated using a dataset of CT images and thoracic cavity segmentations from 402 cancer patients. We found that these methods achieved very high segmentation performance on three evaluation criteria: the Dice coefficient, the average symmetric surface distance (ASSD), and the 95% Hausdorff distance (HD95). Overall, the two-stage 3D U-Net model performed slightly better than other models, with Dice coefficients for left and right lung reaching 0.947 and 0.952, respectively. However, the 3D U-Net model achieved the best performance under the evaluation of HD95 for right lung and ASSD for both left and right lung. These results demonstrate that current state-of-the-art deep learning models can work very well for segmenting not only healthy lungs but also lungs containing cancerous lesions at different stages. The comprehensive types of lung masks from these evaluated methods enabled the creation of imaging-based biomarkers representing both healthy lung parenchyma and neoplastic lesions, allowing us to utilize these segmented areas for downstream analysis, e.g. treatment planning, prognosis and survival prediction.

PDF: https://sci-hub-pdf.com/10.1145/3459930.3469564
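
The Dice coefficient reported in the abstract above measures the overlap between a predicted mask and a reference mask: twice the size of their intersection divided by the sum of their sizes. A minimal sketch on flattened binary masks follows; the function name, the convention of returning 1.0 for two empty masks, and the sample masks are assumptions of this example, not details from the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A∩B| / (|A| + |B|) for two flattened binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * intersection / (size_a + size_b)


# Hypothetical 4-voxel masks: prediction and reference share one voxel.
PREDICTED = [1, 1, 0, 0]
REFERENCE = [1, 0, 1, 0]
```

For these masks the overlap is one voxel out of two per mask, giving a Dice coefficient of 0.5; a value such as the reported 0.947 means the two masks coincide almost everywhere.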

Citations: 0

Assigning ICD-O-3 Codes to Pathology Reports using Neural Multi-Task Training with Hierarchical Regularization.

Pub Date: 2021-08-01; DOI: 10.1145/3459930.3469541

Anthony Rios, Eric B Durbin, Isaac Hands, Ramakanth Kavuluru

Abstract: Tracking population-level cancer information is essential for researchers, clinicians, policymakers, and the public. Unfortunately, much of the information is stored as unstructured data in pathology reports. Thus, to process the information, we require either automated extraction techniques or manual curation. Moreover, many of the cancer-related concepts appear infrequently in real-world training datasets. Automated extraction is difficult because of the limited data. This study introduces a novel technique that incorporates structured expert knowledge to improve histology and topography code classification models. Using pathology reports collected from the Kentucky Cancer Registry, we introduce a novel multi-task training approach with hierarchical regularization that incorporates structured information about the International Classification of Diseases for Oncology, 3rd Edition classes to improve predictive performance. Overall, we find that our method improves both micro and macro F1. For macro F1, we achieve up to a 6% absolute improvement for topography codes and up to 4% absolute improvement for histology codes.

PDF: https://sci-hub-pdf.com/10.1145/3459930.3469541
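
The abstract above reports both micro and macro F1, and the difference matters when some codes are rare. The sketch below computes both for a single-label task: micro F1 pools the counts across classes, while macro F1 averages the per-class scores so that an infrequent code weighs as much as a frequent one. The helper names are hypothetical and the six labels are invented for illustration; they are not data from the Kentucky Cancer Registry.

```python
def per_class_counts(y_true, y_pred):
    """True-positive, false-positive and false-negative counts per class."""
    counts = {}
    for truth, pred in zip(y_true, y_pred):
        for label in (truth, pred):
            counts.setdefault(label, {"tp": 0, "fp": 0, "fn": 0})
        if truth == pred:
            counts[truth]["tp"] += 1
        else:
            counts[pred]["fp"] += 1
            counts[truth]["fn"] += 1
    return counts


def f1(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(y_true, y_pred):
    counts = per_class_counts(y_true, y_pred).values()
    return f1(sum(c["tp"] for c in counts),
              sum(c["fp"] for c in counts),
              sum(c["fn"] for c in counts))


def macro_f1(y_true, y_pred):
    counts = per_class_counts(y_true, y_pred).values()
    return sum(f1(c["tp"], c["fp"], c["fn"]) for c in counts) / len(counts)


# Invented example: one frequent topography code and two rare ones.
Y_TRUE = ["C50", "C50", "C50", "C50", "C34", "C18"]
Y_PRED = ["C50", "C50", "C50", "C50", "C18", "C18"]
```

With these labels, micro F1 is 5/6 because five of six reports are coded correctly, whereas macro F1 drops to 5/9 because the missed rare code contributes a per-class score of zero.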

Citations: 5

Fast and memory-efficient scRNA-seq k-means clustering with various distances.

Pub Date: 2021-08-01; DOI: 10.1145/3459930.3469523

Daniel N Baker, Nathan Dyjack, Vladimir Braverman, Stephanie C Hicks, Ben Langmead

Abstract: Single-cell RNA-sequencing (scRNA-seq) analyses typically begin by clustering a gene-by-cell expression matrix to empirically define groups of cells with similar expression profiles. We describe new methods and a new open source library, minicore, for efficient k-means++ center finding and k-means clustering of scRNA-seq data. Minicore works with sparse count data, as it emerges from typical scRNA-seq experiments, as well as with dense data produced by dimensionality reduction. Minicore's novel vectorized weighted reservoir sampling algorithm allows it to find initial k-means++ centers for a 4-million cell dataset in 1.5 minutes using 20 threads. Minicore can cluster using Euclidean distance, but also supports a wider class of measures like Jensen-Shannon Divergence, Kullback-Leibler Divergence, and the Bhattacharyya distance, which can be directly applied to count data and probability distributions. Further, minicore produces lower-cost centerings more efficiently than scikit-learn for scRNA-seq datasets with millions of cells. With careful handling of priors, minicore implements these distance measures with only minor (<2-fold) speed differences among all distances. We show that a minicore pipeline consisting of k-means++, localsearch++ and mini-batch k-means can cluster a 4-million cell dataset in minutes, using less than 10GiB of RAM. This memory-efficiency enables atlas-scale clustering on laptops and other commodity hardware. Finally, we report findings on which distance measures give clusterings that are most consistent with known cell type labels. Availability: The open source library is at https://github.com/dnbaker/minicore. Code used for experiments is at https://github.com/dnbaker/minicore-experiments.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8586878/pdf/
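
To make the divergence measures named in the abstract above concrete, here is a small sketch of Kullback-Leibler and Jensen-Shannon divergence on count vectors, including the role a prior (pseudocount) plays. This is plain Python written for illustration and does not use or reproduce the minicore API; the function names, the pseudocount value, and the toy counts are all assumptions.

```python
import math


def normalize(counts, prior=0.0):
    """Turn raw counts into probabilities, adding `prior` to every entry."""
    total = sum(counts) + prior * len(counts)
    return [(c + prior) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in nats; fails with ZeroDivisionError if q is 0 where p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, finite, at most ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy expression counts for two cells over three genes (sparse, with zeros).
CELL_A = [3, 0, 1]
CELL_B = [0, 2, 2]

# Without a prior, KL is undefined here because CELL_B has a zero where
# CELL_A does not; a small pseudocount makes it finite.
KL_SMOOTHED = kl_divergence(normalize(CELL_A, prior=1.0),
                            normalize(CELL_B, prior=1.0))
JSD_RAW = js_divergence(normalize(CELL_A), normalize(CELL_B))
```

Sparse count data is full of zeros, which is why the choice of prior matters for KL-type measures, while Jensen-Shannon divergence stays well defined even on the raw normalized counts.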

Citations: 4

Concurrent Imputation and Prediction on EHR data using Bi-Directional GANs: Bi-GANs for EHR imputation and prediction.

Pub Date: 2021-08-01; DOI: 10.1145/3459930.3469512

Mehak Gupta, H Timothy Bunnell, Thao-Ly T Phan, Rahmatollah Beheshti

Abstract: Working with electronic health records (EHRs) is known to be challenging for several reasons. These include records not having: 1) similar lengths (per visit), 2) the same number of observations (per patient), and 3) complete entries. These issues hinder the performance of the predictive models created using EHRs. In this paper, we approach these issues by presenting a model for the combined task of imputing and predicting values for irregularly observed, varying-length EHR data with missing entries. Our proposed model (dubbed Bi-GAN) uses a bidirectional recurrent network in a generative adversarial setting. In this architecture, the generator is a bidirectional recurrent network that receives the EHR data and imputes the existing missing values. The discriminator attempts to discriminate between the actual and the imputed values generated by the generator. Using the input data in its entirety, Bi-GAN learns how to impute missing elements in between (imputation) or outside of the input time steps (prediction). Our method has three advantages over the state-of-the-art methods in the field: (a) one single model performs both the imputation and prediction tasks; (b) the model can perform predictions using time series of varying length with missing data; (c) it does not need to know the observation and prediction time windows during training and can be used for predictions with different observation and prediction window lengths, for short- and long-term predictions. We evaluate our model on two large EHR datasets to impute and predict body mass index (BMI) values and show its superior performance in both settings.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8482531/pdf/nihms-1740754.pdf

Citations: 0

Joint Learning for Biomedical NER and Entity Normalization: Encoding Schemes, Counterfactual Examples, and Zero-Shot Evaluation.

Pub Date: 2021-08-01; DOI: 10.1145/3459930.3469533

Jiho Noh, Ramakanth Kavuluru

Abstract: Named entity recognition (NER) and normalization (EN) form an indispensable first step to many biomedical natural language processing applications. In biomedical information science, recognizing entities (e.g., genes, diseases, or drugs) and normalizing them to concepts in standard terminologies or thesauri (e.g., Entrez, ICD-10, or RxNorm) is crucial for identifying more informative relations among them that drive disease etiology, progression, and treatment. In this effort we pursue two high-level strategies to improve biomedical NER and EN. The first is to decouple standard entity encoding tags (e.g., "B-Drug" for the beginning of a drug) into type tags (e.g., "Drug") and positional tags (e.g., "B"). A second strategy is to use additional counterfactual training examples to handle the issue of models learning spurious correlations between surrounding context and normalized concepts in training data. We conduct elaborate experiments using the MedMentions dataset, the largest dataset of its kind for NER and EN in biomedicine. We find that our first strategy performs better in entity normalization when compared with the standard coding scheme. The second data augmentation strategy uniformly improves performance in span detection, typing, and normalization. The gains from counterfactual examples are more prominent when evaluating in zero-shot settings, for concepts that have never been encountered during training.

PDF: https://sci-hub-pdf.com/10.1145/3459930.3469533
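
The first strategy in the abstract above splits a combined tag such as "B-Drug" into a positional tag and a type tag. A minimal sketch of that split, and of the inverse mapping, is shown below. The handling of the outside tag "O", the helper names, and the example tag sequence are assumptions of this sketch; the abstract does not say how the paper encodes these cases.

```python
def decouple(tag):
    """Split a combined tag like 'B-Drug' into ('B', 'Drug')."""
    if tag == "O":
        return ("O", None)  # assumption: outside tokens carry no type
    position, entity_type = tag.split("-", 1)
    return (position, entity_type)


def recouple(position, entity_type):
    """Rebuild the combined tag from its positional and type parts."""
    return position if entity_type is None else f"{position}-{entity_type}"


# Hypothetical tag sequence for a four-token sentence.
TAGS = ["B-Drug", "I-Drug", "O", "B-Disease"]
POSITIONS, TYPES = zip(*(decouple(tag) for tag in TAGS))
```

After the split, a model can predict the two sequences with separate output layers: `POSITIONS` has only a handful of labels, and `TYPES` no longer multiplies the number of entity types by the number of positions.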

Citations: 5