ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. ACM Conference on Bioinformatics, Computational Biology and Biomedicine: latest articles

Transformer-Based Named Entity Recognition for Parsing Clinical Trial Eligibility Criteria.
Shubo Tian, Arslan Erdengasileng, Xi Yang, Yi Guo, Yonghui Wu, Jinfeng Zhang, Jiang Bian, Zhe He
Abstract: The rapid adoption of electronic health record (EHR) systems has made clinical data available in electronic format for research and for many downstream applications. Electronic screening of potentially eligible patients using these clinical databases is critically needed to improve clinical trial recruitment efficiency. Nevertheless, manually translating free-text eligibility criteria into database queries is labor intensive and inefficient. To facilitate automated screening, free-text eligibility criteria must be structured and coded into a computable format using controlled vocabularies. Named entity recognition (NER) is thus an important first step. In this study, we evaluate four state-of-the-art transformer-based NER models on two publicly available annotated corpora of eligibility criteria released by Columbia University (i.e., the Chia data) and Facebook Research (i.e., the FRD data). The four models (BERT, ALBERT, RoBERTa, and ELECTRA) pretrained on general-domain English corpora were compared with versions pretrained on PubMed citations, clinical notes from the MIMIC-III dataset, and eligibility criteria extracted from all the clinical trials on ClinicalTrials.gov. Experimental results show that RoBERTa pretrained with MIMIC-III clinical notes and eligibility criteria yielded the highest strict and relaxed F-scores in both the Chia data (0.658/0.798) and the FRD data (0.785/0.916). With promising NER results, further investigation into building a reliable natural language processing (NLP)-assisted pipeline for automated electronic screening is needed.
Volume 2021. Published 2021-08-01. DOI: https://doi.org/10.1145/3459930.3469560
Citations: 11
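The abstract above reports strict and relaxed F-scores without defining them. The sketch below shows one common convention, assumed here rather than taken from the paper: a strict match requires identical span boundaries and entity type, while a relaxed match requires only overlapping spans of the same type. The `Entity` tuple layout, the function names, and the example labels are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical entity representation: (start, end, label), end exclusive.
Entity = Tuple[int, int, str]


def _overlaps(a: Entity, b: Entity) -> bool:
    """True when the two character spans share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def _matches(a: Entity, b: Entity, relaxed: bool) -> bool:
    """Same label, plus exact boundaries (strict) or any overlap (relaxed)."""
    if a[2] != b[2]:
        return False
    if relaxed:
        return _overlaps(a, b)
    return (a[0], a[1]) == (b[0], b[1])


def f_score(gold: List[Entity], pred: List[Entity], relaxed: bool = False) -> float:
    """Entity-level F1 under the assumed strict or relaxed matching rule."""
    matched_pred = sum(any(_matches(p, g, relaxed) for g in gold) for p in pred)
    matched_gold = sum(any(_matches(g, p, relaxed) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative annotations only, not drawn from the Chia or FRD corpora.
gold = [(0, 8, "Condition"), (12, 20, "Drug"), (25, 30, "Measurement")]
pred = [(0, 8, "Condition"), (12, 18, "Drug"), (40, 45, "Drug")]

strict_f = f_score(gold, pred)
relaxed_f = f_score(gold, pred, relaxed=True)
```

In this toy case the second prediction has the right type but a truncated boundary, so it counts only under the relaxed rule. That is why relaxed scores are never lower than strict ones.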
KGDAL: Knowledge Graph Guided Double Attention LSTM for Rolling Mortality Prediction for AKI-D Patients.
Lucas Jing Liu, Victor Ortiz-Soriano, Javier A Neyra, Jin Chen
Abstract: With the rapid accumulation of electronic health record (EHR) data, deep learning (DL) models have exhibited promising performance on patient risk prediction. Recent advances have also demonstrated the effectiveness of knowledge graphs (KG) in providing valuable prior knowledge for further improving DL model performance. However, it is still unclear how KG can be utilized to encode high-order relations among clinical concepts and how DL models can make full use of the encoded concept relations to solve real-world healthcare problems and to interpret the outcomes. We propose a novel knowledge graph guided double attention LSTM model named KGDAL for rolling mortality prediction for critically ill patients with acute kidney injury requiring dialysis (AKI-D). KGDAL constructs a KG-based two-dimensional attention in both time and feature spaces. In experiments with two large healthcare datasets, we compared KGDAL with a variety of rolling mortality prediction models and conducted an ablation study to test the effectiveness, efficacy, and contribution of different attention mechanisms. The results showed that KGDAL clearly outperformed all the compared models. Also, KGDAL-derived patient risk trajectories may help healthcare providers make timely decisions and take timely action. The source code, sample data, and manual of KGDAL are available at https://github.com/lucasliu0928/KGDAL.
Volume 2021. Published 2021-08-01. DOI: https://doi.org/10.1145/3459930.3469513
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8445228/pdf/nihms-1737960.pdf
Citations: 0
Unsupervised manifold alignment for single-cell multi-omics data.
Ritambhara Singh, Pinar Demetci, Giancarlo Bonora, Vijay Ramani, Choli Lee, He Fang, Zhijun Duan, Xinxian Deng, Jay Shendure, Christine Disteche, William Stafford Noble
Abstract: Integrating single-cell measurements that capture different properties of the genome is vital to extending our understanding of genome biology. This task is challenging due to the lack of a shared axis across datasets obtained from different types of single-cell experiments. For most such datasets, we lack corresponding information among the cells (samples) and the measurements (features). In this scenario, unsupervised algorithms that are capable of aligning single-cell experiments are critical to learning an in silico co-assay that can help draw correspondences among the cells. Maximum mean discrepancy-based manifold alignment (MMD-MA) is such an unsupervised algorithm. Without requiring correspondence information, it can align single-cell datasets from different modalities in a common shared latent space, showing promising results on simulations and a small-scale single-cell experiment with 61 cells. However, it is essential to explore the applicability of this method to larger single-cell experiments with thousands of cells so that it can be of practical interest to the community. In this paper, we apply MMD-MA to two recent datasets that measure transcriptome and chromatin accessibility in ~2000 single cells. To scale the runtime of MMD-MA to a more substantial number of cells, we extend the original implementation to run on GPUs. We also introduce a method to automatically select one of the user-defined parameters, thus reducing the hyperparameter search space. We demonstrate that the proposed extensions allow MMD-MA to accurately align state-of-the-art single-cell experiments.
Volume 2020, pages 1-10. Published 2020-09-01. DOI: https://doi.org/10.1145/3388440.3412410
Citations: 38
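MMD-MA takes its name from the maximum mean discrepancy (MMD), a kernel-based measure of how far apart two samples are in distribution. The sketch below computes only that quantity, using the biased estimator with a Gaussian kernel. It is not the alignment algorithm or the authors' GPU implementation, and the function names, the bandwidth `sigma`, and the toy points are assumptions made for illustration.

```python
import math
from typing import Sequence

Point = Sequence[float]


def rbf_kernel(x: Point, y: Point, sigma: float = 1.0) -> float:
    """Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(xs: Sequence[Point], ys: Sequence[Point], sigma: float = 1.0) -> float:
    """Biased estimate of squared MMD between two samples of points."""

    def mean_kernel(a: Sequence[Point], b: Sequence[Point]) -> float:
        total = sum(rbf_kernel(p, q, sigma) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy latent coordinates for cells from two hypothetical modalities.
modality_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
well_aligned = [(x + 0.1, y + 0.1) for x, y in modality_a]
poorly_aligned = [(x + 3.0, y + 3.0) for x, y in modality_a]

mmd_near = mmd_squared(modality_a, well_aligned)
mmd_far = mmd_squared(modality_a, poorly_aligned)
```

A smaller value means the two point clouds look more alike under the kernel, so `mmd_near` comes out below `mmd_far`. The double loop is quadratic in the number of cells, which gives a rough sense of why scaling to thousands of cells motivates a GPU implementation.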
A deep learning fusion model for brain disorder classification: Application to distinguishing schizophrenia and autism spectrum disorder.
Yuhui Du, Bang Li, Yuliang Hou, Vince D Calhoun
Abstract: Deep learning has shown great promise in classifying brain disorders due to its powerful ability to learn optimal features by nonlinear transformation. However, given the high dimensionality of neuroimaging data, jointly exploiting complementary information from multimodal neuroimaging data in deep learning is difficult. In this paper, we propose a novel multilevel convolutional neural network (CNN) fusion method that can effectively combine different types of neuroimage-derived features. Importantly, we incorporate a sequential feature selection into the CNN model to increase the feature interpretability. To evaluate our method, we classified two symptom-related brain disorders using large-sample multi-site data from 335 schizophrenia (SZ) patients and 380 autism spectrum disorder (ASD) patients within a cross-validation procedure. Brain functional networks, functional network connectivity, and brain structural morphology were employed to provide possible features. As expected, our fusion method outperformed the CNN model using only a single type of feature, as our method yielded higher classification accuracy (with mean accuracy >85%) and was more reliable across multiple runs in differentiating the two groups. We found that the default mode, cognitive control, and subcortical regions contributed more to their distinction. Taken together, our method provides an effective means to fuse multimodal features for the diagnosis of different psychiatric and neurological disorders.
Published 2020-09-01. DOI: https://doi.org/10.1145/3388440.3412478
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7758676/pdf/nihms-1654686.pdf
Citations: 0
Combine Cryo-EM Density Map and Residue Contact for Protein Structure Prediction - A Case Study.
Maytha Alshammari, Jing He
Abstract: Cryo-electron microscopy is a major structure determination technique for large molecular machines and membrane-associated complexes. Although atomic structures have been determined directly from cryo-EM density maps with high resolutions, current structure determination methods for medium resolution (5 to 10 Å) cryo-EM maps are limited by the availability of structure templates. Secondary structure traces are lines detected from a cryo-EM density map for α-helices and β-strands of a protein. When combined with secondary structure sequence segments predicted from a protein sequence, it is possible to generate a set of likely topologies of α-traces and β-sheet traces. A topology describes the overall folding relationship among secondary structures; it is a critical piece of information for deriving the corresponding atomic structure. We propose a method for protein structure prediction that combines three sources of information: the secondary structure traces detected from the cryo-EM density map, predicted secondary structure sequence segments, and amino acid contact pairs predicted using MULTICOM. A case study shows that using amino acid contact prediction from MULTICOM improves the ranking of the true topology. Our observations suggest that using a small set of highly voted secondary structure contact pairs enhances the ranking in all experiments conducted for this case.
Published 2020-09-01. DOI: https://doi.org/10.1145/3388440.3414708
Citations: 2
Using Curriculum Learning in Pattern Recognition of 3-dimensional Cryo-electron Microscopy Density Maps.
Yangmei Deng, Yongcheng Mu, Salim Sazzed, Jiangwen Sun, Jing He
Abstract: Although cryo-electron microscopy (cryo-EM) has been successfully used to derive atomic structures for many proteins, it is still challenging to derive atomic structure when the resolution of cryo-EM density maps is in the medium range, e.g., 5-10 Å. Studies have attempted to utilize machine learning methods, especially deep neural networks, to build predictive models for the detection of protein secondary structures from cryo-EM images, which ultimately helps to derive the atomic structure of proteins. However, the large variation in data quality makes it challenging to train a deep neural network with high prediction accuracy. Curriculum learning has been shown to be an effective learning paradigm in machine learning. In this paper, we present a study using curriculum learning as a more effective way to utilize cryo-EM density maps with varying quality. We investigated three distinct training curricula that differ in whether and how images used for training in the past are reused while the network is continually trained using new images. A total of 1,382 3-dimensional cryo-EM images were extracted from density maps in the Electron Microscopy Data Bank in our study. Our results indicate that learning with a curriculum significantly improves the performance of the final trained network when the forgetting problem is properly addressed.
Published 2020-09-01. DOI: https://doi.org/10.1145/3388440.3414710
Citations: 2
Correlation Imputation in Single cell RNA-seq using Auxiliary Information and Ensemble Learning.
Luqin Gan, Giuseppe Vinci, Genevera I Allen
Abstract: Single cell RNA sequencing is a powerful technique that measures the gene expression of individual cells in a high throughput fashion. However, because of sequencing inefficiency, the data are unreliable owing to dropout events, or technical artifacts where genes erroneously appear to have zero expression. Many data imputation methods have been proposed to alleviate this issue. Yet, effective imputation can be difficult and biased because the data are sparse and high-dimensional, resulting in major distortions in downstream analyses. In this paper, we propose a completely novel approach that imputes the gene-by-gene correlations rather than the data itself. We call this method SCENA: Single cell RNA-seq Correlation completion by ENsemble learning and Auxiliary information. The SCENA gene-by-gene correlation matrix estimate is obtained by model stacking of multiple imputed correlation matrices based on known auxiliary information about gene connections. In an extensive simulation study based on real scRNA-seq data, we demonstrate that SCENA not only accurately imputes gene correlations but also outperforms existing imputation approaches in downstream analyses such as dimension reduction, cell clustering, and graphical model estimation.
Published 2020-09-01. DOI: https://doi.org/10.1145/3388440.3412462
Citations: 3
SAU-Net: A Universal Deep Network for Cell Counting.
Yue Guo, Guorong Wu, Jason Stein, Ashok Krishnamurthy
Abstract: Image-based cell counting is a fundamental yet challenging task with wide applications in biological research. In this paper, we propose a novel deep network designed to universally solve this problem for various cell types. Specifically, we first extend the segmentation network U-Net with a self-attention module, named SAU-Net, for cell counting. Second, we design an online version of Batch Normalization to mitigate the generalization gap caused by data augmentation in small datasets. We evaluate the proposed method on four public cell counting benchmarks: the synthetic fluorescence microscopy (VGG) dataset, the Modified Bone Marrow (MBM) dataset, the human subcutaneous adipose tissue (ADI) dataset, and the Dublin Cell Counting (DCC) dataset. Our method surpasses the current state-of-the-art performance on the three real datasets (MBM, ADI and DCC) and achieves competitive results on the synthetic dataset (VGG). The source code is available at https://github.com/mzlr/sau-net.
Volume 2019, pages 299-306. Published 2019-09-01. DOI: https://doi.org/10.1145/3307339.3342153
Citations: 29
Integration of Heterogeneous Experimental Data Improves Global Map of Human Protein Complexes.
Jose Lugo-Martinez, Ziv Bar-Joseph, Jörn Dengjel, Robert F Murphy
Abstract: Protein complexes play a significant role in the core functionality of cells. These complexes are typically identified by detecting densely connected subgraphs in protein-protein interaction (PPI) networks. Recently, multiple large-scale mass spectrometry-based experiments have significantly increased the availability of PPI data in order to further expand the set of known complexes. However, high-throughput experimental data generally are incomplete, show limited agreement between experiments, and show frequent false positive interactions. There is a need for computational approaches that can address these limitations in order to improve the coverage and accuracy of human protein complexes. Here, we present a new method that integrates data from multiple heterogeneous experiments and sources in order to increase the reliability and coverage of predicted protein complexes. We first fused the heterogeneous data into a feature matrix and trained classifiers to score pairwise protein interactions. We next used graph-based methods to combine pairwise interactions into predicted protein complexes. Our approach improves the accuracy and coverage of protein pairwise interactions, accurately identifies known complexes, and suggests both novel additions to known complexes and entirely new complexes. Our results suggest that integration of heterogeneous experimental data helps improve the reliability and coverage of diverse high-throughput mass-spectrometry experiments, leading to an improved global map of human protein complexes.
Volume 2019, pages 144-153. Published 2019-09-01. DOI: https://doi.org/10.1145/3307339.3342150
Citations: 1
Copy Number Variation Detection Using Total Variation.
Fatima Zare, Sheida Nabavi
Abstract: Next-generation sequencing (NGS) technologies offer new opportunities for precise and accurate identification of genomic aberrations, including copy number variations (CNVs). For high-throughput NGS data, using depth of coverage has become a major approach to identify CNVs, especially for whole exome sequencing (WES) data. Due to the high level of noise and biases of read-count data and the complexity of the WES data, existing CNV detection tools identify many false CNV segments. In addition, NGS generates a huge amount of data, which requires effective and efficient methods. In this work, we propose a novel segmentation algorithm based on the total variation approach to detect CNVs more precisely and efficiently using WES data. The proposed method also filters out outlier read-counts and identifies significant change points to reduce false positives. We used real and simulated data to evaluate the performance of the proposed method and to compare it with other commonly used CNV detection methods. We show that the proposed method outperforms the existing CNV detection methods in terms of accuracy and false discovery rate and has a faster runtime than the circular binary segmentation method.
Volume 2019, pages 423-428. Published 2019-09-01. DOI: https://doi.org/10.1145/3307339.3342181
Citations: 1
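The abstract above names the total variation approach but does not state the formulation. As a minimal sketch, assuming the standard one-dimensional total variation denoising objective (a squared-error fidelity term plus a penalty on jumps between neighbouring values), the code below only evaluates that objective and reads change points off a piecewise-constant fit. It does not minimize the objective and is not the authors' algorithm; the function names, the toy signal, and the weight `lam` are hypothetical.

```python
from typing import List, Sequence


def tv_objective(signal: Sequence[float], fit: Sequence[float], lam: float) -> float:
    """0.5 * sum (y_i - x_i)^2  +  lam * sum |x_{i+1} - x_i|."""
    fidelity = 0.5 * sum((y - x) ** 2 for y, x in zip(signal, fit))
    penalty = lam * sum(abs(fit[i + 1] - fit[i]) for i in range(len(fit) - 1))
    return fidelity + penalty


def piecewise_mean(signal: Sequence[float], breaks: Sequence[int]) -> List[float]:
    """Replace each segment between consecutive breakpoints with its mean."""
    edges = [0] + list(breaks) + [len(signal)]
    fit: List[float] = []
    for lo, hi in zip(edges, edges[1:]):
        mean = sum(signal[lo:hi]) / (hi - lo)
        fit.extend([mean] * (hi - lo))
    return fit


def change_points(fit: Sequence[float], tol: float = 1e-9) -> List[int]:
    """Indices where the fitted value jumps, i.e. candidate segment starts."""
    return [i + 1 for i in range(len(fit) - 1) if abs(fit[i + 1] - fit[i]) > tol]


# Toy read-depth log-ratios with one gained segment in the middle.
signal = [0.1, -0.1, 0.05, 1.0, 0.9, 1.1, 0.0, 0.1]
segmented = piecewise_mean(signal, breaks=[3, 6])

noisy_cost = tv_objective(signal, signal, lam=1.0)
segmented_cost = tv_objective(signal, segmented, lam=1.0)
```

Here the piecewise-constant fit scores lower than the raw signal because it gives up a little fidelity in exchange for far fewer jumps. That trade-off is what lets a total variation penalty favour a small number of clean segments over noisy read counts.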