Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics: latest articles

Low-dimensional genotype embeddings for predictive models
Syed Fahad Sultan, Xingzhi Guo, S. Skiena
{"title":"Low-dimensional genotype embeddings for predictive models","authors":"Syed Fahad Sultan, Xingzhi Guo, S. Skiena","doi":"10.1145/3535508.3545507","DOIUrl":"https://doi.org/10.1145/3535508.3545507","url":null,"abstract":"We develop methods for constructing low-dimensional vector representations (embeddings) of large-scale genotyping data, capable of reducing genotypes of hundreds of thousands of SNPs to 100-dimensional embeddings that retain substantial predictive power for inferring medical phenotypes. We demonstrate that embedding-based models yield an average F-score of 0.605 on a test of ten phenotypes (including BMI prediction, genetic relatedness, and depression) versus 0.339 for baseline models. Genotype embeddings also hold promise for sharing data while preserving subject anonymity: we show that they retain substantial predictive power even after anonymization by adding Gaussian noise to each dimension.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131376455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
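The anonymization step mentioned in the abstract above (adding Gaussian noise to each embedding dimension) can be illustrated with a minimal sketch. The function name `add_gaussian_noise`, the noise scale `sigma`, and the randomly generated 100-dimensional vector are all assumptions made for illustration. The abstract does not say how the embeddings are built or which noise level the authors use.

```python
import random


def add_gaussian_noise(embedding, sigma, rng=None):
    """Return a copy of `embedding` with independent N(0, sigma^2) noise
    added to every dimension (hypothetical helper, not the authors' code)."""
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, sigma) for x in embedding]


# Hypothetical 100-dimensional embedding for one subject; the values are
# random placeholders, not real genotype data.
rng = random.Random(0)
embedding = [rng.uniform(-1.0, 1.0) for _ in range(100)]

# sigma=0.1 is an arbitrary illustrative noise level.
noisy = add_gaussian_noise(embedding, sigma=0.1, rng=rng)
```

A larger `sigma` hides more of the individual signal and costs more predictive power; the abstract reports only that substantial predictive power survives the added noise.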
B-cell epitope prediction for antipeptide paratopes with the HAPTIC2/HEPTAD user toolkit (HUT)
S. Caoili
{"title":"B-cell epitope prediction for antipeptide paratopes with the HAPTIC2/HEPTAD user toolkit (HUT)","authors":"S. Caoili","doi":"10.1145/3535508.3545101","DOIUrl":"https://doi.org/10.1145/3535508.3545101","url":null,"abstract":"B-cell epitope prediction for antipeptide paratopes is key to developing novel vaccines and immunodiagnostics. This entails estimating free-energy changes for paratope binding to variable-length disordered peptidic sequences as has been previously described for the Heuristic Affinity Prediction Tool for Immune Complexes (HAPTIC), which resolves said binding into processes of epitope compaction, collapse and contact by analogy to protein folding. However, HAPTIC analyzes antigen sequence data without excluding potentially problematic candidate epitopes (e.g., comprising inaccessible and/or conformationally rigid residues) while also neglecting the temperature dependence of polyproline II (PPII) helix propensity (for compaction), occurrence of epitope-backbone hydrogen bonding and impact of disulfide bond formation between epitope cysteine residues. The present work thus provides a more physically realistic revision of HAPTIC (HAPTIC2), the HAPTIC2-like Epitope Prediction Tool for Antigen with Disulfide (HEPTAD) and the HAPTIC2/HEPTAD Input Preprocessor (HIP), forming the HAPTIC2/HEPTAD User Toolkit (HUT). HIP facilitates tagging of residues (e.g., in hydrophobic blobs, ordered regions and glycosylation motifs) for exclusion from downstream analyses by HAPTIC2 and HEPTAD. HAPTIC2 enables temperature-dependent PPII helix propensity calculations while also regarding glycine and proline as polar residues that form hydrogen bonds with paratopes. HEPTAD analyzes antigen sequences that each contain two cysteine residues for which the impact of disulfide pairing is estimated as a correction to the free-energy penalty of compaction. All components of HUT (i.e., HIP, HAPTIC2 and HEPTAD) are freely accessible online (http://badong.freeshell.org/hut.htm).","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"2013 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128010648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Graphs & networks
Utkrisht Rajkumar
{"title":"Session details: Graphs & networks","authors":"Utkrisht Rajkumar","doi":"10.1145/3552477","DOIUrl":"https://doi.org/10.1145/3552477","url":null,"abstract":"","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114728505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
EvoVGM
Amine M. Remita, Abdoulaye Baniré Diallo
{"title":"EvoVGM","authors":"Amine M. Remita, Abdoulaye Baniré Diallo","doi":"10.1145/3535508.3545563","DOIUrl":"https://doi.org/10.1145/3535508.3545563","url":null,"abstract":"Most evolutionary-oriented deep generative models do not explicitly consider the underlying evolutionary dynamics of biological sequences as it is performed within the Bayesian phylogenetic inference framework. In this study, we propose a method for a deep variational Bayesian generative model (EvoVGM) that jointly approximates the true posterior of local evolutionary parameters and generates sequence alignments. Moreover, it is instantiated and tuned for continuous-time Markov chain substitution models such as JC69, K80 and GTR. We train the model via a low-variance stochastic estimator and a gradient ascent algorithm. Here, we analyze the consistency and effectiveness of EvoVGM on synthetic sequence alignments simulated with several evolutionary scenarios and different sizes. Finally, we highlight the robustness of a fine-tuned EvoVGM model using a sequence alignment of gene S of coronaviruses.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123513416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
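The EvoVGM abstract above names JC69 as one of the continuous-time Markov chain substitution models it is instantiated for. As background, the sketch below computes the standard JC69 transition probabilities for a branch of length `d`, measured in expected substitutions per site. It is not taken from the EvoVGM code; the function names `jc69_transition_prob` and `jc69_matrix` are made up for illustration.

```python
import math

NUCLEOTIDES = "ACGT"


def jc69_transition_prob(same, d):
    """Standard JC69 probability that a site shows the same nucleotide
    (same=True) or one specific different nucleotide (same=False) after a
    branch of length d (expected substitutions per site)."""
    decay = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def jc69_matrix(d):
    """Full 4x4 transition matrix as a nested dict keyed by nucleotide."""
    return {
        a: {b: jc69_transition_prob(a == b, d) for b in NUCLEOTIDES}
        for a in NUCLEOTIDES
    }


# Each row is a probability distribution over the four nucleotides.
P = jc69_matrix(0.1)
row_sums = {a: sum(P[a].values()) for a in NUCLEOTIDES}
```

At `d = 0` the matrix is the identity, and as `d` grows every entry approaches 0.25, the JC69 stationary distribution. K80 and GTR add further rate parameters and are not covered by this sketch.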
Incorporating antigen processing into CD4+ T cell epitope prediction with integer linear programming
Avik Bhattacharya, Molly C. Lyons, S. Landry, Ramgopal R. Mettu
{"title":"Incorporating antigen processing into CD4+ T cell epitope prediction with integer linear programming","authors":"Avik Bhattacharya, Molly C. Lyons, S. Landry, Ramgopal R. Mettu","doi":"10.1145/3535508.3545545","DOIUrl":"https://doi.org/10.1145/3535508.3545545","url":null,"abstract":"CD4+ T-cell receptors recognize peptide-MHCII complexes displayed on the surface of antigen-presenting cells to induce an immune response. A fundamental problem in immunology is to characterize which peptides (i.e., epitopes) in an antigen induce such a response; this is the problem of computational epitope prediction. To be presented in the form of a peptide-MHCII complex, peptides must satisfy two important criteria: they should be processed from an antigen to be available in the pool of peptides to which MHCII can bind, and they should have a sufficiently high binding affinity to MHCII molecules to form stable complexes. This latter phenomenon has been studied widely and used almost exclusively for epitope prediction. In prior work we have developed methods for modeling antigen processing and have shown that it has significant predictive power in predicting epitopes. In this paper, we propose an integer linear programming (ILP) approach to combine the contributions of antigen processing and peptide binding that provides a holistic and flexible framework for epitope prediction. We validate our results on data sets comprising antigens associated with tumors and pathogens and show consistent enrichment and improvement in accuracy over other methods.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125125466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Neighborhood embedding and re-ranking of disease genes with ADAGIO
Mert Erden, Megan Gelement, Sarrah Hakimjee, Kyla Levin, Mary-Joy Sidhom, K. Devkota, L. Cowen
{"title":"Neighborhood embedding and re-ranking of disease genes with ADAGIO","authors":"Mert Erden, Megan Gelement, Sarrah Hakimjee, Kyla Levin, Mary-Joy Sidhom, K. Devkota, L. Cowen","doi":"10.1145/3535508.3545542","DOIUrl":"https://doi.org/10.1145/3535508.3545542","url":null,"abstract":"We present ADAGIO, a new method for network-based disease gene prioritization that balances network interconnection structure with an embedding measure of network similarity. We show ADAGIO performs better than previous methods for recovering known disease genes in a recent benchmark set encompassing disease-associated genes for 22 polygenic diseases. We find ADAGIO discovers some interesting new disease gene candidates in both Alzheimer's and Parkinson's diseases. Code, ranked lists of disease genes, and supplementary figures and tables appear at https://github.com/merterden98/ADAGIO.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124055799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Scalable deep learning for healthcare: methods and applications
Luca Barillaro, Giuseppe Agapito, M. Cannataro
{"title":"Scalable deep learning for healthcare: methods and applications","authors":"Luca Barillaro, Giuseppe Agapito, M. Cannataro","doi":"10.1145/3535508.3545590","DOIUrl":"https://doi.org/10.1145/3535508.3545590","url":null,"abstract":"This paper provides an overview of scalable deep learning platforms and how they are used in a medical context. An introduction highlights the key factors, then an overview of the medical context is provided. Afterwards, the basic concepts of deep learning and of parallel and distributed computing are briefly recalled. Then a specific deep learning library for medical applications is described. The last part of the paper focuses on a real use case of deep learning on medical data. The main contribution of this paper is therefore a short survey of the main scalable deep learning platforms, with a first analysis of their features, and the description of a practical example.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132661766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Offspring GAN augments biased human genomic data
Supratim Das, Xinghua Shi
{"title":"Offspring GAN augments biased human genomic data","authors":"Supratim Das, Xinghua Shi","doi":"10.1145/3535508.3545537","DOIUrl":"https://doi.org/10.1145/3535508.3545537","url":null,"abstract":"Genomic data have been used for trait association and disease risk prediction for a long time. In recent years, many such prediction models have been built using machine learning (ML) algorithms. As of today, human genomic data and other biomedical data suffer from sampling biases in terms of people's ethnicity, as most of the data come from people of European ancestry. Smaller sample sizes for other population groups can cause suboptimal results in ML-based prediction models for those populations. Suboptimal predictions in precision medicine for a particular group can have serious consequences, limiting the model's applicability to real-world problems. As data collection for those populations is time-consuming and costly, we suggest deep learning-based models for in-silico data enhancement. Existing Generative Adversarial Network (GAN) models for genomic data, like the Population scale Genomic conditional-GAN (PG-cGAN), can generate realistic genomic data when trained on fairly unbiased data but fail when trained on biased data, encountering severe mode collapse. Our proposed model, Offspring GAN, can resolve the mode collapse issue even when trained on strongly biased genomic datasets. Our results demonstrate the ability of Offspring GAN to generate realistic and diverse label-aware data, which can augment limited real data to alleviate biases and disparities in genomic data. We also propose a privacy-preserving protocol using Offspring GAN to protect the privacy of genomic data.","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133761232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A novel three-step transcriptomic framework for cancer prediction
Rushank Goyal
{"title":"A novel three-step transcriptomic framework for cancer prediction","authors":"Rushank Goyal","doi":"10.1145/3535508.3545098","DOIUrl":"https://doi.org/10.1145/3535508.3545098","url":null,"abstract":"Cancer is a broad term for diseases characterized by uncontrollable and abnormal cell growth. With 19.3 million new cases and 10 million cancer-related deaths per annum, it is the second-leading cause of death worldwide [4]. As a method of cancer detection, tools known as microarrays, which develop a transcriptome (i.e., a rapid and systematic profile of the expression of a large number of genes at once), are often used to identify cancerous cells [1]. However, prior research has utilized \"black-box\" algorithms, which are not appropriate for use in the life sciences [3]. In this study, a novel three-step framework was developed that combines the principles of biostatistics with transparent machine learning to create mathematical equations that predict cancer diagnoses using gene expression levels. First, an XGBoost model is trained on the training set, and the features with nonzero feature importances are carried over to the next step, where only genes that show a statistically significant difference (α=0.05) between expression patterns in cancerous and non-cancerous samples are retained. Finally, a novel symbolic regression-based algorithm called the QLattice (short for 'Quantum Lattice') is trained on the remaining features for 10 epochs using the Akaike Information Criterion as its loss function [2]. To evaluate its performance, the framework was trained and tested on three datasets containing transcriptome profiles from cancerous and non-cancerous tissue for three different cancer types: acute myeloid leukemia (AML), non-small cell lung cancer (NSCLC), and clear cell renal cell carcinoma (ccRCC). Table 1 (Performance and Identified Biomarkers by Cancer) shows the accuracies attained for each type as well as the biomarkers used in the mathematical expression (which together serve as a predictive gene signature), where an asterisk indicates that the gene has not been associated with that cancer type in previous literature. It should be noted that only three or four genes' expression levels are used in each case, while prior work has tended to use hundreds [1].","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122989865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
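The abstract above says the final step uses the Akaike Information Criterion (AIC) as its loss function. As a rough illustration of the criterion itself, the sketch below computes AIC = 2k - 2 ln L for a binary classifier from its predicted probabilities. The helper names, the toy labels and probabilities, and the parameter counts are all hypothetical. How the QLattice counts parameters or evaluates likelihood internally is not described in the abstract and is not modeled here.

```python
import math


def bernoulli_log_likelihood(y_true, p_pred, eps=1e-12):
    """Log-likelihood of binary labels under predicted probabilities.
    Probabilities are clipped to [eps, 1 - eps] to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Toy data (made up): 1 = cancerous, 0 = non-cancerous.
labels = [1, 0, 1, 1, 0]
simple_model = [0.8, 0.3, 0.7, 0.6, 0.2]    # hypothetical model with 3 parameters
complex_model = [0.9, 0.2, 0.8, 0.7, 0.1]   # hypothetical model with 12 parameters

aic_simple = aic(bernoulli_log_likelihood(labels, simple_model), n_params=3)
aic_complex = aic(bernoulli_log_likelihood(labels, complex_model), n_params=12)
```

In this toy comparison the more complex model fits the labels slightly better, but its extra parameters cost more than the gain in likelihood, so the simpler model has the lower AIC. That trade-off is consistent with the abstract's emphasis on expressions using only three or four genes.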
Session details: Ontologies & databases
Vinay Paj
{"title":"Session details: Ontologies & databases","authors":"Vinay Paj","doi":"10.1145/3552481","DOIUrl":"https://doi.org/10.1145/3552481","url":null,"abstract":"","PeriodicalId":354504,"journal":{"name":"Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127862250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0