Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics: Latest Articles

Melanoma Risk Prediction with Structured Electronic Health Records
Aaron N. Richter, T. Khoshgoftaar
{"title":"Melanoma Risk Prediction with Structured Electronic Health Records","authors":"Aaron N. Richter, T. Khoshgoftaar","doi":"10.1145/3233547.3233561","DOIUrl":"https://doi.org/10.1145/3233547.3233561","url":null,"abstract":"Melanoma is one of the fastest growing cancers in the world, and can affect patients earlier in life than most other cancers. Therefore, it is imperative to be able to identify patients at high risk for melanoma and enroll them in screening programs to detect the cancer early. In this study, we explore data from dermatology outpatients to build a risk model for the disease. Using millions of patient records with thousands of data points in each record, we show that we can build a melanoma risk model from real-world Electronic Health Record (EHR) data without any expert knowledge or manually engineered features. While other risk models for melanoma have been developed, this is the first to use routinely collected EHR data rather than expert features targeted specifically for melanoma. The random forest model achieves similar or better performance than these previous models (AUC 0.79, sensitivity 0.71, specificity 0.72), which allows larger populations of patients to get screened for melanoma risk without having to perform specialized and time-consuming data collection. Important features from the model can be extracted and studied, and features influencing a specific prediction can be explained to providers and patients. 
The process for building this model can be further refined to improve performance, as well as used for risk prediction of other diseases.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127868241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
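The abstract above reports AUC, sensitivity, and specificity for its risk model. As a reading aid, here is a minimal sketch of how those three metrics are computed from labels and scores. The toy labels, scores, and the 0.5 threshold are illustrative assumptions, not data or settings from the paper.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels where 1 = case, 0 = control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, chosen only to exercise the functions).
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why abstracts often report all three.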
MIA: Multi-cohort Integrated Analysis for Biomarker Identification
Brian Marks, Nina Hees, Hung Nguyen, Tin Nguyen
{"title":"MIA: Multi-cohort Integrated Analysis for Biomarker Identification","authors":"Brian Marks, Nina Hees, Hung Nguyen, Tin Nguyen","doi":"10.1145/3233547.3233605","DOIUrl":"https://doi.org/10.1145/3233547.3233605","url":null,"abstract":"Advanced high-throughput technologies have produced vast amounts of biological data. Data integration is the key to obtaining the power needed to pinpoint the biological mechanisms and biomarkers of the underlying disease. Two critical drawbacks of computational approaches for data integration are that they account for neither study bias nor the noisy nature of molecular data. This leads to unreliable and inconsistent results, i.e., the results change drastically when the input is slightly perturbed or when additional datasets are added to the analysis. Here we propose a multi-cohort integrated approach, named MIA, for biomarker identification that is robust to noise and study bias. We deploy a leave-one-out strategy to avoid the disproportionate influence of a single cohort. We also utilize techniques from both p-value-based and effect-size-based meta-analyses to ensure that the identified genes are significantly impacted. We compare MIA with classical approaches (Fisher's, Stouffer's, maxP, minP, and the additive method) using 7 microarray and 4 RNASeq datasets. For each approach, we construct a disease signature using 3 datasets and then classify patients from 8 remaining datasets. 
MIA outperforms all existing approaches in terms of both the highest sensitivity and specificity by accurately distinguishing symptomatic patients from healthy controls.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127347212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
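The abstract above names Fisher's and Stouffer's methods among the classical baselines. The sketch below shows how these two textbook p-value combination rules work; it is not the MIA method itself. It assumes independent, one-sided p-values strictly between 0 and 1, and the function names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence


def fisher_combined(pvalues: Sequence[float]) -> float:
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k degrees of freedom."""
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # this is X / 2
    # For even degrees of freedom 2k the chi-square survival function has a closed form:
    # sf(X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


def stouffer_combined(pvalues: Sequence[float]) -> float:
    """Stouffer's method: average the z-scores (unweighted) and convert back to a p-value."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in pvalues) / math.sqrt(len(pvalues))
    return 1.0 - nd.cdf(z)


# Hypothetical per-cohort p-values for one gene.
cohort_p = [0.01, 0.02, 0.03]
p_fisher = fisher_combined(cohort_p)
p_stouffer = stouffer_combined(cohort_p)
```

Both rules pool evidence across cohorts, so several moderately small p-values can yield a much smaller combined value; that same pooling is what makes them sensitive to a single outlying cohort.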
Biomarker Discovery via Optimal Bayesian Feature Filtering for Structured Multiclass Data
Ali Foroughi pour, Lori A. Dalton
{"title":"Biomarker Discovery via Optimal Bayesian Feature Filtering for Structured Multiclass Data","authors":"Ali Foroughi pour, Lori A. Dalton","doi":"10.1145/3233547.3233558","DOIUrl":"https://doi.org/10.1145/3233547.3233558","url":null,"abstract":"Biomarker discovery aims to find a shortlist of high-profile biomarkers that can be further verified and utilized in downstream analysis. Many biomarkers exhibit structured multiclass behavior, where groups of interest may be clustered into a small number of patterns such that groups assigned the same pattern share a common governing distribution. While several algorithms have been proposed for multiclass problems, to the best of our knowledge, none can take such constraints on the group-pattern assignment, or structure, as input, and output high-profile potential biomarkers as well as the structure they satisfy. While post analyses may be used to infer the structure, ignoring such information prevents feature selection from taking full advantage of experimental data. Recent work proposes a Bayesian framework for feature selection that places priors on feature-label distribution and label-conditioned feature distribution. Here we extend this framework for structured multiclass problems, solve the proposed model for the case of independent features, evaluate it in several synthetic simulations, apply it to two cancer datasets, and perform enrichment analysis. Many of the highly ranked genes and pathways are suggested to be affected in the cancer under study. We also find potentially new biomarkers. 
Not only do we detect biomarkers, but also make inferences about the underlying distributional connections across classes, which provide additional insight on cancer biology.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131270049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A Machine Learning Approach for Uncovering N6-methyladenosine-Disease Association
Song-Seon Zhang, Shaowu Zhang, Xiaonan Fan, Jia Meng, Yidong Chen, Yufei Huang
{"title":"A Machine Learning Approach for Uncovering N6-methyladenosine-Disease Association","authors":"Song-Seon Zhang, Shaowu Zhang, Xiaonan Fan, Jia Meng, Yidong Chen, Yufei Huang","doi":"10.1145/3233547.3233691","DOIUrl":"https://doi.org/10.1145/3233547.3233691","url":null,"abstract":"N6-methyladenosine (m6A) is a most abundant methylation, existing in >25% of human mRNAs. Exciting recent discoveries indicate close involvement of m6A in regulating many different aspects of mRNA metabolism and diseases like cancer. However, our current knowledge about how m6A levels are regulated and whether and how regulation of m6A levels of specific genes can play a role in cancer and other diseases is largely elusive. We propose in this paper a computational scheme for predicting m6A-regulated genes and associated diseases, which includes Deep-m6A, the first deep learning model for detecting condition-specific m6A sites from MeRIP-Seq data at single-base resolution, and a new network-based pipeline that prioritizes functionally significant m6A genes and their associated diseases using the Protein-Protein Interaction (PPI) and gene-disease heterogeneous networks. We applied Deep-m6A and this pipeline to 75 MeRIP-seq human samples, which produced a compact set of 499 functionally significant m6A-regulated genes and 6 functionally enriched subnetworks. The functional enrichment analysis of these genes and networks reveals that m6A targets key genes of many important biological processes including transcription, cell organization and transport, and cell proliferation and cancer related pathways such as Wnt pathway, Ras signaling, and PI3K-Akt signaling pathway. The m6A-associated disease analysis prioritized 8 significantly associated diseases including leukemia and Alzheimer's disease. 
These results demonstrate the power of our proposed computational scheme and provide new leads for understanding m6A regulatory functions and its roles in diseases.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"178 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128160397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Cost-Sensitive Deep Active Learning for Epileptic Seizure Detection
Xuhui Chen, Jinlong Ji, Tianxi Ji, Pan Li
{"title":"Cost-Sensitive Deep Active Learning for Epileptic Seizure Detection","authors":"Xuhui Chen, Jinlong Ji, Tianxi Ji, Pan Li","doi":"10.1145/3233547.3233566","DOIUrl":"https://doi.org/10.1145/3233547.3233566","url":null,"abstract":"The analysis of electroencephalogram (EEG) signals plays a crucial role in epileptic seizure detection. Researchers have proposed many machine learning and deep learning based automatic epileptic seizure detection methods. However, these schemes, especially the deep learning based ones, require labeling huge amounts of training data. Moreover, in epileptic seizure detection, physicians pay more attention to abnormal signals than normal signals, and thus the misclassification cost for them should be different. To address these issues, we propose a cost-sensitive deep active learning scheme to detect epileptic seizures. In particular, we develop a new generic double-deep neural network (double-DNN) to obtain the cost-sensitive utility for the sample selection strategy in the labeling process. We further employ three types of fundamental neural networks, i.e., one-dimensional convolutional neural networks (1D CNNs), recurrent neural networks with long short-term memory (LSTM) units, and recurrent neural networks with gated recurrent units (GRU), in the double-DNN and evaluate their performance. 
Experiment results show that the proposed scheme can reduce the amount of labeled samples by up to 33% and 80% compared with uncertainty sampling and random sampling, respectively.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132083952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 30
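The abstract above compares its scheme against uncertainty sampling and random sampling. The sketch below illustrates only those two generic baselines for a binary classifier; it does not reproduce the paper's double-DNN or its cost-sensitive utility, whose details the abstract does not give. The function names and the toy probabilities are assumptions.

```python
import random
from typing import List, Sequence


def uncertainty_sampling(probs: Sequence[float], k: int) -> List[int]:
    """Pick the k unlabeled samples whose predicted seizure probability is closest to 0.5."""
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]


def random_sampling(n: int, k: int, seed: int = 0) -> List[int]:
    """Pick k of n unlabeled samples uniformly at random (seeded for repeatability)."""
    return random.Random(seed).sample(range(n), k)


# Hypothetical model outputs for five unlabeled EEG segments.
probs = [0.95, 0.52, 0.10, 0.45, 0.70]
to_label = uncertainty_sampling(probs, k=2)          # segments the model is least sure about
baseline = random_sampling(len(probs), k=2, seed=0)  # random comparison set
```

In an active learning loop, the selected indices would be sent to an annotator, added to the training set, and the model retrained before the next selection round.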
Sequence, Structure and Network Methods to Uncover Cancer Genes
Mona Singh
{"title":"Sequence, Structure and Network Methods to Uncover Cancer Genes","authors":"Mona Singh","doi":"10.1145/3233547.3233609","DOIUrl":"https://doi.org/10.1145/3233547.3233609","url":null,"abstract":"A major aim of cancer genomics is to pinpoint which somatically mutated genes are involved in tumor initiation and progression. This is a difficult task, as numerous somatic mutations are typically observed in each cancer genome, only a subset of which are cancer-relevant, and very few genes are found to be somatically mutated across large numbers of individuals. In this talk, I will overview three methods my group has introduced for identifying cancer genes. First, I will present a framework for uncovering cancer genes, differential mutation analysis, that compares the mutational profiles of genes across cancer genomes with their natural germline variation across healthy individuals. Next, I will show how to leverage per-individual mutational profiles within the context of protein-protein interaction networks in order to identify small connected subnetworks of genes that, while not individually frequently mutated, comprise pathways that are altered across (i.e., \"cover\") a large fraction of individuals. Finally, I will demonstrate that cancer genes can be discovered by identifying genes whose interaction interfaces are enriched in somatic mutations. 
Overall, these methods recapitulate known cancer driver genes, and discover novel, and sometimes rarely-mutated, genes with likely roles in cancer.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128997232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Services4SNPs: A RESTful Platform for Association Rule Mining and Survival Analysis of Genotyping Data
Giuseppe Agapito, M. Cannataro
{"title":"Services4SNPs: A RESTful Platform for Association Rule Mining and Survival Analysis of Genotyping Data","authors":"Giuseppe Agapito, M. Cannataro","doi":"10.1145/3233547.3233626","DOIUrl":"https://doi.org/10.1145/3233547.3233626","url":null,"abstract":"The analysis of the relations among diseases and genetic aspects of individuals is based on the analysis of data produced by high-throughput experimental technologies, such as Single Nucleotide Polymorphism (SNP) genotyping data. We present a novel data analysis pipeline for SNP data, named Services4SNPs (S4S), that includes two previously developed data analysis tools, DMET-Miner and OSAnalyzer, that have been engineered and modified to be deployed as RESTful web services, named GenotypeAnalytics (GA) and OSAnalytics (OSA) respectively. S4S tries to overcome the limits of desktop bioinformatics software by moving complexity to the server side, allowing users to easily extract multiple associations between SNPs in DMET datasets and correlate the presence or absence of SNPs with the overall survival of the subjects in DMET datasets annotated with clinical information.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122339229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Determination of Immunophenotypic Changes by CyTOF, Epigenetics and Component Resolved Diagnostics During Successful Desensitization in Multi-food Oral Immunotherapy
Sandra Andorf, M. Manohar, R. Chinthrajah, Sheena Gupta, H. Maecker, S. Galli, K. Nadeau
{"title":"Determination of Immunophenotypic Changes by CyTOF, Epigenetics and Component Resolved Diagnostics During Successful Desensitization in Multi-food Oral Immunotherapy","authors":"Sandra Andorf, M. Manohar, R. Chinthrajah, Sheena Gupta, H. Maecker, S. Galli, K. Nadeau","doi":"10.1145/3233547.3233644","DOIUrl":"https://doi.org/10.1145/3233547.3233644","url":null,"abstract":"Participants (n=44, age 4-15 yrs) with double-blind, placebo-controlled food challenge proven food allergy to multiple foods were administered omalizumab (anti-IgE, n=40) or placebo (n=4) for 16 weeks with oral immunotherapy (OIT) for 2-5 foods, starting 8 weeks after the beginning of omalizumab or placebo (clinical outcomes of this trial are reported in [ANDORF2018]). To better understand the immunophenotypical changes leading to successful desensitization, we interrogated changes in immune cell subtypes in PBMCs before and after successful OIT using mass cytometry (CyTOF) on unstimulated as well as PMA/Ionomycin stimulated samples. The first step in this analysis was an unsupervised clustering across the markers within the CyTOF panel used for cell type identification (lineage markers) of a pooled dataset of all cells of the samples of the two time points. This was done through FlowSOM [VanGassen2015], using self-organizing maps followed by hierarchical consensus meta-clustering. The immune cell subtype of each cluster was determined based on the expression level of the lineage markers of the cells within that cluster. The median levels of various functional markers within each cluster were individually determined for each sample. Subsequently we tested whether the median level for each functional marker in each cell type (cluster) was significantly different between baseline and post-OIT. Further mechanistic experiments included epigenetics (pyrosequencing of bisulfite treated genomic DNA purified from participant's PBMCs) and component resolved diagnostics (ThermoFisher). 
Our preliminary results indicated a significant decrease (FDR-adjusted P < 0.01) of CD28 and GPR15 levels in effector memory CD4+ T cells after successful OIT compared to baseline. A significant increase (FDR-adjusted P < 0.01) in IL-10 was detected in the Treg and gamma-delta T cell populations. Epigenetic data demonstrated hypermethylation of the -48 CpG site in the IL-4 promoter region post-OIT (FDR-adjusted P < 0.01). The IgG4/IgE ratio of antibodies to most of the whole foods in the participant's OIT and to the corresponding storage proteins showed a significant increase (FDR-adjusted P < 0.01) between baseline and post-OIT. Our data thus imply that T cell anergy induced through OIT might contribute to successful desensitization.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129143814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
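The abstract above reports FDR-adjusted P values without naming the adjustment procedure. The sketch below shows the Benjamini-Hochberg step-up adjustment, a common choice, purely as an assumed illustration of what such an adjustment does; the raw p-values are made up.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices from smallest to largest p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, one per (marker, cell cluster) comparison.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

A comparison would then be called significant at a chosen FDR level when its adjusted value falls below that level, for example 0.01 as in the abstract.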
State Diagrams for Automating Disease "Risk Pyramid" Data Collection and Tailored Clinical Decision Support
D. Willett, A. Pandey, N. Ifejika, V. Kannan, J. Berry, M. Basit
{"title":"State Diagrams for Automating Disease \"Risk Pyramid\" Data Collection and Tailored Clinical Decision Support","authors":"D. Willett, A. Pandey, N. Ifejika, V. Kannan, J. Berry, M. Basit","doi":"10.1145/3233547.3233660","DOIUrl":"https://doi.org/10.1145/3233547.3233660","url":null,"abstract":"Transitioning to value-based care makes new demands on understanding and managing patient risk for a variety of adverse outcomes in multiple conditions. Optimizing use of finite healthcare resources then proves challenging, and would benefit from a data-driven approach. Modelling the \"risk triangle\" paradigm of disease management as a state diagram within the electronic health record helps bring clinical situational awareness and tailored decision support interventions to individual patients at the point-of-care, while automatically capturing new types of state duration and transition sequence data across the whole population. Such data can iteratively inform improving risk prediction models.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114256198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
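The abstract above describes modelling disease risk as a state diagram that also captures state durations and transition sequences. A minimal sketch of that idea follows. The state names, allowed transitions, and the use of day offsets as timestamps are all hypothetical; the abstract does not specify them, and this is not the authors' EHR implementation.

```python
from typing import Dict, List, Set, Tuple


class RiskStateMachine:
    """Tracks one patient's risk state and logs how long each state lasted."""

    def __init__(self, transitions: Dict[str, Set[str]], initial: str, start_day: float = 0.0):
        self.transitions = transitions          # state -> set of allowed next states
        self.state = initial
        self.entered_at = start_day
        self.sequence: List[str] = [initial]    # transition sequence data
        self.durations: List[Tuple[str, float]] = []  # (state, days spent) data

    def move(self, new_state: str, day: float) -> None:
        """Transition to new_state at the given day, rejecting moves the diagram forbids."""
        if new_state not in self.transitions.get(self.state, set()):
            raise ValueError(f"transition {self.state} -> {new_state} not allowed")
        self.durations.append((self.state, day - self.entered_at))
        self.state = new_state
        self.entered_at = day
        self.sequence.append(new_state)


# Hypothetical three-tier diagram loosely echoing a risk pyramid.
diagram = {
    "low_risk": {"rising_risk"},
    "rising_risk": {"low_risk", "high_risk"},
    "high_risk": {"rising_risk"},
}
patient = RiskStateMachine(diagram, initial="low_risk")
patient.move("rising_risk", day=30)
patient.move("high_risk", day=45)
```

Because every transition is logged with its duration, the same structure that drives point-of-care decision support also yields population-level data for later risk modelling.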
HarMinMax: Harmonizing Codon Usage to Replicate Local Host Translation
Gabriel Wright, A. Rodríguez, P. Clark, S. Emrich
{"title":"HarMinMax: Harmonizing Codon Usage to Replicate Local Host Translation","authors":"Gabriel Wright, A. Rodríguez, P. Clark, S. Emrich","doi":"10.1145/3233547.3233637","DOIUrl":"https://doi.org/10.1145/3233547.3233637","url":null,"abstract":"ACM Reference Format: Gabriel Wright, Anabel Rodriguez, Patricia Clark, and Scott Emrich. 2018. HarMinMax: Harmonizing Codon Usage to Replicate Local Host Translation. In ACM-BCB’18: 9th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, August 29-September 1, 2018, Washington, DC, USA. ACM, New York, NY, USA, 1 page. https://doi.org/10.1145/3233547.3233637","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116047069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1