Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics: Latest Articles, Page 5

Melanoma Risk Prediction with Structured Electronic Health Records

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233561

Aaron N. Richter, T. Khoshgoftaar

Abstract: Melanoma is one of the fastest growing cancers in the world, and can affect patients earlier in life than most other cancers. Therefore, it is imperative to identify patients at high risk for melanoma and enroll them in screening programs to detect the cancer early. In this study, we explore data from dermatology outpatients to build a risk model for the disease. Using millions of patient records with thousands of data points in each record, we show that we can build a melanoma risk model from real-world Electronic Health Record (EHR) data without any expert knowledge or manually engineered features. While other risk models for melanoma have been developed, this is the first to use routinely collected EHR data rather than expert features targeted specifically for melanoma. The random forest model achieves similar or better performance than these previous models (AUC 0.79, sensitivity 0.71, specificity 0.72), which allows larger populations of patients to be screened for melanoma risk without specialized and time-consuming data collection. Important features from the model can be extracted and studied, and features influencing a specific prediction can be explained to providers and patients. The process for building this model can be further refined to improve performance, and can also be used for risk prediction of other diseases.
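The three metrics quoted in the abstract (AUC, sensitivity, specificity) can be illustrated with a minimal Python sketch. The labels, scores, and the 0.5 threshold below are invented toy values, not data from the paper, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
roc_auc = auc(y_true, scores)
print(sens, spec, roc_auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why risk models are often reported with all three.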

Citations: 6

MIA: Multi-cohort Integrated Analysis for Biomarker Identification

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233605

Brian Marks, Nina Hees, Hung Nguyen, Tin Nguyen

Abstract: Advanced high-throughput technologies have produced vast amounts of biological data. Data integration is the key to obtaining the power needed to pinpoint the biological mechanisms and biomarkers of the underlying disease. Two critical drawbacks of computational approaches for data integration are that they do not account for study bias or for the noisy nature of molecular data. This leads to unreliable and inconsistent results, i.e., the results change drastically when the input is slightly perturbed or when additional datasets are added to the analysis. Here we propose a multi-cohort integrated approach, named MIA, for biomarker identification that is robust to noise and study bias. We deploy a leave-one-out strategy to avoid the disproportionate influence of a single cohort. We also utilize techniques from both p-value-based and effect-size-based meta-analyses to ensure that the identified genes are significantly impacted. We compare MIA against classical approaches (Fisher's, Stouffer's, maxP, minP, and the additive method) using 7 microarray and 4 RNA-Seq datasets. For each approach, we construct a disease signature using 3 datasets and then classify patients from the 8 remaining datasets. MIA outperforms all existing approaches, achieving the highest sensitivity and specificity in distinguishing symptomatic patients from healthy controls.
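Two of the classical baselines named in the abstract, Fisher's and Stouffer's methods, combine per-cohort p-values for a single gene. The Python sketch below shows them together with a generic leave-one-out loop. It is not the MIA algorithm, whose details the abstract does not give; the function names and p-values are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_combine(pvals):
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k degrees of freedom."""
    k = len(pvals)
    half = -sum(math.log(p) for p in pvals)  # X / 2
    # Closed-form chi-square survival function for even degrees of freedom (2k).
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def stouffer_combine(pvals):
    """Stouffer's method: average the z-scores, then map back to a p-value.

    Each p-value must lie strictly between 0 and 1.
    """
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in pvals) / math.sqrt(len(pvals))
    return 1.0 - nd.cdf(z)


def leave_one_out(pvals, combine):
    """Combined p-value with each cohort omitted in turn (needs >= 2 cohorts)."""
    return [combine(pvals[:i] + pvals[i + 1:]) for i in range(len(pvals))]


# Hypothetical per-cohort p-values for one gene; the first cohort dominates.
cohort_pvals = [0.01, 0.5, 0.5]
print(fisher_combine(cohort_pvals))
print(stouffer_combine(cohort_pvals))
print(leave_one_out(cohort_pvals, fisher_combine))
```

In the toy input, dropping the first cohort changes the combined p-value sharply, which is the kind of single-cohort influence a leave-one-out strategy is meant to expose.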

Citations: 0

Biomarker Discovery via Optimal Bayesian Feature Filtering for Structured Multiclass Data

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233558

Ali Foroughi pour, Lori A. Dalton

Abstract: Biomarker discovery aims to find a shortlist of high-profile biomarkers that can be further verified and utilized in downstream analysis. Many biomarkers exhibit structured multiclass behavior, where groups of interest may be clustered into a small number of patterns such that groups assigned the same pattern share a common governing distribution. While several algorithms have been proposed for multiclass problems, to the best of our knowledge, none can take such constraints on the group-pattern assignment, or structure, as input, and output high-profile potential biomarkers as well as the structure they satisfy. While post analyses may be used to infer the structure, ignoring such information prevents feature selection from taking full advantage of experimental data. Recent work proposes a Bayesian framework for feature selection that places priors on the feature-label distribution and the label-conditioned feature distribution. Here we extend this framework to structured multiclass problems, solve the proposed model for the case of independent features, evaluate it in several synthetic simulations, apply it to two cancer datasets, and perform enrichment analysis. Many of the highly ranked genes and pathways are suggested to be affected in the cancer under study. We also find potentially new biomarkers. Not only do we detect biomarkers, but we also make inferences about the underlying distributional connections across classes, which provide additional insight into cancer biology.

Citations: 3

A Machine Learning Approach for Uncovering N6-methyladenosine-Disease Association

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233691

Song-Seon Zhang, Shaowu Zhang, Xiaonan Fan, Jia Meng, Yidong Chen, Yufei Huang

Abstract: N6-methyladenosine (m6A) is the most abundant methylation, existing in >25% of human mRNAs. Exciting recent discoveries indicate close involvement of m6A in regulating many different aspects of mRNA metabolism and diseases like cancer. However, our current knowledge about how m6A levels are regulated, and whether and how regulation of m6A levels of specific genes can play a role in cancer and other diseases, remains largely elusive. We propose in this paper a computational scheme for predicting m6A-regulated genes and m6A-associated diseases, which includes Deep-m6A, the first deep learning model for detecting condition-specific m6A sites from MeRIP-Seq data at single-base resolution, and a new network-based pipeline that prioritizes functionally significant m6A genes and their associated diseases using the Protein-Protein Interaction (PPI) and gene-disease heterogeneous networks. We applied Deep-m6A and this pipeline to 75 MeRIP-Seq human samples, which produced a compact set of 499 functionally significant m6A-regulated genes and 6 functionally enriched subnetworks. The functional enrichment analysis of these genes and networks reveals that m6A targets key genes of many important biological processes, including transcription, cell organization and transport, and cell proliferation, as well as cancer-related pathways such as the Wnt pathway, Ras signaling, and the PI3K-Akt signaling pathway. The m6A-associated disease analysis prioritized 8 significantly associated diseases, including leukemia and Alzheimer's disease. These results demonstrate the power of our proposed computational scheme and provide new leads for understanding m6A regulatory functions and its roles in diseases.

Citations: 0

Cost-Sensitive Deep Active Learning for Epileptic Seizure Detection

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233566

Xuhui Chen, Jinlong Ji, Tianxi Ji, Pan Li

Abstract: The analysis of electroencephalogram (EEG) signals plays a crucial role in epileptic seizure detection. Researchers have proposed many machine learning and deep learning based automatic epileptic seizure detection methods. However, these schemes, especially the deep learning based ones, suffer from the need to label huge amounts of training data. Moreover, in epileptic seizure detection, physicians pay more attention to abnormal signals than to normal signals, and thus the misclassification costs for the two should differ. To address these issues, we propose a cost-sensitive deep active learning scheme to detect epileptic seizures. In particular, we develop a new generic double-deep neural network (double-DNN) to obtain the cost-sensitive utility for the sample selection strategy in the labeling process. We further employ three types of fundamental neural networks, i.e., one-dimensional convolutional neural networks (1D CNNs), recurrent neural networks with long short-term memory (LSTM) units, and recurrent neural networks with gated recurrent units (GRU), in the double-DNN and evaluate their performance. Experimental results show that the proposed scheme can reduce the number of labeled samples by up to 33% and 80% compared with uncertainty sampling and random sampling, respectively.
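To make the comparison concrete, the Python sketch below contrasts the uncertainty-sampling baseline named in the abstract with a simple cost-weighted selection rule. The cost-weighted utility is an assumption for illustration (expected cost of the cheaper decision under asymmetric misclassification costs). It is not the paper's double-DNN utility, which the abstract does not specify, and all names and numbers are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli prediction with seizure probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def top_k(scores, k):
    """Indices of the k largest scores, highest first."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]


def uncertainty_sampling(probs, k):
    """Baseline: request labels for the k most uncertain samples."""
    return top_k([binary_entropy(p) for p in probs], k)


def cost_sensitive_sampling(probs, k, cost_fn, cost_fp):
    """Assumed variant: rank by the expected cost of the cheaper decision.

    Predicting 'normal' risks a missed seizure (p * cost_fn);
    predicting 'seizure' risks a false alarm ((1 - p) * cost_fp).
    """
    utilities = [min(p * cost_fn, (1.0 - p) * cost_fp) for p in probs]
    return top_k(utilities, k)


# Hypothetical predicted seizure probabilities for five unlabeled EEG segments.
probs = [0.95, 0.5, 0.2, 0.6, 0.05]
print(uncertainty_sampling(probs, 2))
print(cost_sensitive_sampling(probs, 2, cost_fn=5.0, cost_fp=1.0))
```

When a missed seizure is assumed to cost five times a false alarm, the selection shifts toward segments with lower predicted seizure probability, because even a modest chance of a missed seizure carries a high expected cost.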

Citations: 30

Sequence, Structure and Network Methods to Uncover Cancer Genes

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233609

Mona Singh

Abstract: A major aim of cancer genomics is to pinpoint which somatically mutated genes are involved in tumor initiation and progression. This is a difficult task, as numerous somatic mutations are typically observed in each cancer genome, only a subset of which are cancer-relevant, and very few genes are found to be somatically mutated across large numbers of individuals. In this talk, I will overview three methods my group has introduced for identifying cancer genes. First, I will present a framework for uncovering cancer genes, differential mutation analysis, that compares the mutational profiles of genes across cancer genomes with their natural germline variation across healthy individuals. Next, I will show how to leverage per-individual mutational profiles within the context of protein-protein interaction networks in order to identify small connected subnetworks of genes that, while not individually frequently mutated, comprise pathways that are altered across (i.e., "cover") a large fraction of individuals. Finally, I will demonstrate that cancer genes can be discovered by identifying genes whose interaction interfaces are enriched in somatic mutations. Overall, these methods recapitulate known cancer driver genes, and discover novel, and sometimes rarely mutated, genes with likely roles in cancer.

Citations: 0

Services4SNPs: A RESTful Platform for Association Rule Mining and Survival Analysis of Genotyping Data

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233626

Giuseppe Agapito, M. Cannataro

Citations: 3

Determination of Immunophenotypic Changes by CyTOF, Epigenetics and Component Resolved Diagnostics During Successful Desensitization in Multi-food Oral Immunotherapy

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233644

Sandra Andorf, M. Manohar, R. Chinthrajah, Sheena Gupta, H. Maecker, S. Galli, K. Nadeau

Abstract: Participants (n=44, age 4-15 yrs) with food allergy to multiple foods, proven by double-blind, placebo-controlled food challenge, were administered omalizumab (anti-IgE, n=40) or placebo (n=4) for 16 weeks with oral immunotherapy (OIT) for 2-5 foods, starting 8 weeks after the beginning of omalizumab or placebo (clinical outcomes of this trial are reported in Andorf et al., 2018). To better understand the immunophenotypical changes leading to successful desensitization, we interrogated changes in immune cell subtypes in PBMCs before and after successful OIT using mass cytometry (CyTOF) on unstimulated as well as PMA/Ionomycin stimulated samples. The first step in this analysis was an unsupervised clustering across the markers within the CyTOF panel used for cell type identification (lineage markers) of a pooled dataset of all cells of the samples of the two time points. This was done through FlowSOM (Van Gassen et al., 2015), using self-organizing maps followed by hierarchical consensus meta-clustering. The immune cell subtype of each cluster was determined based on the expression level of the lineage markers of the cells within that cluster. The median level of each functional marker within each cluster was determined individually for each sample. Subsequently, we tested whether the median level of each functional marker in each cell type (cluster) was significantly different between baseline and post-OIT. Further mechanistic experiments included epigenetics (pyrosequencing of bisulfite-treated genomic DNA purified from participants' PBMCs) and component resolved diagnostics (ThermoFisher). Our preliminary results indicated a significant decrease (FDR-adjusted P < 0.01) of CD28 and GPR15 levels in effector memory CD4+ T cells after successful OIT compared to baseline. A significant increase (FDR-adjusted P < 0.01) in IL-10 was detected in the Treg and gamma-delta T cell populations. Epigenetic data demonstrated hypermethylation of the -48 CpG site in the IL-4 promoter region post-OIT (FDR-adjusted P < 0.01). The IgG4/IgE ratio of antibodies to most of the whole foods in each participant's OIT and to the corresponding storage proteins showed a significant increase (FDR-adjusted P < 0.01) between baseline and post-OIT. Our data thus imply that T cell anergy induced through OIT might contribute to successful desensitization.
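The abstract reports FDR-adjusted P values without naming the adjustment procedure. The Python sketch below assumes the widely used Benjamini-Hochberg step-up method, which may or may not be what the authors applied. The function name and the raw p-values are hypothetical.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, e.g. one test per functional marker and cluster.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

A marker and cluster pair would then be called significant at the abstract's threshold if its adjusted value falls below 0.01.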

Citations: 0

State Diagrams for Automating Disease "Risk Pyramid" Data Collection and Tailored Clinical Decision Support

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233660

D. Willett, A. Pandey, N. Ifejika, V. Kannan, J. Berry, M. Basit

Citations: 0

HarMinMax: Harmonizing Codon Usage to Replicate Local Host Translation

Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics Pub Date : 2018-08-15 DOI: 10.1145/3233547.3233637

Gabriel Wright, A. Rodríguez, P. Clark, S. Emrich

Citations: 1