ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. ACM Conference on Bioinformatics, Computational Biology and Biomedicine: Latest Articles
Deukhyun Cha, Alexander Rand, Qin Zhang, Rezaul A Chowdhury, Jesmin Jahan Tithi, Chandrajit Bajaj
{"title":"Accelerated Molecular Mechanical and Solvation Energetics on Multicore CPUs and Manycore GPUs.","authors":"Deukhyun Cha, Alexander Rand, Qin Zhang, Rezaul A Chowdhury, Jesmin Jahan Tithi, Chandrajit Bajaj","doi":"10.1145/2808719.2808742","DOIUrl":"10.1145/2808719.2808742","url":null,"abstract":"<p><strong>Motivation: </strong>Despite several reported acceleration successes of programmable GPUs (Graphics Processing Units) for molecular modeling and simulation tools, the general focus has been on fast computation with small molecules. This was primarily due to the limited memory size on the GPU. Moreover simultaneous use of CPU and GPU cores for a single kernel execution - a necessity for achieving high parallelism - has also not been fully considered.</p><p><strong>Results: </strong>We present fast computation methods for molecular mechanical (Lennard-Jones and Coulombic) and generalized Born solvation energetics which run on commodity multicore CPUs and manycore GPUs. The key idea is to trade off accuracy of pairwise, long-range atomistic energetics for higher speed of execution. A simple yet efficient CUDA kernel for GPU acceleration is presented which ensures high arithmetic intensity and memory efficiency. Our CUDA kernel uses a cache-friendly, recursive and linear-space octree data structure to handle very large molecular structures with up to several million atoms. Based on this CUDA kernel, we present a hybrid method which simultaneously exploits both CPU and GPU cores to provide the best performance based on selected parameters of the approximation scheme. Our CUDA kernels achieve more than two orders of magnitude speedup over serial computation for many of the molecular energetics terms. 
The hybrid method is shown to be able to achieve the best performance for all values of the approximation parameter.</p><p><strong>Availability: </strong>The source code and binaries are freely available as <i>PMEOPA</i> (Parallel Molecular Energetic using Octree Pairwise Approximation) and downloadable from http://cvcweb.ices.utexas.edu/software.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2015-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7347088/pdf/nihms-1587186.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38137193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
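The molecular mechanical terms named in the abstract above (Lennard-Jones and Coulombic) can be illustrated with a direct all-pairs sum. This is only a minimal reference sketch, not the octree-based approximation or the CUDA kernel the paper describes, and it omits the generalized Born solvation term. The `Atom` layout, the function name `pairwise_energy`, the single shared `epsilon`/`sigma`, and the unit-free `coulomb_k` are all assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

# Hypothetical atom layout: (x, y, z, partial charge).
Atom = Tuple[float, float, float, float]


def pairwise_energy(
    atoms: Sequence[Atom],
    epsilon: float = 1.0,    # assumed: one LJ well depth shared by all pairs
    sigma: float = 1.0,      # assumed: one LJ radius shared by all pairs
    coulomb_k: float = 1.0,  # assumed: Coulomb constant in arbitrary units
) -> Tuple[float, float]:
    """Return (Lennard-Jones, Coulomb) energies summed over all unique pairs.

    Exact O(n^2) sum with no cutoff or approximation. Atoms must not
    coincide, otherwise the distance is zero and the division fails.
    """
    e_lj = 0.0
    e_coul = 0.0
    for i in range(len(atoms)):
        xi, yi, zi, qi = atoms[i]
        for j in range(i + 1, len(atoms)):
            xj, yj, zj, qj = atoms[j]
            r = math.dist((xi, yi, zi), (xj, yj, zj))
            sr6 = (sigma / r) ** 6
            # 12-6 Lennard-Jones: 4*eps*[(sigma/r)^12 - (sigma/r)^6]
            e_lj += 4.0 * epsilon * (sr6 * sr6 - sr6)
            # Coulomb: k * q_i * q_j / r
            e_coul += coulomb_k * qi * qj / r
    return e_lj, e_coul
```

A sum like this is the kind of serial baseline that an approximation scheme would be measured against: its cost grows quadratically with the number of atoms, which is what motivates trading accuracy on long-range pairs for speed.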
{"title":"Improving Personalized Clinical Risk Prediction Based on Causality-Based Association Rules.","authors":"Chih-Wen Cheng, May D Wang","doi":"10.1145/2808719.2808759","DOIUrl":"https://doi.org/10.1145/2808719.2808759","url":null,"abstract":"<p><p>Developing clinical risk prediction models is one of the main tasks of healthcare data mining. Advanced data collection techniques in the current Big Data era have created an emerging and urgent need for scalable, computer-based data mining methods. These methods can turn data into useful, personalized decision support knowledge in a flexible, cost-effective, and productive way. In our previous study, we developed a tool, called icuARM-II, that can generate personalized clinical risk prediction evidence using a temporal rule mining framework. However, the generation of final risk prediction possibility with icuARM-II still relied on human interpretation, which was subjective and, most of the time, biased. In this study, we propose a new mechanism to improve icuARM-II's rule selection by including the concept of causal analysis. The generated risk prediction is quantitatively assessed using calibration statistics. To evaluate the performance of the new rule selection mechanism, we conducted a case study to predict short-term intensive care unit mortality based on personalized lab testing abnormalities. Our results demonstrated a better-calibrated ICU risk prediction using the new causality-based rule selection solution compared with conventional confidence-only rule selection methods.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. 
ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2015-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/2808719.2808759","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34313366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Developing Robust Predictive Models for Head and Neck Cancer across Microarray and RNA-seq Data.","authors":"Chanchala D Kaddi, Wallace H Coulter, May D Wang","doi":"10.1145/2808719.2808760","DOIUrl":"10.1145/2808719.2808760","url":null,"abstract":"<p><p>Increased understanding of the transcriptomic patterns underlying head and neck squamous cell carcinoma (HNSCC) can facilitate earlier diagnosis and better treatment outcomes. Integrating knowledge from multiple studies is necessary to identify fundamental, consistent gene expression signatures that distinguish HNSCC patient samples from disease-free samples, particularly for detecting HNSCC at an early pathological stage. This study utilizes feature integration and heterogeneous ensemble modeling techniques to develop robust models for predicting HNSCC disease status in both microarray and RNA-seq datasets. Several alternative models demonstrated good performance, with MCC and AUC values exceeding 0.8. These models were also applied to discriminate between early pathological stage HNSCC and normal RNA-seq samples, showing encouraging results. The predictive modeling workflow was integrated into a software tool with a graphical user interface. This tool enables HNSCC researchers to harness frequently observed transcriptomic features and ensembles of previously developed models when investigating new HNSCC gene expression datasets.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. 
ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2015-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5859557/pdf/nihms806059.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35939492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cheng Yang, Po-Yen Wu, Li Tong, John H Phan, May D Wang
{"title":"The impact of RNA-seq aligners on gene expression estimation.","authors":"Cheng Yang, Po-Yen Wu, Li Tong, John H Phan, May D Wang","doi":"10.1145/2808719.2808767","DOIUrl":"https://doi.org/10.1145/2808719.2808767","url":null,"abstract":"<p><p>While numerous RNA-seq data analysis pipelines are available, research has shown that the choice of pipeline influences the results of differentially expressed gene detection and gene expression estimation. Gene expression estimation is a key step in RNA-seq data analysis, since the accuracy of gene expression estimates profoundly affects the subsequent analysis. Generally, gene expression estimation involves sequence alignment and quantification, and accurate gene expression estimation requires accurate alignment. However, the impact of aligners on gene expression estimation remains unclear. We address this need by constructing nine pipelines consisting of nine spliced aligners and one quantifier. We then use simulated data to investigate the impact of aligners on gene expression estimation. To evaluate alignment, we introduce three alignment performance metrics, (1) the percentage of reads aligned, (2) the percentage of reads aligned with zero mismatch (ZeroMismatchPercentage), and (3) the percentage of reads aligned with at most one mismatch (ZeroOneMismatchPercentage). We then evaluate the impact of alignment performance on gene expression estimation using three metrics, (1) gene detection accuracy, (2) the number of genes falsely quantified (FalseExpNum), and (3) the number of genes with falsely estimated fold changes (FalseFcNum). We found that among various pipelines, FalseExpNum and FalseFcNum are correlated. Moreover, FalseExpNum is linearly correlated with the percentage of reads aligned and ZeroMismatchPercentage, and FalseFcNum is linearly correlated with ZeroMismatchPercentage. 
Because of this correlation, the percentage of reads aligned and ZeroMismatchPercentage may be used to assess the performance of gene expression estimation for all RNA-seq datasets.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2015-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/2808719.2808767","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34711999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
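The three alignment performance metrics listed in the abstract above can be sketched as a small function. This is a hedged illustration only: the abstract does not state the denominator, so the sketch assumes every percentage is taken over all input reads, and the input format (one mismatch count per read, `None` for an unaligned read), the function name `alignment_metrics`, and the key `AlignedPercentage` are hypothetical. Only `ZeroMismatchPercentage` and `ZeroOneMismatchPercentage` are names taken from the abstract.

```python
from typing import Dict, Optional, Sequence


def alignment_metrics(mismatches: Sequence[Optional[int]]) -> Dict[str, float]:
    """Compute three alignment metrics from per-read mismatch counts.

    mismatches[i] is the mismatch count of read i, or None if the read
    was not aligned. Percentages use the total read count as the
    denominator (an assumption; the abstract does not specify it).
    """
    total = len(mismatches)
    if total == 0:
        raise ValueError("at least one read is required")

    aligned = [m for m in mismatches if m is not None]

    def pct(count: int) -> float:
        return 100.0 * count / total

    return {
        # (1) percentage of reads aligned
        "AlignedPercentage": pct(len(aligned)),
        # (2) percentage of reads aligned with zero mismatches
        "ZeroMismatchPercentage": pct(sum(1 for m in aligned if m == 0)),
        # (3) percentage of reads aligned with at most one mismatch
        "ZeroOneMismatchPercentage": pct(sum(1 for m in aligned if m <= 1)),
    }
```

For example, `alignment_metrics([0, 0, 1, 2, None])` describes five reads of which four aligned, two with no mismatches and three with at most one.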
{"title":"Chromatin and Genomic determinants of alternative splicing.","authors":"Kun Wang, Kan Cao, Sridhar Hannenhalli","doi":"10.1145/2808719.2808755","DOIUrl":"https://doi.org/10.1145/2808719.2808755","url":null,"abstract":"<p><p>Alternative splicing significantly contributes to proteomic diversity, and mis-regulation of splicing can cause disease in humans. Although both genomic and chromatin features have been shown to associate with splicing, the mechanisms by which various chromatin marks influence splicing are not clear for the most part. Moreover, it is not known whether the influence of specific genomic features on splicing is potentially modulated by the chromatin context. Here we report a deep neural network (DNN) model for predicting exon inclusion based on comprehensive genomic and chromatin features. Our analysis in three cell lines shows that, while both genomic and chromatin features can predict splicing to varying degrees, genomic features are the primary drivers of splicing, and the predictive power of chromatin features can largely be explained by their correlation with genomic features; chromatin features do not yield substantial independent contribution to splicing predictability. However, our model identified specific interactions between chromatin and genomic features suggesting that the effect of genomic elements may be modulated by chromatin context.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. 
ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2015-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/2808719.2808755","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35427235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"icuARM-II: improving the reliability of personalized risk prediction in pediatric intensive care units.","authors":"Chih-Wen Cheng, Nikhil Chanani, Kevin Maher, Wang","doi":"10.1145/2649387.2649440","DOIUrl":"10.1145/2649387.2649440","url":null,"abstract":"<p><p>Clinicians in intensive care units (ICUs) rely on standardized scores as risk prediction models to predict a patient's vulnerability to life-threatening events. Conventional scales calculate scores from a fixed set of conditions collected within a specific time window. However, modern monitoring technologies generate complex, temporal, and multimodal patient data that conventional prediction models cannot fully utilize. Thus, a more sophisticated model is needed to tailor to individual characteristics and incorporate multiple temporal modalities for a personalized risk prediction. Furthermore, most models focus on adult patients. To address this deficiency, we propose a newly designed ICU risk prediction system, called icuARM-II, using a large-scale pediatric ICU database from Children's Healthcare of Atlanta. This novel database contains clinical data collected in 5,739 ICU visits from 4,975 patients. We propose a temporal association rule mining framework giving clinicians the potential to predict risks based on all available patient conditions without being restricted by a fixed observation window. We also develop a new metric that can rigidly assess the reliability of all generated association rules. In addition, icuARM-II features an interactive user interface. Using icuARM-II, our results showed a use case of short-term mortality prediction using lab testing results, which demonstrated a potential new solution for reliable ICU risk prediction using personalized clinical data in a previously neglected population.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... 
ACM Conference on Bioinformatics, Computational Biology and Biomedicine. ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4983419/pdf/nihms805837.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34313365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"omniClassifier: a Desktop Grid Computing System for Big Data Prediction Modeling.","authors":"John H Phan, Sonal Kothari, May D Wang","doi":"10.1145/2649387.2649439","DOIUrl":"10.1145/2649387.2649439","url":null,"abstract":"<p><p>Robust prediction models are important for numerous science, engineering, and biomedical applications. However, best-practice procedures for optimizing prediction models can be computationally complex, especially when choosing models from among hundreds or thousands of parameter choices. Computational complexity has further increased with the growth of data in these fields, concurrent with the era of \"Big Data\". Grid computing is a potential solution to the computational challenges of Big Data. Desktop grid computing, which uses idle CPU cycles of commodity desktop machines, coupled with commercial cloud computing resources can enable research labs to gain easier and more cost effective access to vast computing resources. We have developed omniClassifier, a multi-purpose prediction modeling application that provides researchers with a tool for conducting machine learning research within the guidelines of recommended best-practices. omniClassifier is implemented as a desktop grid computing system using the Berkeley Open Infrastructure for Network Computing (BOINC) middleware. In addition to describing implementation details, we use various gene expression datasets to demonstrate the potential scalability of omniClassifier for efficient and robust Big Data prediction modeling. A prototype of omniClassifier can be accessed at http://omniclassifier.bme.gatech.edu/.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. 
ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4983434/pdf/nihms805844.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9852973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mahbubur Rahman, Rummana Bari, Amin Ahsan Ali, Moushumi Sharmin, Andrew Raij, Karen Hovsepian, Syed Monowar Hossain, Emre Ertin, Ashley Kennedy, David H Epstein, Kenzie L Preston, Michelle Jobes, J Gayle Beck, Satish Kedia, Kenneth D Ward, Mustafa al'Absi, Santosh Kumar
{"title":"Are We There Yet? Feasibility of Continuous Stress Assessment via Wireless Physiological Sensors.","authors":"Mahbubur Rahman, Rummana Bari, Amin Ahsan Ali, Moushumi Sharmin, Andrew Raij, Karen Hovsepian, Syed Monowar Hossain, Emre Ertin, Ashley Kennedy, David H Epstein, Kenzie L Preston, Michelle Jobes, J Gayle Beck, Satish Kedia, Kenneth D Ward, Mustafa al'Absi, Santosh Kumar","doi":"10.1145/2649387.2649433","DOIUrl":"10.1145/2649387.2649433","url":null,"abstract":"<p><p>Stress can lead to headaches and fatigue, precipitate addictive behaviors (e.g., smoking, alcohol and drug use), and lead to cardiovascular diseases and cancer. Continuous assessment of stress from sensors can be used for timely delivery of a variety of interventions to reduce or avoid stress. We investigate the feasibility of continuous stress measurement via two field studies using wireless physiological sensors - a four-week study with illicit drug users (<i>n</i> = 40), and a one-week study with daily smokers and social drinkers (<i>n</i> = 30). We find that 11+ hours/day of usable data can be obtained in a 4-week study. A significant learning effect is observed after the first week, and data yield is seen to be increasing over time even in the fourth week. We propose a framework to analyze sensor data yield and find that losses in the wireless channel are negligible; the main hurdle in further improving data yield is the attachment constraint. We show the feasibility of measuring stress minutes preceding events of interest and observe the sensor-derived stress to be rising prior to self-reported stress and smoking events.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. 
ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4374173/pdf/nihms-671146.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33047557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SimConcept: A Hybrid Approach for Simplifying Composite Named Entities in Biomedicine.","authors":"Chih-Hsuan Wei, Robert Leaman, Zhiyong Lu","doi":"10.1145/2649387.2649420","DOIUrl":"10.1145/2649387.2649420","url":null,"abstract":"<p><p>Many text-mining studies have focused on the issue of named entity recognition and normalization, especially in the field of biomedical natural language processing. However, entity recognition is a complicated and difficult task in biomedical text. One particular challenge is to identify and resolve composite named entities, where a single span refers to more than one concept (e.g., BRCA1/2). Most bioconcept recognition and normalization studies have either ignored this issue, used simple ad-hoc rules, or only handled coordination ellipsis, which is only one of the many types of composite mentions studied in this work. No systematic methods for simplifying composite mentions have been previously reported, making a robust approach greatly needed. To this end, we propose a hybrid approach by integrating a machine learning model with a pattern identification strategy to identify the antecedent and conjunct regions of a concept mention, and then reassemble the composite mention using those identified regions. Our method, which we have named SimConcept, is the first method to systematically handle most types of composite mentions. Our method achieves high performance in identifying and resolving composite mentions for three fundamental biological entities: genes (89.29% in F-measure), diseases (85.52% in F-measure) and chemicals (84.04% in F-measure). Furthermore, our results show that using SimConcept can subsequently help improve the performance of gene and disease concept recognition and normalization.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. 
ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4384177/pdf/nihms673019.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33193039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated miRNA and mRNA Analysis of Time Series Microarray Data.","authors":"Julian Dymacek, Nancy Lan Guo","doi":"10.1145/2649387.2649411","DOIUrl":"https://doi.org/10.1145/2649387.2649411","url":null,"abstract":"<p><p>The dynamic temporal regulatory effects of microRNA are not well known. We introduce a technique for integrating miRNA and mRNA time series microarray data with known disease pathology. The integrated analysis includes identifying both mRNA and miRNA that are significantly similar to the quantitative pathology. Potential regulatory miRNA/mRNA target pairs are identified through databases of both predicted and validated pairs. Finally, potential target pairs are filtered by examining the second derivatives of the fold changes over time. Our system was used on genome-wide microarray expression data of mouse lungs (<i>n</i> = 160) following aspiration of multi-walled carbon nanotubes. This system shows promise of readily identifying miRNA for further study as potential biomarker use.</p>","PeriodicalId":72044,"journal":{"name":"ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. ACM Conference on Bioinformatics, Computational Biology and Biomedicine","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/2649387.2649411","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33315379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}