{"title":"Accelerating protein-protein complex validation by GPU based funnel generation","authors":"Michael Zabejansky, H. Wolfson","doi":"10.1109/BIBM.2016.7822510","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822510","url":null,"abstract":"A major challenge in protein-protein docking is the distinction between near-native and decoy complex predictions. It has been shown that near-native solutions are usually located at the bottom of deep and densely populated funnels in the binding energy plot of the complex. Thus, exploring whether the energy plot in the vicinity of a docking solution is “funnel like” can serve as a validation of such a solution. Generation of such densely sampled plots, however, is a major computational challenge. We have designed an accurate and highly efficient parallel algorithm for generation of such energy plots and implemented it on a server with 4 GPU processors, each with 2880 cores. The algorithm achieved a speedup of about 150 compared to its serial counterpart, while even outperforming it in the achieved results. While the algorithm proved very useful for near-native complex hypothesis validation, it still detects many funnels for decoy solutions, especially those with good shape complementarity.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"529 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132317809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on early risk predictive model and discriminative feature selection of cancer based on real-world routine physical examination data","authors":"Guixia Kang, Zhuang Ni","doi":"10.1109/BIBM.2016.7822746","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822746","url":null,"abstract":"Most cancers at early stages show no obvious symptoms, and curative treatment is no longer an option by the time cancer is diagnosed. Therefore, making accurate predictions of early cancer risk has become urgently necessary in the field of medicine. In this paper, our purpose is to fully utilize real-world routine physical examination data to analyze the most discriminative features of cancer based on the ReliefF algorithm and to generate an early risk predictive model of cancer using three machine learning (ML) algorithms. We use physical examination data, with a return visit 1 month later, derived from CiMing Health Checkup Center. The ReliefF algorithm selects the top 30 features, written as Sub(30), based on weight value from our data collection consisting of 34 features and 2300 candidates. A 4-layer (2 hidden layers) deep neural network (DNN) based on the B-P algorithm, a support vector machine (SVM) with a linear kernel, and the CART decision tree are proposed for predicting the risk of cancer with 5-fold cross validation. We use criteria such as predictive accuracy, AUC-ROC, sensitivity and specificity to assess the discriminative ability of the three proposed methods for cancer. The results show that, compared with the other two methods, SVM obtains higher AUC and specificity of 0.926 and 95.27%, respectively. The superior predictive accuracy (86%) is achieved by DNN. Moreover, a fuzzy interval of threshold in DNN is proposed, and the sensitivity, specificity and accuracy of DNN are 90.20%, 94.22% and 93.22%, respectively, using the revised threshold interval. The research indicates that the application of ML methods together with risk feature selection based on real-world routine physical examination data is meaningful and promising in the area of cancer prediction.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133780512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GIDAC: A prototype for bioimages annotation and clinical data integration","authors":"P. Vizza, P. Guzzi, P. Veltri, G. Cascini, R. Curia, Loredana Sisca","doi":"10.1109/BIBM.2016.7822663","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822663","url":null,"abstract":"The analysis of bioimages and their correlated clinical patient information makes it possible to investigate specific diseases and define the corresponding medical protocols. To perform a correct diagnosis and apply a precise therapy, bioimages must be collected and studied together with other relevant data such as laboratory results, medical annotations and patient history. Today, the management of these data is performed by single systems inside hospital departments that often do not provide dedicated data integration platforms for exchanging relevant clinical information among different departments or different health structures. Also, images cannot be annotated or enriched by physicians to trace temporal studies for patients or even among patients with similar diseases. In this contribution, we report the results of a research project called GIDAC (standing for Gestione Integrata DAti Clinici) that aims to define a general purpose framework for bioimage management and annotation as well as clinical data viewing and integration in a simple-to-use information system. The proposed framework does not substitute any existing clinical information system but is able to gather and integrate data by using an XML-based module. The novelty also consists in allowing annotations on DICOM images by means of a simple user interface to keep track of changes within images as well as comparisons among patients. This system supports oncologists in the management of DICOM images from different devices (e.g., ecograph or PACS) to extract relevant information necessary to query (annotate) images and study similar clinical cases.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132232367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of CD-HIT for constructing non-redundant databases","authors":"Qingyu Chen, Yu Wan, Yang Lei, J. Zobel, Karin M. Verspoor","doi":"10.1109/BIBM.2016.7822604","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822604","url":null,"abstract":"CD-HIT is one of the most popular tools for reducing sequence redundancy, and is considered to be the state-of-the-art method. It tries to minimise redundancy by reducing an input database into several representative sequences, under a user-defined threshold of sequence identity. We present a comprehensive assessment of the redundancy in the outputs of CD-HIT, exploring the impact of different identity thresholds and new evaluation data on the redundancy. We demonstrate that the relationship between threshold and redundancy is surprisingly weak. Applications of CD-HIT that set low identity threshold values may also suffer from substantial degradation in both efficiency and accuracy.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"65 Suppl 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133375890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel identified temporal protein complexes strategy inspired by density-distance and brainstorming process","authors":"Xianjun Shen, Jin Zhou, Xingpeng Jiang, Xiaohua Hu, Tingting He, Jincai Yang, Dan Xie","doi":"10.1109/BIBM.2016.7822701","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822701","url":null,"abstract":"Detection of protein complexes and functional modules plays a crucial role in strengthening the comprehension of cellular organization and biological functions on the dynamic protein-protein interaction network. In this article, we put forward a new strategy to identify temporal protein complexes. By integrating time-course gene expression data into static protein interaction data, a series of time-sequenced subnetworks were constructed. Then we combined the network topology and gene ontology information to define the distance between proteins in the PPI network. A novel method to find the cluster centers and then form initial clusters was based on the idea that cluster centers are usually recognized as nodes with higher densities than their neighbors and with a relatively larger distance from other cluster centers. Finally, inspired by the brainstorming discussion process, two ways are introduced to update the initial clusters for achieving the optimal results. After the filtering and merging procedure, experimental results demonstrated that the proposed strategy performed well compared with four other advanced algorithms - MCODE, FAG-EC, HC-PIN, and CNC.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132928378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-norm constrained optimization methods for calling copy number variants in single cell sequencing data","authors":"Changsheng Zhang, Hongmin Cai, Jingying Huang, Bo Xu","doi":"10.1109/BIBM.2016.7822511","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822511","url":null,"abstract":"The revolutionary invention of single-cell sequencing technology carves out a new way to delineate intra-tumor heterogeneity and the evolution of single cells at the molecular level. Since single-cell sequencing requires a special genome amplification step to accumulate enough samples, a large number of biases are introduced, making the calling of copy number variants rather challenging. Accurately modeling this process and effectively detecting copy number variations (CNVs) are major roadblocks for single-cell sequencing data analysis. Recent advances have shown that the underlying copy numbers are corrupted by noise, which can be approximated by a negative binomial distribution. In this paper, we formulated a general mathematical model for copy number reconstruction from the read depth signal, and presented two specific variants, namely Poisson-CNV and NB-CNV, to cater for various read distributions. An efficient numerical solution based on the classical alternating direction minimization method was designed to solve the proposed models. Extensive experiments on both synthetic datasets and empirical single-cell sequencing datasets were conducted to compare the performance of the two models. The results show that the proposed NB-CNV model achieved superior performance in calling CNVs for single-cell sequencing data.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116640622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unconstrained optimization in projection method for indefinite SVMs","authors":"Hao Jiang, W. Ching, Yushan Qiu, Xiaoqing Cheng","doi":"10.1109/BIBM.2016.7822585","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822585","url":null,"abstract":"Positive semi-definiteness is a critical property in Support Vector Machine (SVM) methods to ensure efficient solutions through convex quadratic programming. In this paper, we introduce a projection matrix on indefinite kernels to formulate a positive semi-definite one. The proposed model can be regarded as a generalized version of the spectrum method (denoising method and flipping method) by varying parameter λ. In particular, our suggested optimal λ under the Bregman matrix divergence theory can be obtained using unconstrained optimization. Experimental results on 4 real world data sets ranging from glycan classification to cancer prediction show that the proposed model can achieve better or competitive performance when compared to the related indefinite kernel methods. This may suggest a new way in motif extractions or cancer predictions.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116833785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A representational analysis of a temporal indeterminancy display in clinical events","authors":"M. Madkour, Hsing-yi Song, Jingcheng Du, Cui Tao","doi":"10.1109/BIBM.2016.7822673","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822673","url":null,"abstract":"This paper describes a proposition for representing temporal indeterminacy in events from clinical narratives using fuzzy sets membership functions. This approach leverages both temporal and semantic information of events and has been proved by representational analysis evaluation method. We demonstrate that membership functions' graphs can be used for representing temporal approximation and granularity of events. We also show that this approach is helpful for the construction of fine timeline of clinical events, and can be used for calculating accurate metrics for ordering events.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116688591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual orchestration and autonomous execution of distributed and heterogeneous computational biology pipelines","authors":"Xin Mou, H. Jamil, R. Rinker","doi":"10.1109/BIBM.2016.7822615","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822615","url":null,"abstract":"Data integration continues to baffle researchers even though substantial progress has been made. Although the emergence of technologies such as XML, web services, semantic web and cloud computing has helped, a system in which biologists are comfortable articulating new applications and developing them without technical assistance from a computing expert is yet to be realized. The distance between a friendly graphical interface that does little and a “traditional” system that is clunky yet powerful is deemed too great more often than not. The question that remains unanswered is whether a user can state her query involving a set of complex, heterogeneous and distributed life sciences resources in an easy-to-use language and execute it without further help from a computer-savvy programmer. In this paper, we present a declarative meta-language, called VisFlow, for requirement specification, and a translator for mapping requirements into executable queries in a variant of SQL augmented with integration artifacts.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116451952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wide line detection with water flow","authors":"Yangyang Hu, Wenqiang Zhang, Hong Lu, Fufeng Li, Weifei Zhang","doi":"10.1109/BIBM.2016.7822715","DOIUrl":"https://doi.org/10.1109/BIBM.2016.7822715","url":null,"abstract":"Line detection plays a vital role in visual analysis tasks like Traditional Chinese Medicine (TCM) image analytics. However, most of the current methods ignore line thickness and perform poorly for the lines with different widths. This paper proposes a novel line detection method by using the water flow method. Unlike most edge-based and region-based line detectors, the water flow method is applied to obtaining the whole line response map by simply imitating the movement of water in the image smoothed by guided filter, which is viewed as a geomorphological map. In addition, this paper also proposes an adaptive parameter selection method so that the line detection can be more robust and accurate. Experimental results demonstrate the effectiveness of the proposed method on tongue crack images in comparison to the existing line extraction methods.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115087121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}