Shang Gao, W. Zou, Yuanyuan Liu, Xingwang Wang, Y. Zhuang, X. Wei, R. Alhajj
{"title":"Integrating multiple sources of genomic data by multiplex network reconstruction","authors":"Shang Gao, W. Zou, Yuanyuan Liu, Xingwang Wang, Y. Zhuang, X. Wei, R. Alhajj","doi":"10.1109/BIBM.2015.7359926","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359926","url":null,"abstract":"In recent years, rapidly accumulating genomic data have posed a challenge to integrate multiple data sources and to analyze the integrated networks globally. In this paper we present a method to reverse engineer integrative gene networks. The main advantage of our method is the integration of different quantitative and qualitative data sets in order to reconstruct a multiplex network, without necessarily imposing data constraints, such as each genomic datum needs to have the same number of entities. The computation boils down to solving small quadratic programs based on local neighborhood of nodes. We applied the method to DREAM5 dataset, and compared the results with the community networks from the challenge. We further demonstrated our method through a case study using breast cancer data, integrating metastasis gene expression data with interactome data. Overall, our method can be applied in many settings of network system biology.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132826296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bolin Chen, Xuequn Shang, Min Li, Jianxin Wang, Fang-Xiang Wu
{"title":"A two-step logistic regression algorithm for identifying individual-cancer-related genes","authors":"Bolin Chen, Xuequn Shang, Min Li, Jianxin Wang, Fang-Xiang Wu","doi":"10.1109/BIBM.2015.7359680","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359680","url":null,"abstract":"The identification of cancer-related genes is important towards the understanding of complex genetic diseases. Although many machine learning algorithms are proposed to identify disease-related genes, they often either have poor performance to identify locus heterogeneity cancer-related genes or are not applicable to predict individual-disease-related genes due to the lack of positive instances (imbalanced classification). To overcome these two issues, a two-step logistic regression (LR) based algorithm is proposed in this study for identifying individual-cancer-related genes. A set of high potential cancer-class-related genes is first generated in step 1, followed by a second round of LR-based algorithm conducted on this smaller dataset for identifying individual-cancer-related genes. Numerical experiments show that the proposed two-step LR-based algorithm not only works well for locus heterogeneity data, but also has good performance to handle the imbalanced classification problem. The individual-cancer-related gene identification experiments achieve AUC values of around 0.85 when the threshold of posterior probability is chosen between 0.3 and 0.6. 
All evaluations are conducted by using the leave-one-out cross validation method.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134055657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finding Frequent Approximate Subgraphs in medical image database","authors":"Linlin Gao, Haiwei Pan, Qilong Han, Xiaoqin Xie, Zhiqiang Zhang, Xiao Zhai, Pengyuan Li","doi":"10.1109/BIBM.2015.7359821","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359821","url":null,"abstract":"Medical images are one of the most important tools in doctors' diagnostic decision-making. It has been a research hotspot in medical big data that how to effectively represent medical images and find essential patterns hidden in them to assist doctors to achieve a better diagnosis. Several graph models have been developed to represent medical images. However, the unique structures of domain-specific images are not considered well to lose some essential information. Thus, aiming at brain CT images, we first construct a graph about the Topological Relations between Ventricles and Lesions (TRVL) and present the graph modeling process. Then we propose a method named Frequent Approximate Subgraph Mining based on Graph Edit Distance (FASMGED). This method uses an error-tolerant graph matching strategy that is accordant with ubiquitous noise in practice. Experimental results show that the graph modeling process is computationally scalable and FASMGED can find more significant patterns than current algorithms.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134163210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ahoi Jones, Hamid D. Ismail, J. H. Kim, R. Newman, B. K.C.Dukka
{"title":"RF-Phos: Random forest-based prediction of phosphorylation sites","authors":"Ahoi Jones, Hamid D. Ismail, J. H. Kim, R. Newman, B. K.C.Dukka","doi":"10.1109/BIBM.2015.7359670","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359670","url":null,"abstract":"It is estimated that about 30% of the proteins in the human proteome are regulated by phosphorylation. In recent years, phosphorylation site prediction has been investigated in the field of bioinformatics. This has become necessary due to the challenges associated with experimental methods. Previously, we developed a random forest-based method, termed Random Forest-based Phosphosite predictor (RF-Phos 1.0), to predict phosphorylation sites in proteins given only the amino acid sequence of a protein as input. Here, we report an improved version of this method, termed RF-Phos 1.1 that employs additional sequence-driven features to identify putative sites of phosphorylation across many protein families. In side-by-side comparisons based on 10-fold cross validation analysis and an independent dataset, RF-Phos 1.1 performs comparably to or better than other existing phosphosite prediction methods, such as PhosphoSVM, GPS2.1 and Musite.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"15 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133052942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hayda Almeida, Marie-Jean Meurs, Leila Kosseim, A. Tsang
{"title":"Supporting HIV literature screening with data sampling and supervised learning","authors":"Hayda Almeida, Marie-Jean Meurs, Leila Kosseim, A. Tsang","doi":"10.1109/BIBM.2015.7359733","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359733","url":null,"abstract":"This paper presents a supervised learning approach to support the screening of HIV literature. The manual screening of biomedical literature is an important task in the process of systematic reviews. Researchers and curators have the very demanding, time-consuming and error-prone task of manually identifying documents that must be included in a systematic review concerning a specific problem. We implemented a supervised learning approach to support screening tasks, by automatically flagging potentially selected documents in a list retrieved by a literature database search. To overcome the main issues associated with the automatic literature screening task, we evaluated the use of data sampling, feature combinations, and feature selection methods, generating a total of 105 classification models. The models yielding best results were composed by the Logistic Model Trees classifier, a fairly balanced training set, and feature combination of Bag-Of-Words and MeSH terms. 
According to our results, the system correctly labels the great majority of relevant documents, and it could be used to support HIV systematic reviews to allow researchers to assess a greater number of documents in less time.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133425509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haidong Lan, Weiguo Liu, B. Schmidt, Bingqiang Wang
{"title":"Accelerating large-scale biological database search on Xeon Phi-based neo-heterogeneous architectures","authors":"Haidong Lan, Weiguo Liu, B. Schmidt, Bingqiang Wang","doi":"10.1109/BIBM.2015.7359735","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359735","url":null,"abstract":"In this paper we present new parallelization techniques for searching large-scale biological sequence databases with the Smith-Waterman algorithm on Xeon Phi-based neo-heterogeneous architectures. In order to make full use of the compute power of both the multi-core CPU and the many-core Xeon Phi hardware, we use a collaborative computing scheme as well as hybrid parallelism. At the CPU side, we employ SSE intrinsics and multi-threading to implement SIMD parallelism. At the Xeon Phi side, we use Knights Corner vector instructions to gain more data parallelism. We have presented two dynamic task distribution schemes (thread level and device level) in order to achieve better load balancing. Furthermore, a multi-threaded asynchronous scheme is used to overlap communication and computation between CPUs and Xeon Phis. Evaluations on real protein sequence databases show that our method achieves a peak overall performance up to 220 GCUPS on a neo-heterogeneous platform consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of database size and query length. 
Our implementation is available at: http://turbo0628.github.io/LSBDS/.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122450048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. Hu, Zhenyu Liu, Lihua Yan, Tian-Zhong Wang, Fei Liu, Xiaoyu Li, Huanyu Kang
{"title":"Feature selection and classification of speech under long-term stress","authors":"B. Hu, Zhenyu Liu, Lihua Yan, Tian-Zhong Wang, Fei Liu, Xiaoyu Li, Huanyu Kang","doi":"10.1109/BIBM.2015.7359804","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359804","url":null,"abstract":"Many studies were proposed to discuss acoustic correlates of stress in recent years. Considering some inconsistent experiment results, we supposed that stress should be categorized into long-term and short-term stress in this topic, and the trend of short-term stress induced by workload may be affected by long-term stress. This study contains three parts: first, we proposed an acoustic feature set chosen by feature selection, which can be considered as a measurement of the level of long-term stress; second, we showed that this set is immune to short-term stress in stress classification tests; finally, we observed some particular voice features mentioned in previous researches in our experiment and the results may imply that short-term stress trend is in connection with the level of long-term stress. In short, long-term and short-term stress should be discussed separately in future researches for clear and explicit conclusions.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127766074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Usability testing of a novel automated external defibrillator user interface: A pilot study","authors":"R. Bond, P. O'Hare, R. D. Maio","doi":"10.1109/BIBM.2015.7359895","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359895","url":null,"abstract":"High quality CPR in conjunction with early defibrillation can enhance survival outcomes following cardiac arrest. This usability study aimed to evaluate a novel prototype user interface of a public access defibrillator for the improvement of CPR during a simulated resuscitation attempt. Test candidates were asked to use the device in the absence of audible instructions, relying solely on the device membrane, to perform CPR chest compressions. The rate of the rescuer chest compressions was then assessed and evaluated according to current resuscitation guidelines. All participants improved their rate of CPR within-test. All but one achieved the correct compression rate within 20 seconds. This study suggests that the device interface could potentially enhance real-time CPR quality during resuscitation attempts.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127813788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Boosting compound-protein interaction prediction by deep learning","authors":"Kai Tian, Mingyu Shao, Shuigeng Zhou, J. Guan","doi":"10.1109/BIBM.2015.7359651","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359651","url":null,"abstract":"The identification of interactions between compounds and proteins plays an important role in network pharmacology and drug discovery. However, experimentally identifying compound-protein interactions (CPIs) is generally expensive and time-consuming, computational approaches are thus introduced. Among these, machine-learning based methods have achieved a considerable success. However, due to the nonlinear and imbalanced nature of biological data, many machine learning approaches have their own limitations. Recently, deep learning techniques show advantages over many state-of-the-art machine learning methods in many applications. In this study, we aim at improving the performance of CPI prediction based on deep learning, and propose a method called DL-CPI (the abbreviation of Deep Learning for Compound-Protein Interactions prediction), which employs deep neural network (DNN) to effectively learn the representations of compound-protein pairs. 
Extensive experiments show that DL-CPI can learn useful features of compound-protein pairs by a layerwise abstraction, and thus achieves better prediction performance than existing methods on both balanced and imbalanced datasets.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117278296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bo Xu, Hongfei Lin, Yuan Lin, Yunlong Ma, Liang Yang, Jian Wang, Zhihao Yang
{"title":"Learning to rank for biomedical information retrieval","authors":"Bo Xu, Hongfei Lin, Yuan Lin, Yunlong Ma, Liang Yang, Jian Wang, Zhihao Yang","doi":"10.1109/BIBM.2015.7359729","DOIUrl":"https://doi.org/10.1109/BIBM.2015.7359729","url":null,"abstract":"Research articles in biomedicine domain have increased exponentially, which makes it more and more difficult for biologists to manually capture all the information they need. Information retrieval technologies can help to obtain the users' needed information automatically. However, it is a great challenge to apply these technologies to biomedicine domain directly because of some domain specific characteristics, such as the abundance of terminologies. To enhance the effectiveness of the biomedical information retrieval, we propose a novel framework based on the state-of-the-art information retrieval methods, called learning to rank, which has been proved effective to rank documents based on their relevance degree. In the framework, we attempt to tackle the problem of the abundance of terminologies by constructing ranking models, which focus on not only retrieving the most relevant documents but also diversifying the searching results to increase the completeness of the resulting list for a given query. In the model training, we propose two novel document labeling strategies, and combine several traditional retrieval models as learning features. Besides, we also investigate the usefulness of different learning to rank approaches in our framework. 
Experimental results on TREC Genomics datasets demonstrate our proposed framework is effective in improving the performance of biomedical information retrieval.","PeriodicalId":186217,"journal":{"name":"2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115268238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}