International Journal of Data Mining and Bioinformatics: Latest Articles, Page 6

Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.069710

Heba Khaled, Hossam El Deen Mostafa Faheem, Rania El Gohary

Abstract: This paper presents a novel hybrid model for the multiple pair-wise sequence alignment problem that combines the Message Passing Interface (MPI) with CUDA, NVIDIA's parallel computing platform and programming model. The model targets homogeneous cluster nodes equipped with similar Graphics Processing Unit (GPU) cards. It consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster's working nodes and then aggregates the results. Each WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation of the Smith-Waterman algorithm that computes the alignment matrices row-wise. The experimental results show a considerable reduction in running time as the number of working GPU nodes increases. The proposed model achieved about 12 giga cell updates per second when tested against the SWISS-PROT protein knowledge base on four nodes.

Volume 12, Issue 3, pp. 313-327.
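The row-wise computation mentioned in the abstract can be illustrated with a small CPU-only sketch. This is not the authors' MPI-CUDA implementation: it assumes a linear gap penalty and illustrative scores (match 2, mismatch -1, gap -1), and the function name `smith_waterman_score` is hypothetical. It returns only the best local alignment score and keeps two rows of the matrix in memory at a time.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score, filling the matrix one row at a time.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the scoring matrix
    best = 0
    for ch_a in a:
        curr = [0] * (len(b) + 1)  # row i; column 0 stays 0
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(0, diag, up, left)  # local alignment never drops below 0
            if curr[j] > best:
                best = curr[j]
        prev = curr  # the finished row becomes the "previous" row
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACT"))
```

Each call fills len(a) × len(b) matrix cells, which is the quantity counted by the cell-updates-per-second figure quoted in the abstract.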

Citations: 6

Two stages weighted sampling strategy for detecting the relation between gene expression and disease.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.069417

Chih-Chung Yang, Wen-Shin Lin, Chien-Pang Lee, Yungho Leu

Citations: 2

Named entity recognition and classification in biomedical text using classifier ensemble.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.067954

Sriparna Saha, Asif Ekbal, Utpal Kumar Sikdar

Abstract: Named Entity Recognition and Classification (NERC) is an important task in information extraction for the biomedical domain. Biomedical named entities include mentions of proteins, genes, DNA, RNA, etc., which generally have complex structures and are difficult to recognise. In this paper, we propose a single-objective-optimisation-based classifier ensemble technique that uses the search capability of a Genetic Algorithm (GA) for NERC in biomedical texts. Here, the GA is used to quantify the amount of voting for each class in each classifier. We use diverse classification methods such as Conditional Random Field and Support Vector Machine to build a number of models that depend on the various representations of the set of features and/or feature templates. The proposed technique is evaluated on two benchmark datasets, namely JNLPBA 2004 and GENETAG. Experiments yield overall F-measure values of 75.97% and 95.90%, respectively. Comparisons with existing systems show that the proposed system achieves state-of-the-art performance.

Volume 11, Issue 4, pp. 365-391.
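As a rough illustration of class-specific weighted voting, the sketch below combines one predicted label per classifier using a separate vote weight for each (classifier, class) pair. The weights are made-up numbers; in the paper they are found by the GA search, which is not shown here. The function name `weighted_vote` and the data layout are assumptions.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Pick the label with the largest total vote weight.

    predictions: one predicted label per classifier.
    weights: weights[m][label] is the (hypothetical) vote weight that
             classifier m carries when it predicts `label`.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    totals = defaultdict(float)
    for m, label in enumerate(predictions):
        totals[label] += weights[m].get(label, 0.0)
    # Iterate labels in sorted order so ties resolve deterministically.
    return max(sorted(totals), key=lambda label: totals[label])


if __name__ == "__main__":
    preds = ["protein", "DNA", "DNA"]
    w = [
        {"protein": 0.9, "DNA": 0.2},
        {"protein": 0.5, "DNA": 0.3},
        {"protein": 0.4, "DNA": 0.4},
    ]
    print(weighted_vote(preds, w))
```

In this toy input a plain majority vote would choose "DNA", while the weighted vote chooses "protein" because the first classifier is trusted more for that class.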

Citations: 10

LMDS-based approach for efficient top-k local ligand-binding site search.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.070066

Sungchul Kim, Lee Sael, Hwanjo Yu

Abstract: In this work, we propose an LMDS-based binding-site search that improves the search speed of the Patch-Surfer method. Although Patch-Surfer is efficient at recognising protein-ligand binding partners, further speedup is necessary to support multiple-user access. This speedup is realised by exploiting Landmark Multi-Dimensional Scaling (LMDS), which computes embedding coordinates for data points based on their distances from landmark points. We adopt two approaches to selecting the landmark points: random and greedy selection. Our method approximately retrieves top-k results, and the accuracy increases as more landmark points are used. Although the two landmark selection approaches show comparable results, greedy selection performs best when the number of landmark points is large. With our method, the search time is reduced by up to 99% and almost 80% of the exact top-k results are retrieved. Additionally, LMDS-based binding-site search+ improves the retrieval accuracy from 80% to 95% while lowering the speedup ratio from 99% to 90% compared with Patch-Surfer.

Volume 12, Issue 4, pp. 417-433.
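The accuracy figures in the abstract are stated as the share of exact top-k results that the approximate search also retrieves. A minimal sketch of that measure follows, assuming each search returns a list of identifiers ranked best-first; the function name `topk_recall` and the identifiers are hypothetical, and the LMDS embedding itself is not shown.

```python
def topk_recall(exact_ranking, approx_ranking, k):
    """Fraction of the exact top-k results found in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ranking[:k])
    if not exact_top:
        raise ValueError("exact ranking is empty")
    approx_top = set(approx_ranking[:k])
    return len(exact_top & approx_top) / len(exact_top)


if __name__ == "__main__":
    exact = ["p1", "p2", "p3", "p4", "p5"]   # made-up identifiers
    approx = ["p2", "p1", "p7", "p4", "p3"]
    print(topk_recall(exact, approx, k=5))
```

The measure ignores the order inside the top-k set, so it only says how many of the exact hits survive the approximation.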

Citations: 0

Gene microarray data analysis using parallel point-symmetry-based clustering.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.067320

Anasua Sarkar, Ujjwal Maulik

Abstract: Identification of co-expressed genes is the central goal in microarray gene expression analysis. Point-symmetry-based clustering is an important unsupervised learning technique for recognising symmetrical convex- or non-convex-shaped clusters. To enable fast clustering of large microarray data, we propose a distributed, time-efficient and scalable approach for the point-symmetry-based K-Means algorithm. A natural basis for analysing gene expression data with a symmetry-based algorithm is to group together genes with similar symmetrical expression patterns. The new parallel implementation also achieves linear speedup in timing without sacrificing the quality of the clustering solution on large microarray data sets. The parallel point-symmetry-based K-Means algorithm is compared with another new parallel symmetry-based K-Means and with existing parallel K-Means over eight artificial and benchmark microarray data sets to demonstrate its superiority in both timing and validity. A statistical analysis is also performed to establish the significance of this message-passing-interface-based point-symmetry K-Means implementation. We also analysed the biological relevance of the clustering solutions.

Volume 11, Issue 3, pp. 277-300.

Citations: 5

Patient-specific early classification of multivariate observations.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.067955

Mohamed F Ghalwash, Dušan Ramljak, Zoran Obradović

Citations: 11

Predicting malignancy from mammography findings and image-guided core biopsies.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.067319

Pedro Ferreira, Nuno A Fonseca, Inês Dutra, Ryan Woods, Elizabeth Burnside

Abstract: The main goal of this work is to produce machine learning models that predict the outcome of a mammography from a reduced set of annotated mammography findings. The study used a dataset of 348 consecutive breast masses that underwent image-guided core biopsy between October 2005 and December 2007 in 328 female subjects. We applied various algorithms with parameter variation to learn from the data. The tasks were to predict mass density and to predict malignancy. The best classifier for mass density is based on a support vector machine and has an accuracy of 81.3%. The expert correctly annotated 70% of the mass densities. The best classifier for malignancy is also based on a support vector machine and has an accuracy of 85.6%, with a positive predictive value of 85%. One important contribution of this work is that our model can predict malignancy in the absence of the mass density attribute, since this attribute can be filled in using our mass density predictor.

Volume 11, Issue 3, pp. 257-276.
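For readers unfamiliar with the two measures quoted in the abstract, the sketch below computes accuracy and positive predictive value from confusion-matrix counts. The counts in the example are invented for illustration and are not the study's data; the function names are hypothetical.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all cases that were classified correctly."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no cases")
    return (tp + tn) / total


def positive_predictive_value(tp: int, fp: int) -> float:
    """Share of predicted-malignant cases that really are malignant."""
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)


if __name__ == "__main__":
    # Invented counts, chosen only to make the arithmetic easy to follow.
    tp, fp, tn, fn = 85, 15, 90, 10
    print(accuracy(tp, fp, tn, fn))
    print(positive_predictive_value(tp, fp))
```

The two values answer different questions: accuracy covers every case, while positive predictive value looks only at the cases the classifier flagged as malignant.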

Citations: 15

Mining literatures to discover novel multiple biological associations in a disease context.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.069419

Alberto Faro, Daniela Giordano, Francesco Maiorana

Abstract: Text mining methods proposed for discovering associations between pairs of biological entities in a scientific literature often extract associations that already exist in that literature, whereas their extensions over-supervise the discovery process with heuristics and ontologies that limit the research space. On the other hand, methods that search for novel associations by applying text mining to two literatures do not avoid the risk of discovering syllogisms based on faulty premises. For this reason, the paper proposes a method that helps users discover associations among biological entities by mining the literature with an unsupervised clustering approach. The discovered multiple associations are derived from binary associations to limit the computational load without compromising the accuracy of the methodology. A case study demonstrates how the tool derived from the methodology works in practice. A comparison between this tool and other tools available in the literature points out the effectiveness of the methodology.

Volume 12, Issue 2, pp. 224-256.

Citations: 1

Fuzzy watershed segmentation algorithm: an enhanced algorithm for 2D gel electrophoresis image segmentation.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.069659

Shaheera Rashwan, Amany Sarhan, Muhamed Talaat Faheem, Bayumy A Youssef

Abstract: Detection and quantification of protein spots is an important issue in the analysis of two-dimensional gel electrophoresis (2DGE) images. The main challenge in segmenting 2DGE images is to separate overlapping protein spots correctly and to find the weak protein spots. In this paper, we describe a new robust technique to segment and model the different spots present in the gels. The watershed segmentation algorithm is modified to handle the problem of over-segmentation by initially partitioning the image into mosaic regions using the composition of fuzzy relations. The experimental results show that the proposed algorithm overcomes the over-segmentation problem associated with the available algorithm. We also use a wavelet denoising function to enhance the quality of the segmented image. The results of applying a denoising function before the proposed fuzzy watershed segmentation algorithm are promising, as they are better than those obtained without denoising.

Volume 12, Issue 3, pp. 275-293.
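The abstract does not say which composition of fuzzy relations is used. As an assumption, the sketch below shows the standard max-min composition, where entry (i, k) of the result is the maximum over j of min(R[i][j], S[j][k]). The function name and the membership values are illustrative, and how the paper applies the composition to image regions is not shown.

```python
def max_min_composition(r, s):
    """Max-min composition of two fuzzy relations given as nested lists.

    r has shape n x m and s has shape m x p; the result has shape n x p.
    """
    if not s or any(len(row) != len(s) for row in r):
        raise ValueError("inner dimensions must agree")
    n_inner, n_cols = len(s), len(s[0])
    return [
        [max(min(row[j], s[j][k]) for j in range(n_inner)) for k in range(n_cols)]
        for row in r
    ]


if __name__ == "__main__":
    R = [[0.2, 0.8], [1.0, 0.4]]  # made-up membership degrees
    S = [[0.5, 0.9], [0.7, 0.3]]
    print(max_min_composition(R, S))
```

The structure mirrors matrix multiplication, with min in place of the product and max in place of the sum.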

Citations: 7

ACC-FMD: ant colony clustering for functional module detection in protein-protein interaction networks.

IF 0.3, Zone 4 (Biology)

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.067323

Junzhong Ji, Hongxin Liu, Aidong Zhang, Zhijun Liu, Chunnian Liu

Abstract: Mining functional modules in Protein-Protein Interaction (PPI) networks is an important research topic for revealing the structure-functionality relationships in biological processes. More recently, some swarm intelligence algorithms have been successfully applied in the field. This paper presents a new nature-inspired approach, ACC-FMD, which is based on ant colony clustering to detect functional modules. First, proteins with higher clustering coefficients are selected as ant seed nodes. Then, picking and dropping operations based on ant probabilistic models are developed and employed to assign proteins to the corresponding clusters represented by the seeds. Finally, the best clustering result in each generation is used to perform the information transmission by updating the similarity function. Experimental results on some benchmark datasets show that ACC-FMD outperforms the CFinder and MCODE algorithms and has comparable performance to the MINE, COACH, DPClus and Core algorithms in terms of the general evaluation metrics.

Volume 11, Issue 3, pp. 331-363.
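The seed-selection step can be sketched as follows, assuming the usual local clustering coefficient (the fraction of a protein's neighbour pairs that also interact with each other) and an undirected network stored as a dictionary of neighbour sets. The function names, the tie-breaking rule and the toy network are hypothetical; the paper may use a different variant, and the picking and dropping operations are not shown.

```python
from itertools import combinations


def clustering_coefficient(adj, node):
    """Local clustering coefficient of `node` in an undirected graph."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pairs to connect
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def pick_seeds(adj, n_seeds):
    """Return the n_seeds proteins with the highest coefficients.

    Ties are broken by protein name so the result is deterministic.
    """
    ranked = sorted(adj, key=lambda p: (-clustering_coefficient(adj, p), p))
    return ranked[:n_seeds]


if __name__ == "__main__":
    # Toy network: a triangle A-B-C with a tail C-D-E.
    ppi = {
        "A": {"B", "C"},
        "B": {"A", "C"},
        "C": {"A", "B", "D"},
        "D": {"C", "E"},
        "E": {"D"},
    }
    print(pick_seeds(ppi, 2))
```

Proteins inside densely connected neighbourhoods rank first, which is why they make natural starting points for clusters.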

Citations: 13