International Journal of Data Mining and Bioinformatics: Latest Articles

Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.069710
Heba Khaled, Hossam El Deen Mostafa Faheem, Rania El Gohary
Abstract: This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem, combining the Message Passing Interface (MPI) and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation of the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time as the number of working GPU nodes increases. The proposed model achieved a performance of about 12 Giga cell updates per second when tested against the SWISS-PROT protein knowledge base running on four nodes.
Citations: 6
Two stages weighted sampling strategy for detecting the relation between gene expression and disease.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.069417
Chih-Chung Yang, Wen-Shin Lin, Chien-Pang Lee, Yungho Leu
Abstract: Most microarray data analyses focus on selecting relevant genes and calculating the classification accuracy achieved with the selected genes. This paper aims to detect the relation between gene expression levels and the classes of a cancer (or a disease) to assist researchers in initial diagnosis. The proposed method is called the Two Stages Weighted Sampling strategy (TSWS strategy). According to the results, the TSWS strategy performs better than other existing methods in terms of classification accuracy and the number of selected relevant genes. Furthermore, the TSWS strategy can also be used to understand and detect the relation between gene expression levels and the classes of a cancer (or a disease).
Citations: 2
Named entity recognition and classification in biomedical text using classifier ensemble.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.067954
Sriparna Saha, Asif Ekbal, Utpal Kumar Sikdar
Abstract: Named Entity Recognition and Classification (NERC) is an important task in information extraction for the biomedical domain. Biomedical Named Entities include mentions of proteins, genes, DNA, RNA, etc., which, in general, have complex structures and are difficult to recognise. In this paper, we propose a Single Objective Optimisation based classifier ensemble technique using the search capability of Genetic Algorithm (GA) for NERC in biomedical texts. Here, GA is used to quantify the amount of voting for each class in each classifier. We use diverse classification methods like Conditional Random Field and Support Vector Machine to build a number of models depending upon the various representations of the set of features and/or feature templates. The proposed technique is evaluated with two benchmark datasets, namely JNLPBA 2004 and GENETAG. Experiments yield overall F-measure values of 75.97% and 95.90%, respectively. Comparisons with the existing systems show that our proposed system achieves state-of-the-art performance.
Citations: 10
LMDS-based approach for efficient top-k local ligand-binding site search.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.070066
Sungchul Kim, Lee Sael, Hwanjo Yu
Abstract: In this work, we propose an LMDS-based binding-site search for improving the search speed of the Patch-Surfer method. Although Patch-Surfer is efficient in recognition of protein-ligand binding partners, further speedup is necessary to address multiple-user access. Further speedup is realised by exploiting Landmark Multi-Dimensional Scaling (LMDS), which computes embedding coordinates for data points based on their distances from landmark points. When selecting the landmark points, we adopt two approaches: random and greedy selection. Our method approximately retrieves top-k results, and the accuracy increases as we exploit more landmark points. Although the two landmark selection approaches show comparable results, the greedy selection shows the best performance when the number of landmark points is large. Using our method, the searching time is reduced by up to 99% and it retrieves almost 80% of exact top-k results. Additionally, LMDS-based binding-site search+ improves the retrieval accuracy from 80% to 95% while sacrificing the speedup ratio from 99% to 90% compared to Patch-Surfer.
Citations: 0
Gene microarray data analysis using parallel point-symmetry-based clustering.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.067320
Anasua Sarkar, Ujjwal Maulik
Abstract: Identification of co-expressed genes is the central goal in microarray gene expression analysis. Point-symmetry-based clustering is an important unsupervised learning technique for recognising symmetrical convex- or non-convex-shaped clusters. To enable fast clustering of large microarray data, we propose a distributed, time-efficient, scalable approach for the point-symmetry-based K-Means algorithm. A natural basis for analysing gene expression data using a symmetry-based algorithm is to group together genes with similar symmetrical expression patterns. This new parallel implementation also achieves linear speedup in timing without sacrificing the quality of the clustering solution on large microarray data sets. The parallel point-symmetry-based K-Means algorithm is compared with another new parallel symmetry-based K-Means and existing parallel K-Means over eight artificial and benchmark microarray data sets, to demonstrate its superiority in both timing and validity. Statistical analysis is also performed to establish the significance of this message-passing-interface based point-symmetry K-Means implementation. We also analysed the biological relevance of clustering solutions.
Citations: 5
Predicting malignancy from mammography findings and image-guided core biopsies.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.067319
Pedro Ferreira, Nuno A Fonseca, Inês Dutra, Ryan Woods, Elizabeth Burnside
Abstract: The main goal of this work is to produce machine learning models that predict the outcome of a mammography from a reduced set of annotated mammography findings. In the study we used a dataset consisting of 348 consecutive breast masses that underwent image-guided core biopsy performed between October 2005 and December 2007 on 328 female subjects. We applied various algorithms with parameter variation to learn from the data. The tasks were to predict mass density and to predict malignancy. The best classifier that predicts mass density is based on a support vector machine and has an accuracy of 81.3%. The expert correctly annotated 70% of the mass densities. The best classifier that predicts malignancy is also based on a support vector machine and has an accuracy of 85.6%, with a positive predictive value of 85%. One important contribution of this work is that our model can predict malignancy in the absence of the mass density attribute, since we can fill in this attribute using our mass density predictor.
Citations: 15
Mining literatures to discover novel multiple biological associations in a disease context.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.069419
Alberto Faro, Daniela Giordano, Francesco Maiorana
Abstract: The text mining methods proposed to discover associations between pairs of biological entities by mining a scientific literature often extract associations already existing in the literature, whereas their extensions supervise the discovery process too much with heuristics and ontologies that limit the research space. On the other hand, the methods that search for novel associations by applying text mining to two literatures do not avoid the risk of discovering syllogisms based on faulty premises. For this reason, the paper proposes a method that helps users to discover associations among biological entities by mining the literature using an unsupervised clustering approach. The discovered multiple associations are derived from binary associations to limit the computational load without compromising the accuracy of the methodology. A case study demonstrates how the tool derived from the methodology works in practice. A comparison between this tool and other tools available in the literature points out the effectiveness of the methodology.
Citations: 1
Fuzzy watershed segmentation algorithm: an enhanced algorithm for 2D gel electrophoresis image segmentation.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.069659
Shaheera Rashwan, Amany Sarhan, Muhamed Talaat Faheem, Bayumy A Youssef
Abstract: Detection and quantification of protein spots is an important issue in the analysis of two-dimensional electrophoresis images. However, the main challenge in the segmentation of 2DGE images is to separate overlapping protein spots correctly and to find the weak protein spots. In this paper, we describe a new robust technique to segment and model the different spots present in the gels. The watershed segmentation algorithm is modified to handle the problem of over-segmentation by initially partitioning the image into mosaic regions using the composition of fuzzy relations. The experimental results showed the effectiveness of the proposed algorithm in overcoming the over-segmentation problem associated with the available algorithm. We also use a wavelet denoising function to enhance the quality of the segmented image. The results of using a denoising function before the proposed fuzzy watershed segmentation algorithm are promising, as they are better than those without denoising.
Citations: 7
Patient-specific early classification of multivariate observations.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.067955
Mohamed F Ghalwash, Dušan Ramljak, Zoran Obradović
Abstract: Early classification of time series has been receiving a lot of attention recently. In this paper we present a model, which we call the Early Classification Model (ECM), that allows for early, accurate and patient-specific classification of multivariate observations. ECM is comprised of an integration of the widely used Hidden Markov Model (HMM) and Support Vector Machine (SVM) models. It attained very promising results on the datasets we tested it on: in one set of experiments based on a published dataset of response to drug therapy in Multiple Sclerosis patients, ECM used only an average of 40% of a time series and was able to outperform some of the baseline models, which needed the full time series for classification. In the set of experiments tested on a sepsis therapy dataset, ECM was able to surpass the standard threshold-based method and the state-of-the-art method for early classification of multivariate time series.
Citations: 11
A system biology approach for understanding the miRNA regulatory network in colon rectal cancer.
IF 0.3, Zone 4, Biology
International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI: 10.1504/ijdmb.2015.066332
Meeta Pradhan, Kshithija Nagulapalli, Lakenvia Ledford, Yogesh Pandit, Mathew Palakal
Abstract: In this paper we present a systems biology approach to the understanding of the miRNA-regulatory network in colon rectal cancer. An initial set of significant genes in Colon Rectal Cancer (CRC) was obtained by mining relevant literature. An initial set of cancer-related miRNAs was obtained from three databases (miRBase, miRWalk and Targetscan) and a GEO microarray experiment. First principle methods were then used to generate the global miRNA-gene network. Significant miRNAs and associated transcription factors in the global miRNA-gene network were identified using topological and sub-graph analyses. Eleven novel miRNAs were identified, and three of the novel miRNAs, hsa-miR-630, hsa-miR-100 and hsa-miR-99a, were further analysed to elucidate their role in CRC. The proposed methodology effectively made use of literature data and was able to show novel, significant miRNA-transcription associations in CRC.
Citations: 4