ClusPhylo: Spark Based Fast and Reliable Approach for Reconstruction of Phylogenetic Network Using Large Databases

Shamita Malik, S. Khatri, Dolly Sharma
Journal: Journal of Biology and Today's World
Volume (as listed in the index record): 37 1
Publication date: 2017-05-30
Publication type: Journal Article
DOI: 10.15412/J.JBTW.01060602 (https://doi.org/10.15412/J.JBTW.01060602)

Abstract

Phylogenetic analysis has become a fundamental part of research on the evolution of the “tree of life”. It is central to the scientific study of how life developed because it measures the relationships among organisms, and it plays an important role in organizing otherwise scattered data. As the volume of data in proteomics grows, computational biology algorithms need to be both highly efficient and close to accurate. The inference of large and accurate phylogenetic trees has increased over the last few years. Early approaches to phylogenetic inference relied on single-processor computers, but for a large number of taxa a single processor is not feasible. This creates a need for more efficient and scalable algorithms that use parallel and distributed computing for phylogenetic inference. This paper introduces ClusPhylo, a new cluster-based algorithm for large datasets. The proposed algorithm optimizes tree construction in three steps: it partitions the input sequences into clusters, builds initial sub-trees from the sequences in each cluster, and merges the sub-trees into a single tree using an additive approach. ClusPhylo is implemented on Apache Spark. Its performance, measured by final log-likelihood values and execution time, is compared with that of well-known algorithms. The results show that the proposed algorithm is computationally efficient, yields better likelihood values, and scales across varying numbers of processors.
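The abstract does not give the details of the clustering, sub-tree, or merging steps, so the following is only a minimal sketch of the general divide-and-merge idea, not the authors' ClusPhylo implementation. Everything in it is an assumption made for illustration: the function names, the greedy threshold clustering, the p-distance, and the average-linkage joining that stands in for both sub-tree construction and the "additive" merge. It computes no likelihoods. The per-cluster step is independent for each cluster, which is the kind of work that could be distributed (for example with a Spark RDD `map`); plain Python is used here so the sketch runs without a cluster.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_sequences(seqs, threshold):
    """Greedy clustering (assumed scheme): a sequence joins the first cluster
    whose representative (first member) is within `threshold`, else starts one."""
    clusters = []
    for name in seqs:
        for members in clusters:
            if p_distance(seqs[name], seqs[members[0]]) <= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def agglomerate(nodes, seqs):
    """Average-linkage joining of (topology, leaf_names) nodes into one node.
    Used for building sub-trees and, as a stand-in, for merging them."""
    def dist(u, v):
        pairs = [(a, b) for a in u[1] for b in v[1]]
        return sum(p_distance(seqs[a], seqs[b]) for a, b in pairs) / len(pairs)

    nodes = list(nodes)
    while len(nodes) > 1:
        # Join the closest pair of nodes under average linkage.
        i, j = min(combinations(range(len(nodes)), 2),
                   key=lambda ij: dist(nodes[ij[0]], nodes[ij[1]]))
        merged = ((nodes[i][0], nodes[j][0]), nodes[i][1] + nodes[j][1])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def clusphylo_sketch(seqs, threshold):
    """Partition, build one sub-tree per cluster, then merge the sub-trees."""
    clusters = cluster_sequences(seqs, threshold)
    # Independent per-cluster work: the step a distributed map could parallelize.
    subtrees = list(map(
        lambda names: agglomerate([(n, [n]) for n in names], seqs), clusters))
    topology, _leaves = agglomerate(subtrees, seqs)
    return topology


if __name__ == "__main__":
    toy = {"A": "AAAAAAAA", "B": "AAAAAAAT", "C": "GGGGCCCC", "D": "GGGGCCCT"}
    print(clusphylo_sketch(toy, threshold=0.25))
```

On the toy input, the two near-identical pairs fall into separate clusters, each cluster yields a two-leaf sub-tree, and the final merge joins those two sub-trees at the root. A real pipeline would replace the distance-based joining with likelihood-based tree search and would need a clustering method suited to unaligned or very large inputs.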