A Maintainable Software Architecture for Fast and Modular Bioinformatics Sequence Search

J. Archuleta, E. Tilevich, Wu-chun Feng
{"title":"A Maintainable Software Architecture for Fast and Modular Bioinformatics Sequence Search","authors":"J. Archuleta, E. Tilevich, Wu-chun Feng","doi":"10.1109/ICSM.2007.4362627","DOIUrl":null,"url":null,"abstract":"Bioinformaticists use the Basic Local Alignment Search Tool (BLAST) to characterize an unknown sequence by comparing it against a database of known sequences, thus detecting evolutionary relationships and biological properties. mpiBLAST is a widely-used, high-performance, open-source parallelization of BLAST that runs on a computer cluster delivering super-linear speedups. However, the Achilles heel of mpiBLAST is its lack of modularity, thus adversely affecting maintainability and extensibility. Alleviating this shortcoming requires an architectural refactoring to improve maintenance and extensibility while preserving high performance. Toward that end, this paper evaluates five different software architectures and details how each satisfies our design objectives. In addition, we introduce a novel approach to using mixin layers to enable mixing-and-matching of modules in constructing sequence-search applications for a variety of high-performance computing systems. Our design, which we call \"mixin layers with refined roles\", utilizes mixin layers to separate functionality into complementary modules and the refined roles in each layer improve the inherently modular design by precipitating flexible and structured parallel development, a necessity for an open-source application. We believe that this new software architecture for mpiBLAST-2.0 will benefit both the users and developers of the package and that our evaluation of different software architectures will be of value to other software engineers faced with the challenges of creating maintainable and extensible, high-performance, bioinformatics software.","PeriodicalId":263470,"journal":{"name":"2007 IEEE International Conference on Software Maintenance","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IEEE International Conference on Software Maintenance","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSM.2007.4362627","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

Bioinformaticists use the Basic Local Alignment Search Tool (BLAST) to characterize an unknown sequence by comparing it against a database of known sequences, thus detecting evolutionary relationships and biological properties. mpiBLAST is a widely-used, high-performance, open-source parallelization of BLAST that runs on a computer cluster delivering super-linear speedups. However, the Achilles heel of mpiBLAST is its lack of modularity, thus adversely affecting maintainability and extensibility. Alleviating this shortcoming requires an architectural refactoring to improve maintenance and extensibility while preserving high performance. Toward that end, this paper evaluates five different software architectures and details how each satisfies our design objectives. In addition, we introduce a novel approach to using mixin layers to enable mixing-and-matching of modules in constructing sequence-search applications for a variety of high-performance computing systems. Our design, which we call "mixin layers with refined roles", utilizes mixin layers to separate functionality into complementary modules and the refined roles in each layer improve the inherently modular design by precipitating flexible and structured parallel development, a necessity for an open-source application. We believe that this new software architecture for mpiBLAST-2.0 will benefit both the users and developers of the package and that our evaluation of different software architectures will be of value to other software engineers faced with the challenges of creating maintainable and extensible, high-performance, bioinformatics software.
一种可维护的快速模块化生物信息学序列搜索软件体系结构
生物信息学家使用基本局部比对搜索工具(BLAST)通过将未知序列与已知序列的数据库进行比较来确定未知序列的特征,从而检测进化关系和生物学特性。mpiBLAST是一种广泛使用的、高性能的、开源的并行化BLAST,它运行在计算机集群上,提供超线性的加速。然而,mpiBLAST的致命弱点是缺乏模块化,从而对可维护性和可扩展性产生不利影响。要缓解这个缺点,需要进行架构重构,以在保持高性能的同时改进维护和可扩展性。为此,本文评估了五种不同的软件架构,并详细说明了每种架构如何满足我们的设计目标。此外,我们还介绍了一种使用混合层的新方法,以便在为各种高性能计算系统构建序列搜索应用程序时实现模块的混合和匹配。我们的设计,我们称之为“混合层与精炼角色”,利用混合层将功能分离到互补模块中,每个层中的精炼角色通过促成灵活和结构化的并行开发来改进固有的模块化设计,这是开源应用程序的必要条件。我们相信,mpiBLAST-2.0的新软件架构将使软件包的用户和开发人员都受益,我们对不同软件架构的评估将对其他面临创建可维护和可扩展的高性能生物信息学软件挑战的软件工程师有价值。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信