Parallel Implementation of a Bioinformatics Pipeline for the Design of Pathogen Diagnostic Assays

R. Satya, Kamal Kumar, N. Zavaljevski, J. Reifman
Published in: 2009 DoD High Performance Computing Modernization Program Users Group Conference
Publication date: 2009-06-15
DOI: 10.1109/HPCMP-UGC.2009.36 (https://doi.org/10.1109/HPCMP-UGC.2009.36)

Abstract

The genomes of hundreds of pathogens and their near neighbors are now available, and many more are being sequenced. With this genome information available, sequence-based pathogen identification has become an increasingly important tool for clinical diagnostics and for environmental monitoring of biological threat agents. Chief among sequence-based identification tools are DNA microarrays, which can test for thousands of pathogens in a single diagnostic test. Designing a microarray diagnostic assay involves identifying short DNA sequences that are unique to a pathogen or a group of pathogens; these unique sequences, called "fingerprints" (also referred to as probes), are used to identify the pathogens. To design pathogen fingerprints, we developed TOFI (Tool for Oligonucleotide Fingerprint Identification), a high-performance computing software pipeline that designs microarray probes for multiple related pathogens in a single run. The most computationally expensive step, specificity analysis of the designed fingerprints, is implemented in parallel, which drastically reduces the overall execution time of the software. Comprehensive performance analysis shows that TOFI achieves super-linear speedup for up to 74 processors. A Web-based user interface, developed using the User Interface Toolkit, provides easy access to the pipeline. Using 74 processors, TOFI took approximately nine hours to design 5,015 in silico probes for eight Burkholderia genomes with a combined size of more than 50 million base pairs. Experimental validation of these probes with various Burkholderia genomes showed that nearly 80% of the designed fingerprints identify the intended targets.
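The notion of a fingerprint as a short sequence unique to a target can be illustrated with a toy Python sketch. This is not TOFI's algorithm, which the abstract does not describe. It assumes exact substring matching on fixed-length k-mers, and the function names (`kmers`, `candidate_fingerprints`) and the example sequences are hypothetical. A real specificity analysis would also have to account for near matches and hybridization behavior.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_fingerprints(target, non_targets, k):
    """Return the k-mers found in the target but in none of the non-targets.

    Assumption: uniqueness is judged by exact matching only.
    """
    background = set()
    for genome in non_targets:
        background |= kmers(genome, k)
    return sorted(kmers(target, k) - background)


if __name__ == "__main__":
    # Made-up toy sequences, far shorter than any real genome.
    target = "ACGTACGGA"
    non_targets = ["ACGTAC", "TTGGA"]
    print(candidate_fingerprints(target, non_targets, 4))
```

In this toy setting the check for each candidate is independent of the others, which is the kind of structure that lends itself to being split across processors.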