U. Srinivasan, Peng-Sheng Chen, Q. Diao, C. Lim, E. Li, Yongjian Chen, R. Ju, Yimin Zhang
{"title":"Characterization and analysis of HMMER and SVM-RFE parallel bioinformatics applications","authors":"U. Srinivasan, Peng-Sheng Chen, Q. Diao, C. Lim, E. Li, Yongjian Chen, R. Ju, Yimin Zhang","doi":"10.1109/IISWC.2005.1526004","DOIUrl":null,"url":null,"abstract":"Bioinformatics applications constitute an emerging data-intensive, high-performance computing (HPC) domain. While there is much research on algorithmic improvements, (2004), the actual performance of an application also depends on how well the program maps to the target hardware. This paper presents a performance study of two parallel bioinformatics applications HMMER (sequence alignment) and SVM-RFE (gene expression analysis), on Intel x86 based hyperthread-capable (2002) shared-memory multiprocessor systems. The performance characteristics varied according to the application and target hardware characteristics. For instance, HMMER is compute intensive and showed better scalability on a 3.0 GHz system versus a 2.2 GHz system. However, SVM-RFE is memory intensive and showed better absolute performance on the 2.2 GHz machine which has better memory bandwidth. The performance is also impacted by processor features, e.g. hyperthreading (HT) (2002) and prefetching. With HMMER we could obtain -75% of the performance with HT enabled with respect to doubling the number of CPUs. While load balancing optimizations can provide speedup of -30% for HMMER on a hyperthreading-enabled system, the load balancing has to adapt to the target number of processors and threads. SVM-RFE benefits differently from the same load-balancing and thread scheduling tuning. We conclude that compiler and runtime optimizations play an important role to achieve the best performance for a given bioinformatics algorithm.","PeriodicalId":275514,"journal":{"name":"IEEE International. 2005 Proceedings of the IEEE Workload Characterization Symposium, 2005.","volume":"56 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International. 2005 Proceedings of the IEEE Workload Characterization Symposium, 2005.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IISWC.2005.1526004","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14
Abstract
Bioinformatics applications constitute an emerging data-intensive, high-performance computing (HPC) domain. While there is much research on algorithmic improvements, (2004), the actual performance of an application also depends on how well the program maps to the target hardware. This paper presents a performance study of two parallel bioinformatics applications HMMER (sequence alignment) and SVM-RFE (gene expression analysis), on Intel x86 based hyperthread-capable (2002) shared-memory multiprocessor systems. The performance characteristics varied according to the application and target hardware characteristics. For instance, HMMER is compute intensive and showed better scalability on a 3.0 GHz system versus a 2.2 GHz system. However, SVM-RFE is memory intensive and showed better absolute performance on the 2.2 GHz machine which has better memory bandwidth. The performance is also impacted by processor features, e.g. hyperthreading (HT) (2002) and prefetching. With HMMER we could obtain -75% of the performance with HT enabled with respect to doubling the number of CPUs. While load balancing optimizations can provide speedup of -30% for HMMER on a hyperthreading-enabled system, the load balancing has to adapt to the target number of processors and threads. SVM-RFE benefits differently from the same load-balancing and thread scheduling tuning. We conclude that compiler and runtime optimizations play an important role to achieve the best performance for a given bioinformatics algorithm.