2008 IEEE International Symposium on Parallel and Distributed Processing: Latest Papers

A bandwidth optimized SDRAM controller for the MORPHEUS reconfigurable architecture
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536536
Sean Whitty, R. Ernst
{"title":"A bandwidth optimized SDRAM controller for the MORPHEUS reconfigurable architecture","authors":"Sean Whitty, R. Ernst","doi":"10.1109/IPDPS.2008.4536536","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536536","url":null,"abstract":"High-end applications designed for the MORPHEUS computing platform require a massive amount of memory and memory throughput to fully demonstrate MORPHEUS's potential as a high-performance reconfigurable architecture. For example, a proposed film grain noise reduction application for high definition video, which is composed of multiple image processing tasks, requires enormous data rates due to its large input image size and real-time processing constraints. To meet these requirements and to eliminate external memory bottlenecks, a bandwidth-optimized DDR-SDRAM memory controller has been designed for use with the MORPHEUS platform and its Network On Chip interconnect. This paper describes the controller's design requirements and architecture, including the interface to the Network On Chip and the two-stage memory access scheduler, and presents relevant experiments and performance figures.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122266300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
SNAP, Small-world Network Analysis and Partitioning: An open-source parallel graph framework for the exploration of large-scale networks
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536261
David A. Bader, Kamesh Madduri
{"title":"SNAP, Small-world Network Analysis and Partitioning: An open-source parallel graph framework for the exploration of large-scale networks","authors":"David A. Bader, Kamesh Madduri","doi":"10.1109/IPDPS.2008.4536261","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536261","url":null,"abstract":"We present SNAP (small-world network analysis and partitioning), an open-source graph framework for exploratory study and partitioning of large-scale networks. To illustrate the capability of SNAP, we discuss the design, implementation, and performance of three novel parallel community detection algorithms that optimize modularity, a popular measure for clustering quality in social network analysis. In order to achieve scalable parallel performance, we exploit typical network characteristics of small-world networks, such as the low graph diameter, sparse connectivity, and skewed degree distribution. We conduct an extensive experimental study on real-world graph instances and demonstrate that our parallel schemes, coupled with aggressive algorithm engineering for small-world networks, give significant running time improvements over existing modularity-based clustering heuristics, with little or no loss in clustering quality. For instance, our divisive clustering approach based on approximate edge betweenness centrality is more than two orders of magnitude faster than a competing greedy approach, for a variety of large graph instances on the Sun Fire T2000 multicore system. SNAP also contains parallel implementations of fundamental graph-theoretic kernels and topological analysis metrics (e.g., breadth-first search, connected components, vertex and edge centrality) that are optimized for small-world networks. The SNAP framework is extensible; the graph kernels are modular, portable across shared memory multicore and symmetric multiprocessor systems, and simplify the design of high-level domain-specific applications.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115808475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 144
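The SNAP abstract above names modularity as the clustering-quality measure its community detection algorithms optimize. As a rough illustration only, the sketch below computes the standard Newman-Girvan modularity of a given partition of a small undirected, unweighted graph. The function name, the edge-list input format, and the toy graph are assumptions made for this example; none of it is taken from SNAP's own code or API.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition of an undirected graph.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every vertex to a community label
    (Both input formats are assumptions made for this sketch.)
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = defaultdict(int)   # edges with both endpoints in community c
    degree = defaultdict(int)  # summed degree of the vertices in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over communities of (fraction of intra-community edges)
    #     minus (expected fraction under a degree-preserving random model)
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
SPLIT = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
ONE = {v: "all" for v in range(6)}

if __name__ == "__main__":
    print(modularity(EDGES, SPLIT))  # one community per triangle
    print(modularity(EDGES, ONE))    # everything in a single community
```

Splitting the toy graph into its two triangles scores higher than lumping all vertices into one community, which is the behavior a modularity-maximizing heuristic relies on.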
A space- and time-efficient hash table hierarchically indexed by Bloom filters
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536275
Heeyeol Yu, R. Mahapatra
{"title":"A space- and time-efficient hash table hierarchically indexed by Bloom filters","authors":"Heeyeol Yu, R. Mahapatra","doi":"10.1109/IPDPS.2008.4536275","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536275","url":null,"abstract":"Hash tables (HTs) are poorly designed for multiple memory accesses during IP lookup and this design flaw critically affects their throughput in high-speed routers. Thus, a high capacity HT with a predictable lookup throughput is desirable. A recently proposed fast HT (FHT) [20] has drawbacks like low on-chip memory utilization for a high-speed router and substantial memory overheads due to off-chip duplicate keys and pointers. Similarly, a Bloomier filter-based HT (BFHT) [13], generating an index to a key table, suffers from setup failures and static membership testing for keys. In this paper, we propose a novel hash architecture which addresses these issues by using pipelined Bloom filters. The proposed scheme, a hierarchically indexed HT (HIHT), generates indexes to a key table for the given key, so that the on-chip memory size is reduced and the overhead of pointers in a linked list is removed. Secondly, an HIHT demonstrates approximately 5.1 and 2.3 times improvement in on-chip space efficiency with at most one off-chip memory access, compared to an FHT and a BFHT, respectively. In addition to our analyses on access time and memory space, our simulation for IP lookup with 6 BGP tables shows that, for a 160 Gbps router, an HIHT achieves 4.5 and 2.0 times the on-chip memory efficiency of an FHT and a BFHT, respectively.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115955182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Reducing the run-time of MCMC programs by multithreading on SMP architectures
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536354
Jonathan M. R. Byrd, S. Jarvis, A. Bhalerao
{"title":"Reducing the run-time of MCMC programs by multithreading on SMP architectures","authors":"Jonathan M. R. Byrd, S. Jarvis, A. Bhalerao","doi":"10.1109/IPDPS.2008.4536354","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536354","url":null,"abstract":"The increasing availability of multi-core and multiprocessor architectures provides new opportunities for improving the performance of many computer simulations. Markov chain Monte Carlo (MCMC) simulations are widely used for approximate counting problems, Bayesian inference and as a means for estimating very high-dimensional integrals. As such, MCMC has found a wide variety of applications in fields including computational biology and physics, financial econometrics, machine learning and image processing. This paper presents a new method for reducing the run-time of Markov chain Monte Carlo simulations by using SMP machines to speculatively perform iterations in parallel, reducing the runtime of MCMC programs whilst producing statistically identical results to conventional sequential implementations. We calculate the theoretical reduction in runtime that may be achieved using our technique under perfect conditions, and test and compare the method on a selection of multi-core and multi-processor architectures. Experiments are presented that show reductions in runtime of 35% using two cores and 55% using four cores.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132218867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
DIBS: Dual interval bandwidth scheduling for short-term differentiation
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536488
Humzah Jaffar, Xiaobo Zhou, Liqiang Zhang
{"title":"DIBS: Dual interval bandwidth scheduling for short-term differentiation","authors":"Humzah Jaffar, Xiaobo Zhou, Liqiang Zhang","doi":"10.1109/IPDPS.2008.4536488","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536488","url":null,"abstract":"Packet delay and bandwidth are two important metrics for measuring quality of service (QoS) of Internet services. While proportional delay differentiation (PDD) has been studied intensively in the context of differentiated services, few studies were conducted for per-class bandwidth differentiation. In this paper, we design and evaluate an efficient bandwidth differentiation approach. The DIBS (dual interval bandwidth scheduling) approach focuses on the short-term bandwidth differentiation of multiple classes because many Internet transactions take place in a small time frame. It does so based on the normalized instantaneous bandwidth, measured by the use of packet size and packet delay. It also proposes to use a look-back interval and a look-ahead interval to trade off differentiation accuracy and scheduling overhead. We implemented DIBS in the Click modular software router. Extensive experiments have demonstrated its feasibility and effectiveness in achieving short-term bandwidth differentiation. Compared with the representative PDD algorithm WTP, DIBS can achieve better bandwidth differentiation when the inter-class packet size distributions are different. Compared with the representative weighted fair queueing algorithm PGPS, DIBS can achieve more accurate or comparable bandwidth differentiation under various workload conditions, with better delay differentiation and lower cost.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130058585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A linear solver for benchmarking partitioners
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536467
Kambiz Ghazinour, R. E. Shaw, E. Aubanel, L. Garey
{"title":"A linear solver for benchmarking partitioners","authors":"Kambiz Ghazinour, R. E. Shaw, E. Aubanel, L. Garey","doi":"10.1109/IPDPS.2008.4536467","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536467","url":null,"abstract":"A number of graph partitioners are currently available for solving linear systems on parallel computers. Partitioning algorithms divide the graph that arises from the linear system into a specified number of partitions such that the workload per processor is balanced and the communication between the processors is minimized. The measure of partition quality is often taken to be the number of edges cut by the partition. Ultimately the quality of a partition will be reflected in the execution time of the parallel application. In this paper, we introduce a linear solver benchmark that enables comparison of partition quality. This work also serves to motivate further work on developing benchmarks for graph partitioners.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130078625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
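The abstract above states that partition quality is often measured by the number of edges cut, subject to a balanced workload per processor. The following minimal sketch shows how those two quantities could be computed for a given partition. The function names, the dictionary-based partition format, and the 2x3 grid example are assumptions for illustration and do not come from the paper's benchmark.

```python
from collections import Counter


def edge_cut(edges, part):
    """Number of edges whose endpoints lie in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


def imbalance(part, k):
    """Largest part size divided by the ideal size n / k (1.0 is perfect)."""
    sizes = Counter(part.values())
    ideal = len(part) / k
    return max(sizes.values()) / ideal


# 2x3 grid graph, vertices numbered row by row:
#   0 - 1 - 2
#   |   |   |
#   3 - 4 - 5
GRID = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]
ROWS = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}     # split top row / bottom row
CHECKER = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}  # checkerboard assignment

if __name__ == "__main__":
    for name, part in (("rows", ROWS), ("checker", CHECKER)):
        print(name, edge_cut(GRID, part), imbalance(part, 2))
```

Both partitions are perfectly balanced, yet the checkerboard assignment cuts every edge while the row split cuts only the three vertical ones. Whether a lower edge cut actually shortens the solver's run time is the question the abstract says the benchmark is meant to address.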
Parallelized preprocessing algorithms for high-density oligonucleotide arrays
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536175
M. Schmidberger, U. Mansmann
{"title":"Parallelized preprocessing algorithms for high-density oligonucleotide arrays","authors":"M. Schmidberger, U. Mansmann","doi":"10.1109/IPDPS.2008.4536175","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536175","url":null,"abstract":"Studies of gene expression using high-density oligonucleotide microarrays have become standard in a variety of biological contexts. The data recorded using the microarray technique are characterized by high levels of noise and bias. These failures have to be removed; therefore, preprocessing of raw data has been a research topic of high priority over the past few years. Current research and computations are limited by the available computer hardware. Furthermore, most of the existing preprocessing methods are very time consuming. To solve these problems, the potential of parallel computing should be used. For parallelization on multicomputers, the communication protocol MPI (message passing interface) and the R language will be used. This paper proposes the new R language package affyPara for parallelized preprocessing of high-density oligonucleotide microarray data. The data can be partitioned by array, which makes parallelizing the algorithms intuitive. The partition of data and distribution to several nodes solves the main memory problems and accelerates the methods by up to a factor of ten.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130108141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Parallel biological sequence alignments on the Cell Broadband Engine
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536328
Abhinav Sarje, S. Aluru
{"title":"Parallel biological sequence alignments on the Cell Broadband Engine","authors":"Abhinav Sarje, S. Aluru","doi":"10.1109/IPDPS.2008.4536328","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536328","url":null,"abstract":"Sequence alignment and its many variants are a fundamental tool in computational biology. There is considerable recent interest in using the Cell Broadband Engine, a heterogeneous multi-core chip that provides high performance, for biological applications. However, work so far has been limited to computing optimal alignment scores using quadratic space under the basic global/local alignment algorithm. In this paper, we present a comprehensive study of developing sequence alignment algorithms on the Cell exploiting its thread and data level parallelism features. First, we develop a Cell implementation that computes optimal alignments and adopts Hirschberg's linear space technique. The former is essential as merely computing optimal alignment scores is not useful while the latter is needed to permit alignments of longer sequences. We then present Cell implementations of two advanced alignment techniques - spliced alignments and syntenic alignments. In a spliced alignment, consecutive non-overlapping portions of a sequence align with ordered non-overlapping, but non-consecutive portions of another sequence. Spliced alignments are useful in aligning mRNA sequences with corresponding genomic sequences to uncover gene structure. Syntenic alignments are used to discover conserved exons and other sequences between long genomic sequences from different organisms. We present experimental results for these three types of alignments on the Cell BE and report speedups of about 4 on six SPUs on the Playstation 3, when compared to the respective best serial algorithms on the Cell BE and the Pentium 4 processor.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"2192 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130116770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
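The abstract above contrasts quadratic-space score computation with Hirschberg's linear-space technique. The building block of that technique is that the optimal global alignment score can be computed while keeping only two rows of the dynamic programming table. The sketch below shows that building block alone. It does not recover the alignment itself, which is the part Hirschberg's divide-and-conquer adds, and it is not the authors' Cell implementation. The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of strings a and b in O(len(b)) space.

    Only the previous and current rows of the dynamic programming table
    are kept. The scoring parameters are illustrative assumptions.
    """
    # Row 0: aligning the empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: a[:i] aligned against the empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned with a gap
            left = curr[j - 1] + gap  # b[j-1] aligned with a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(alignment_score("ACGT", "ACGT"))
    print(alignment_score("ACGT", "AGT"))
```

Memory use grows with the length of the second sequence only, which is why a linear-space approach permits alignments of longer sequences than a full table would.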
A task allocation framework for biological sequence comparison applications in heterogeneous environments
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536365
A. Boukerche, Marcelo Nardelli Pinto Santana, A. Melo
{"title":"A task allocation framework for biological sequence comparison applications in heterogeneous environments","authors":"A. Boukerche, Marcelo Nardelli Pinto Santana, A. Melo","doi":"10.1109/IPDPS.2008.4536365","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536365","url":null,"abstract":"Biological Sequence Comparison is a very important operation in computational biology since it is used to relate organisms and understand evolutionary processes. This article presents the design and evaluation of an allocation framework for biological sequence comparison applications that use dynamic programming and run in heterogeneous environments. Its goal is to determine which processors will execute the application, considering some characteristics of the heterogeneous environment, such as observed processor power and network bandwidth. The results obtained with four different task allocation policies in a 10-machine heterogeneous environment show that, for some sequence sizes, we were able to reduce the execution time of the parallel application by more than half, when the appropriate number of processors is used.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130471416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Programming support for sensor-based scientific applications
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536415
N. Jiang, M. Parashar
{"title":"Programming support for sensor-based scientific applications","authors":"N. Jiang, M. Parashar","doi":"10.1109/IPDPS.2008.4536415","DOIUrl":"https://doi.org/10.1109/IPDPS.2008.4536415","url":null,"abstract":"Technical advances are enabling a pervasive computational ecosystem that integrates computing infrastructures with embedded sensors and actuators, and are giving rise to a new paradigm for monitoring, understanding, and managing natural and engineered systems - one that is information/data-driven. This research investigates programming systems for sensor-driven applications. It addresses abstractions and runtime mechanisms for integrating sensor systems with computational models for scientific processes, as well as for in-network data processing, e.g., aggregation, adaptive interpolation and assimilation. The current status of this research, as well as initial results are presented.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134344213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6