2008 IEEE International Symposium on Parallel and Distributed Processing: Latest Papers

Component labeling for k-concave binary images using an FPGA
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536129
Yasuaki Ito, K. Nakano
Abstract: Connected component labeling is a task that assigns unique IDs to the connected components of a binary image. The main contribution of this paper is to present a hardware connected component labeling algorithm for k-concave binary images, designed and implemented on an FPGA. Pixels of a binary image are given to the FPGA in raster order, and the resulting labels are also output in the same order. The advantages of our labeling algorithm are its small latency and its small use of the FPGA's internal storage. We have implemented our hardware labeling algorithm on an Altera Stratix family FPGA and evaluated its performance. The implementation result shows that for a 10-concave binary image of size 2048 × 2048, our connected component labeling algorithm runs in approximately 70 ms and its latency is approximately 750 ns.
Citations: 3
Design and implementation of a tool for modeling and programming deadlock free meta-pipeline applications
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536121
S. Yamagiwa, L. Sousa
Abstract: The Caravela platform has been designed to develop a parallel and distributed stream-based computing paradigm, based on a pipeline processing approach referred to here as the meta-pipeline. This paper focuses on the design and implementation of a modeling tool for the meta-pipeline, specifically to tackle the deadlock problem caused by uninitialized input data streams in a pipeline model. A new efficient algorithm is proposed to prevent deadlock situations by detecting uninitialized edges in a pipeline graph. The algorithm identifies the cyclic paths in a pipeline graph and builds a reduced list containing only the true cyclic paths that actually have to be initialized. Further optimization techniques are also proposed to reduce the computation time and the required amount of memory. Moreover, this paper presents a Graphical User Interface (GUI) for easy programming of meta-pipeline applications, which provides an automatic validation procedure based on the proposed algorithm. Experimental results presented in this paper show the effectiveness of both the proposed algorithm and the developed GUI.
Citations: 1
A dynamic scheduling approach for coordinated wide-area data transfers using GridFTP
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536325
Gaurav Khanna, Ümit V. Çatalyürek, T. Kurç, R. Kettimuthu, P. Sadayappan, J. Saltz
Abstract: Many scientific applications need to stage large volumes of files from one set of machines to another set of machines in a wide-area network. Efficient execution of such data transfers needs to take into account the heterogeneous nature of the environment and dynamic availability of shared resources. This paper proposes an algorithm that dynamically schedules a batch of data transfer requests with the goal of minimizing the overall transfer time. The proposed algorithm performs simultaneous transfer of chunks of files from multiple file replicas, if the replicas exist. Adaptive replica selection is employed to transfer different chunks of the same file by taking dynamically changing network bandwidths into account. We utilize GridFTP as the underlying mechanism for data transfers. The algorithm makes use of information from past GridFTP transfers to estimate network bandwidths and resource availability. The efficiency of the algorithm is evaluated on a wide-area testbed.
Citations: 28
A rapid prototyping environment for high-speed reconfigurable analog signal processing
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536511
J. Becker, F. Henrici, S. Trendelenburg, Y. Manoli
Abstract: This paper reports on a rapid-prototyping platform for high-frequency continuous-time analog filters to be used in communication front-ends. A field programmable analog array (FPAA) is presented, which implements a unique hexagonal topology of 55 tunable OTAs for reconfigurable instantiation of Gm-C filters in a 0.13 μm CMOS technology. It is the first analog array to achieve a bandwidth that allows processing of the intermediate frequencies used in communication systems. In addition to the intuitive manual mapping of analog filters to the chip structure, a genetic algorithm with hardware in the loop is used for automated synthesis of transfer functions.
Citations: 7
Runtime adaptive multi-processor system-on-chip: RAMPSoC
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/ICSAMOS.2010.5642043
D. Göhringer, M. Hübner, Laure Hugot-Derville, J. Becker
Abstract: Current trends in high-performance computing show that multiprocessor systems on chip are one approach to meeting the requirements of computing-intensive applications. Multiprocessor system-on-chip (MPSoC) approaches often provide a static and homogeneous infrastructure of networked microprocessors on the chip die. A novel idea in this research area is to introduce the dynamic adaptivity of reconfigurable hardware in order to provide a flexible, heterogeneous set of processing elements at run-time. Extending the MPSoC idea with run-time reconfiguration delivers a new degree of freedom for system design, as well as for the optimized distribution of computing tasks to processing cells on the architecture that are adapted to the changing application requirements. The "computing in time and space" paradigm and its extension with this new degree of freedom for MPSoCs are presented through the RAMPSoC approach described in this paper.
Citations: 75
Low power/area branch prediction using complementary branch predictors
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536323
Resit Sendag, J. Yi, Peng-fei Chuang, D. Lilja
Abstract: Although high branch prediction accuracy is necessary for high performance, it typically comes at the cost of larger predictor tables and/or more complex prediction algorithms. Unfortunately, large predictor tables and complex algorithms require more chip area and have higher power consumption, which precludes their use in embedded processors. As an alternative to large, complex branch predictors, in this paper we investigate adding complementary branch predictors (CBP) to embedded processors to reduce their power consumption and/or improve their branch prediction accuracy. A CBP differs from a conventional branch predictor in that it focuses only on frequently mispredicted branches while letting the conventional branch predictor predict the more predictable ones. Our results show that adding a small 16-entry (28-byte) CBP reduces the branch misprediction rate of static, bimodal, and gshare branch predictors by an average of 51.0%, 42.5%, and 39.8%, respectively, across 38 SPEC 2000 and MiBench benchmarks. Furthermore, a 256-entry CBP improves the energy efficiency of the branch predictor and processor by up to 97.8% and 23.6%, respectively. Finally, in addition to being very energy-efficient, a CBP can also improve processor performance and, due to its simplicity, can be easily added to the pipeline of any processor.
Citations: 8
A neocortex model implementation on reconfigurable logic with streaming memory
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536533
Christopher N. Vutsinas, T. Taha, Kenneth L. Rice
Abstract: In this paper we study the acceleration of a new class of cognitive processing applications based on the structure of the neocortex. Our focus is on a model of the visual cortex used for image recognition developed by George and Hawkins. We propose techniques to accelerate the algorithm using reconfigurable logic, specifically a streaming memory architecture utilizing available off-chip memory. We discuss the design of a streaming memory access unit that enables a large number of processing elements to be placed on a single FPGA, thus increasing throughput. We present an implementation of our approach on a Cray XD1 and discuss possible extensions to further increase throughput. Our results indicate that using a two-FPGA design with streaming memory gives a speedup of 71.9 times over a purely software implementation.
Citations: 6
Adaptive B-Greedy (ABG): A simple yet efficient scheduling algorithm
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536546
Hongyang Sun, W. Hsu
Abstract: In order to improve processor utilization on parallel systems, adaptive scheduling with parallelism feedback was recently proposed. A-Greedy, an existing adaptive scheduler, offers provably good job execution time and processor utilization. Unfortunately, it suffers from unstable feedback and hence unnecessary processor reallocations even when the job has constant parallelism. This problem may cause difficulties in the management of system resources. We propose a new adaptive scheduler called ABG (for Adaptive B-Greedy), which ensures both performance and stability. In a direct comparison with A-Greedy using simulated data-parallel jobs, ABG shows an average 50% reduction in wasted processor cycles and an average 20% improvement in running time. For a set of jobs, ABG also outperforms A-Greedy by 10% to 15% on average in terms of both makespan and mean response time, provided the system is not heavily loaded. Our detailed analysis shows that ABG indeed offers improved transient and steady-state behaviors in terms of control-theoretic metrics. Using trim analysis, we show that ABG provides nearly linear speedup for individual jobs and good processor utilization. Using competitive analysis, we also show that ABG offers good makespan and mean response time bounds.
Citations: 12
Defining a simple metric for real-time security level evaluation of multi-sites networks
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536562
Abdoul Karim Ganame, J. Bourgeois
Abstract: In previous research work, we have developed a centralized security operation center (SOC) [2] and a distributed SOC [4]. These environments are very useful for reacting to intrusions or analyzing security problems because they provide a global view of the network without adding any kind of software to network components. However, they lack a real-time metric that measures the security health of the different sites. The idea is to have, at a glance, an indication of the security level of all the sites of the network. In this article, we propose to define such a metric, which gives the user three states for a given network.
Citations: 10
A parallel insular model for location areas planning in mobile networks
2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI: 10.1109/IPDPS.2008.4536367
Laidi Foughali, E. Talbi, M. Batouche
Abstract: The main interest of this paper is the optimization of location areas planning in cellular radio networks. It is well known that the quality of service in mobile networks depends on many parameters, among them an optimal location area planning. Furthermore, it is of greater interest to provide a logical organization for already deployed networks. In this paper, we propose the use of heuristic strategies and hybrid metaheuristic strategies to solve the location areas planning problem. The latter is formulated as a constrained planar graph partitioning problem by using a mathematical model based on a very realistic specification. The heuristic strategies are based on greedy algorithms, while the hybrid metaheuristics are based on genetic algorithms. New genetic operators have been designed for this specific problem. Moreover, parallel approaches have been proposed to improve the quality of solutions and speed up the search. Results obtained on real-life benchmarks show the effectiveness of the developed optimization algorithms.
Citations: 4