Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000: latest articles

Template based structured collections
J. Nolte, M. Sato, Y. Ishikawa
Abstract: Collective operations on distributed data sets foster a high-level data-parallel programming style that significantly eases many aspects of parallel programming. In this paper we describe how higher-order collective operations on distributed object sets can be introduced in a structured way by means of reusable topology classes and C++ templates.
DOI: https://doi.org/10.1109/IPDPS.2000.846025
Published: 2000-05-01
Citations: 10
Buffered coscheduling: a new methodology for multitasking parallel jobs on distributed systems
F. Petrini, Wu-chun Feng
Abstract: Buffered coscheduling is a scheduling methodology for time-sharing communicating processes in parallel and distributed systems. The methodology has two primary features: communication buffering and strobing. With communication buffering, communication generated by each processor is buffered and performed at the end of regular intervals to amortize communication and scheduling overhead. This infrastructure is then leveraged by a strobing mechanism to perform a total exchange of information at the end of each interval, thus providing global information to more efficiently schedule communicating processes. This paper describes how buffered coscheduling can optimize resource utilization by analyzing workloads with varying computational granularities, load imbalances, and communication patterns. The experimental results, performed using a detailed simulation model, show that buffered coscheduling is very effective on fast SANs such as Myrinet as well as slower switch-based LANs.
DOI: https://doi.org/10.1109/IPDPS.2000.846019
Published: 2000-05-01
Citations: 36
A multi-tier RAID storage system with RAID1 and RAID5
Nitin Muppalaneni, K. Gopinath
Abstract: Redundant Arrays of Inexpensive Disks (RAID) is a popular technique used to improve the reliability and performance of secondary storage. Of the various RAID levels discussed, RAID1 and RAID5 have become the most popular. Mirroring, or RAID1, maintains multiple copies of the data, generally provides the best performance, and is easier to configure. The rotating parity scheme, or RAID5, is the least expensive RAID scheme with good large-update performance. It suffers from poor small-update performance, and performance drops sharply when a disk fails and the array enters degraded mode. Configuring RAID5 is more involved. This paper presents the design and implementation of a host-based driver for a multi-tier RAID storage system, currently with 2 tiers: a small RAID1 tier and a larger RAID5 tier. Based on access patterns, the driver automatically migrates frequently accessed data to RAID1 while demoting less frequently accessed data to RAID5. The prototype provides reliable persistence semantics for data migration between the tiers using ordered updates. Mechanisms are separated from policies through an API so that any desired policy can be implemented in trusted user processes. Finally, we present a comparison of the performance of our system with comparable systems using striping and RAID5.
DOI: https://doi.org/10.1109/IPDPS.2000.846051
Published: 2000-05-01
Citations: 27
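The access-pattern-based placement described in the abstract above can be pictured with a small sketch. It is not the authors' driver or its policy API: the function name `place_blocks`, the integer block identifiers, and the "most frequently accessed blocks first" rule are all assumptions made for illustration, and the sketch ignores the ordered updates that the prototype uses to make migration reliable.

```python
from collections import Counter


def place_blocks(access_log, raid1_capacity):
    """Assign each accessed block to a tier (hypothetical policy).

    access_log: iterable of block ids, one entry per access.
    raid1_capacity: how many blocks the small RAID1 tier can hold.
    Returns a dict mapping block id -> "RAID1" or "RAID5".
    """
    counts = Counter(access_log)
    # Rank blocks by access count (descending); break ties by block id
    # so the result is deterministic.
    ranked = sorted(counts, key=lambda block: (-counts[block], block))
    hot = set(ranked[:raid1_capacity])
    return {block: ("RAID1" if block in hot else "RAID5") for block in counts}


if __name__ == "__main__":
    log = [1, 2, 2, 3, 3, 3, 4]
    print(place_blocks(log, raid1_capacity=2))
```

Here the two most frequently accessed blocks are promoted to the mirrored tier and the rest stay on the parity tier; a real policy would also need to decide when to re-evaluate placement and how much migration traffic to allow.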
Consensus based on failure detectors with a perpetual accuracy property
A. Mostéfaoui, M. Raynal
Abstract: This paper is on the Consensus problem, in the context of asynchronous distributed systems made of $n$ processes, of which at most $f$ may crash. A family of failure detector classes satisfying a Perpetual Accuracy property is first defined. This family includes the failure detector class $S$ (the class of Strong failure detectors defined by Chandra and Toueg) central to the definition of a class ($S_x$) where $x$ is the minimum number ($x \geq 1$) of correct processes that can never be suspected to have crashed. Then, a protocol that solves the Consensus problem is given. This protocol works with any failure detector class ($S_x$) of this family. It is particularly simple and uses a Reliable Broadcast protocol as a skeleton. It requires $n-x+1$ communication steps, and its communication bit complexity is $(n-x+1)(n-1)|\nu|$ (where $|\nu|$ is the maximal size of an initial value a process can propose).
DOI: https://doi.org/10.1109/IPDPS.2000.846029
Published: 2000-05-01
Citations: 12
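As a worked illustration of the two cost expressions quoted in the consensus abstract above, the following plugs in hypothetical values $n = 7$ and $x = 3$. These numbers are chosen only for the arithmetic and do not come from the paper; $|\nu|$ denotes the maximal size of a proposed initial value.

```latex
\begin{align*}
\text{communication steps} &= n - x + 1 = 7 - 3 + 1 = 5,\\
\text{bit complexity} &= (n - x + 1)(n - 1)\,|\nu| = 5 \cdot 6 \cdot |\nu| = 30\,|\nu|.
\end{align*}
```

With $x = 1$, the smallest value the abstract allows, the same expressions give $n$ steps and $n(n-1)|\nu|$ bits, so a larger $x$ lowers both costs.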
Using switch directories to speed up cache-to-cache transfers in CC-NUMA multiprocessors
R. Iyer, L. Bhuyan, Ashwini K. Nanda
Abstract: In this paper we propose a novel hardware caching technique, called switch directory, to reduce the communication latency in CC-NUMA multiprocessors. The main idea is to implement small fast directory caches in crossbar switches of the interconnect medium to capture and store ownership information as the data flows from the memory module to the requesting processor. Using the stored information, the switch directory re-routes subsequent requests to dirty blocks directly to the owner cache, thus reducing the latency for home node processing such as slow DRAM directory access and coherence controller occupancies. The design and implementation details of a DiRectory Embedded Switch ARchitecture (DRESAR) are presented. We explore the performance benefits of switch directories by modeling DRESAR in a detailed execution driven simulator. Our results show that the switch directories can improve performance by up to 60% reduction in home node cache-to-cache transfers for several scientific applications and commercial workloads.
DOI: https://doi.org/10.1109/IPDPS.2000.846057
Published: 2000-05-01
Citations: 17
ACDS: Adapting computational data streams for high performance
Carsten Isert, K. Schwan
Abstract: Data-intensive, interactive applications are an important class of metacomputing (Grid) applications. They are characterized by large, time-varying data flows between data providers and consumers. The topic of this paper is the runtime adaptation of data streams, in response to changes in resource availability and/or in end user requirements, with the goal of continually providing consumers with data at the levels of quality they require. Our approach is one that associates computational objects with data streams. Runtime adaptation is achieved by adjusting objects' actions on streams, by splitting and merging objects, and by migrating them (and the streams on which they operate) across machines and network links. Adaptive streams also react to changes in resource availability detected by online monitoring.
DOI: https://doi.org/10.1109/IPDPS.2000.846046
Published: 2000-05-01
Citations: 68
Gang scheduling with memory considerations
Anat Batat, D. Feitelson
Abstract: A major problem with time slicing on parallel machines is memory pressure, as the resulting paging activity damages the synchronism among a job's processes. An alternative is to impose admission controls, and only admit jobs that fit into the available memory. Despite suffering from delayed execution, this leads to better overall performance by preventing the harmful effects of paging and thrashing.
DOI: https://doi.org/10.1109/IPDPS.2000.845971
Published: 2000-05-01
Citations: 102
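The admission-control idea in the gang scheduling abstract above (only admit jobs that fit into the available memory) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scheduler: the `Job` class, the `admit_jobs` function, the single shared memory pool, and the strict first-come-first-served rule are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    memory_mb: int  # declared memory requirement (assumed known in advance)


def admit_jobs(queue, total_memory_mb):
    """Admit jobs from the head of the queue while they fit in memory.

    Stops at the first job that does not fit, so later jobs are delayed
    rather than allowed to cause paging. Returns (admitted jobs, free MB);
    jobs that were not admitted remain in the queue.
    """
    admitted = []
    free = total_memory_mb
    while queue and queue[0].memory_mb <= free:
        job = queue.popleft()
        free -= job.memory_mb
        admitted.append(job)
    return admitted, free


if __name__ == "__main__":
    waiting = deque([Job("a", 300), Job("b", 500), Job("c", 400)])
    running, free_mb = admit_jobs(waiting, total_memory_mb=1000)
    print([job.name for job in running], free_mb, [job.name for job in waiting])
```

In this sketch job "c" waits even though the machine is otherwise idle for it, which mirrors the trade-off the abstract names: delayed execution in exchange for avoiding paging and thrashing.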
A provably optimal, distribution-independent parallel fast multipole method
F. E. Sevilgen, N. Futamura, S. Aluru
Abstract: The Fast Multipole Method (FMM) is a robust technique for the rapid evaluation of the combined effect of pairwise interactions of n data sources. Parallel computation of the FMM is considered a challenging problem due to the dependence of the computation on the distribution of the data sources, usually resulting in dynamic data decomposition and load balancing problems. In this paper, we present the first provably efficient and distribution-independent parallel algorithm for the FMM on distributed memory parallel computers. Our algorithm does not require any dynamic data decomposition or load balancing step. We present our algorithm in terms of a few basic and well understood primitive operations such as sorting and parallel prefix.
DOI: https://doi.org/10.1109/IPDPS.2000.845967
Published: 2000-05-01
Citations: 20
Applying interposition techniques for performance analysis of OPENMP parallel applications
Marc González, Albert Serra, X. Martorell, J. Oliver, E. Ayguadé, Jesús Labarta, N. Navarro
Abstract: Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. During the execution of a parallel program, many individual situations of performance degradation may arise. We believe that exhaustive and time-aware tracing at a fine-grain level is essential to capture these kinds of situations. This paper presents a tracing mechanism based on dynamic code interposition, and compares it with the usual compiler-directed code injection. Dynamic code interposition adds monitoring code at run-time to unmodified binaries and shared libraries, making it suitable for environments in which the compiler or the available tools do not offer instrumentation facilities. Static injection and dynamic interposition techniques are used to collect detailed traces that feed an analysis tool. Both environments meet the accuracy and performance goals required to profile and analyze parallel applications and runtime libraries.
DOI: https://doi.org/10.1109/IPDPS.2000.845990
Published: 2000-05-01
Citations: 5
Fault-tolerant wormhole routing algorithms in meshes in the presence of concave faults
Seungjin Park, Jong-Hoon Youn, B. Bose
Abstract: A fault ring is a connection of only nonfaulty adjacent nodes and links such that the interior of the ring contains only faulty components. This paper proposes two wormhole routing algorithms that deal with more relaxed shapes of fault rings than previously known algorithms in mesh networks. As a result, the number of components to be disabled would be reduced considerably in some cases. The first algorithm, called F4, uses four virtual channels and allows all four sides of fault rings to contain concave shapes. The second algorithm, F3, permits up to three sides to contain concave shapes using only three virtual channels. Both F3 and F4 are free of deadlock and livelock and guarantee the delivery of messages between any pair of nonfaulty and connected nodes in the network.
DOI: https://doi.org/10.1109/IPDPS.2000.846045
Published: 2000-05-01
Citations: 33