Workshop on I/O in Parallel and Distributed Systems: Latest Papers

Performance of the Galley parallel file system
Workshop on I/O in Parallel and Distributed Systems Pub Date : 1996-05-27 DOI: 10.1145/236017.236038
N. Nieuwejaar, D. Kotz
{"title":"Performance of the Galley parallel file system","authors":"N. Nieuwejaar, D. Kotz","doi":"10.1145/236017.236038","DOIUrl":"https://doi.org/10.1145/236017.236038","url":null,"abstract":"As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance I/O to applications that access data in patterns that have been observed to be common.","PeriodicalId":442608,"journal":{"name":"Workshop on I/O in Parallel and Distributed Systems","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123258746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
ENWRICH: a compute-processor write caching scheme for parallel file systems
Workshop on I/O in Parallel and Distributed Systems Pub Date : 1996-05-27 DOI: 10.1145/236017.236034
A. Purakayastha, C. Ellis, D. Kotz
{"title":"ENWRICH: a compute-processor write caching scheme for parallel file systems","authors":"A. Purakayastha, C. Ellis, D. Kotz","doi":"10.1145/236017.236034","DOIUrl":"https://doi.org/10.1145/236017.236034","url":null,"abstract":"Many parallel scientific applications need high-performance I/O. Unfortunately, end-to-end parallel-I/O performance has not been able to keep up with substantial improvements in parallel-I/O hardware because of poor parallel file-system software. Many radical changes, both at the interface level and the implementation level, have recently been proposed. One such proposed interface is collective I/O, which allows parallel jobs to request transfer of large contiguous objects in a single request, thereby preserving useful semantic information that would otherwise be lost if the transfer were expressed as per-processor non-contiguous requests. Kotz has proposed disk-directed I/O as an efficient implementation technique for collective-I/O operations, where the compute processors make a single collective data-transfer request, and the I/O processors thereafter take full control of the actual data transfer, exploiting their detailed knowledge of the disk layout to attain substantially improved performance. Recent parallel file-system usage studies show that writes to write-only files are a dominant part of the workload. Therefore, optimizing writes could have a significant impact on overall performance. In this paper, we propose ENWRICH, a compute-processor write-caching scheme for write-only files in parallel file systems. ENWRICH combines low-overhead write caching at the compute processors with high-performance disk-directed I/O at the I/O processors to achieve both low latency and high bandwidth. This combination facilitates the use of the powerful disk-directed I/O technique independent of any particular choice of interface. By collecting writes over many files and applications, ENWRICH lets the I/O processors optimize disk I/O over a large pool of requests. We evaluate our design via simulated implementation and show that ENWRICH achieves high performance for various configurations and workloads.","PeriodicalId":442608,"journal":{"name":"Workshop on I/O in Parallel and Distributed Systems","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122601500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Tuning the performance of I/O-intensive parallel applications
Workshop on I/O in Parallel and Distributed Systems Pub Date : 1996-05-27 DOI: 10.1145/236017.236027
A. Acharya, Mustafa Uysal, R. Bennett, Assaf Mendelson, M. Beynon, J. Hollingsworth, J. Saltz, A. Sussman
{"title":"Tuning the performance of I/O-intensive parallel applications","authors":"A. Acharya, Mustafa Uysal, R. Bennett, Assaf Mendelson, M. Beynon, J. Hollingsworth, J. Saltz, A. Sussman","doi":"10.1145/236017.236027","DOIUrl":"https://doi.org/10.1145/236017.236027","url":null,"abstract":"Getting good I/O performance from parallel programs is a critical problem for many application domains. In this paper, we report our experience tuning the I/O performance of four application programs from the areas of satellite-data processing and linear algebra. After tuning, three of the four applications achieve application-level I/O rates of over 100 MB/s on 16 processors. The total volume of I/O required by the programs ranged from about 75 MB to over 200 GB. We report the lessons learned in achieving high I/O performance from these applications, including the need for code restructuring, local disks on every node and knowledge of future I/O requests. We also report our experience on achieving high performance on peer-to-peer configurations. Finally, we comment on the necessity of complex I/O interfaces like collective I/O and strided requests to achieve high performance.","PeriodicalId":442608,"journal":{"name":"Workshop on I/O in Parallel and Distributed Systems","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121665520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 98
Efficient data-parallel files via automatic mode detection
Workshop on I/O in Parallel and Distributed Systems Pub Date : 1996-05-27 DOI: 10.1145/236017.236025
J. Moore, P. Hatcher, M. J. Quinn
{"title":"Efficient data-parallel files via automatic mode detection","authors":"J. Moore, P. Hatcher, M. J. Quinn","doi":"10.1145/236017.236025","DOIUrl":"https://doi.org/10.1145/236017.236025","url":null,"abstract":"Parallel languages rarely specify parallel I/O constructs, and existing commercial systems provide the programmer with a low-level I/O interface. We present design principles for integrating I/O into languages and show how these principles are applied to a virtual-processor-oriented language. We show how machine-independent modes are used to support both high performance and generality. We describe an automatic mode detection technique that saves the programmer from extra syntax and low-level file system details. We show how virtual processor file operations, typically small by themselves, are combined into efficient large-scale file system calls. Finally, we present a variety of benchmark results detailing design tradeoffs and the performance of various modes.","PeriodicalId":442608,"journal":{"name":"Workshop on I/O in Parallel and Distributed Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123743384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Scalable message passing in Panda
Workshop on I/O in Parallel and Distributed Systems Pub Date : 1996-05-27 DOI: 10.1145/236017.236042
Ying Chen, M. Winslett, K. Seamons, S. Kuo, Yong-Woon Cho, M. Subramaniam
{"title":"Scalable message passing in Panda","authors":"Ying Chen, M. Winslett, K. Seamons, S. Kuo, Yong-Woon Cho, M. Subramaniam","doi":"10.1145/236017.236042","DOIUrl":"https://doi.org/10.1145/236017.236042","url":null,"abstract":"To provide high performance for applications with a wide variety of i/o requirements and to support many different parallel platforms, the design of a parallel i/o system must provide for efficient utilization of available bandwidth both for disk traffic and for message passing. In this paper we discuss the message-passing scalability of the server-directed i/o architecture of Panda, a library for synchronized i/o of multidimensional arrays on parallel platforms. We show how to improve i/o performance in situations where message passing is a bottleneck, by combining the server-directed i/o strategy for highly efficient use of available disk bandwidth with new mechanisms to minimize internal communication and computation overhead in Panda. We present experimental results that show that with these improvements, Panda will provide high i/o performance for a wider range of applications, such as applications running with slow interconnects, applications performing i/o operations on large numbers of arrays, or applications that require drastic data rearrangements as data are moved between memory and disk (e.g., array transposition). We also argue that in the future, the improved approach to message passing will allow Panda to support applications that are not closely synchronized or that run in heterogeneous environments.","PeriodicalId":442608,"journal":{"name":"Workshop on I/O in Parallel and Distributed Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129770964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations
Workshop on I/O in Parallel and Distributed Systems Pub Date : 1996-05-27 DOI: 10.1145/236017.236029
Sivan Toledo, F. Gustavson
{"title":"The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations","authors":"Sivan Toledo, F. Gustavson","doi":"10.1145/236017.236029","DOIUrl":"https://doi.org/10.1145/236017.236029","url":null,"abstract":"SOLAR is a portable high-performance library for out-of-core dense matrix computations. It combines portability with high performance by using existing high-performance in-core subroutine libraries and by using an optimized matrix input-output library. SOLAR works on parallel computers, workstations, and personal computers. It supports in-core computations on both shared-memory and distributed-memory machines, and its matrix input-output library supports both conventional I/O interfaces and parallel I/O interfaces. This paper discusses the overall design of SOLAR, its interfaces, and the design of several important subroutines. Experimental results show that SOLAR can factor on a single workstation an out-of-core positive-definite symmetric matrix at a rate exceeding 215 Mflops, and an out-of-core general matrix at a rate exceeding 195 Mflops. Less than 16% of the running time is spent on I/O in these computations. These results indicate that SOLAR's portability does not compromise its performance. We expect that the combination of portability, modularity, and the use of a high-level I/O interface will make the library an important platform for research on out-of-core algorithms and on parallel I/O.","PeriodicalId":442608,"journal":{"name":"Workshop on I/O in Parallel and Distributed Systems","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133197322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 103
Bounds on the separation of two parallel disk models
Workshop on I/O in Parallel and Distributed Systems Pub Date : 1996-05-27 DOI: 10.1145/236017.236044
Chris Armen
{"title":"Bounds on the separation of two parallel disk models","authors":"Chris Armen","doi":"10.1145/236017.236044","DOIUrl":"https://doi.org/10.1145/236017.236044","url":null,"abstract":"The single-disk, D-head model of parallel I/O was introduced by Aggarwal and Vitter to analyze algorithms for problem instances that are too large to fit in primary memory. Subsequently Vitter and Shriver proposed a more realistic model in which the disk space is partitioned into D disks, with a single head per disk. To date, each problem for which there is a known optimal algorithm for both models has the same asymptotic bounds on both models. Therefore, it has been unknown whether the models are equivalent or whether the single-disk model is strictly more powerful. In this paper we provide evidence that the single-disk model is strictly more powerful. We prove a lower bound on any general simulation of the single-disk model on the multi-disk model and establish randomized and deterministic upper bounds. Let N be the problem size and let T be the number of parallel I/Os required by a program on the single-disk model. Then any simulation of this program on the multi-disk model will require Ω(T log(N/D) / log log(N/D)) parallel I/Os. This lower bound holds even if replication is allowed in the multi-disk model. We also show an O(log D / log log D) randomized upper bound and an O(log D (log log D)^2) deterministic upper bound. These results exploit an interesting analogy between the disk models and the PRAM and DCM models of parallel computation. Author affiliation: Department of Computer Science, University of Hartford, 200 Bloomfield Avenue, W. Hartford, CT 06117-1599. Email: armen@hartford.edu. This work was done while the author was a graduate student at Dartmouth College. IOPADS '96, Philadelphia PA, USA. © 1996 ACM 0-89791-813-4/96/05.","PeriodicalId":442608,"journal":{"name":"Workshop on I/O in Parallel and Distributed Systems","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131019475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8