Parallel I/O subsystems in massively parallel supercomputers

D. Feitelson, P. Corbett, S. J. Baylor, Yarsun Hsu
DOI: 10.1109/M-PDT.1995.414842
Journal: IEEE Parallel & Distributed Technology: Systems & Applications
Published: 1995-09-01 (Journal Article)
Citations: 33

Abstract

Applications on MPPs often require a high aggregate bandwidth of low-latency I/O to secondary storage. This requirement can be met by internal parallel I/O subsystems that comprise dedicated I/O nodes, each with a processor, memory, and disks.

Massively parallel processors (MPPs), encompassing from tens to thousands of processors, are emerging as a major architecture for high-performance computers. Most major computer vendors offer computers with some degree of parallelism, and many smaller vendors specialize in producing MPPs. These machines are targeted at both grand-challenge problems and general-purpose computing.

Like any computer, an MPP's architectural design must balance computation, memory bandwidth and capacity, communication capabilities, and I/O. In the past, most design research focused on the basic compute and communications hardware and software. This led to unbalanced computers with relatively poor I/O performance. Recently, researchers have focused on designing hardware and software for I/O subsystems in MPPs. Consequently, most current MPPs have an architecture based on an internal parallel I/O subsystem (the "Architectures with parallel I/O" sidebar describes some examples). In these computers, this subsystem encompasses a collection of I/O nodes, each managing and providing I/O access to a set of disks. The I/O nodes connect to other nodes in the system by the same switching network that connects the compute nodes.

In this article we'll examine why many MPPs use parallel I/O subsystems, what architecture is best for such a subsystem, and how to implement the subsystem. We'll also discuss how parallel file systems and their user interfaces can exploit the parallel I/O to provide enhanced services to applications.

The systems discussed in this article are mostly tightly coupled distributed-memory MIMD (multiple-instruction, multiple-data) MPPs. In some cases, we also discuss shared-memory and SIMD (single-instruction, multiple-data) machines.

We'll discuss three node types. Compute nodes are optimized to perform floating-point and numeric calculations, and have no local disk except perhaps for paging, booting, and operating-system software. I/O nodes contain the system's secondary storage and provide the parallel file-system services. Gateway nodes provide connectivity to external data servers and mass-storage systems. In some cases, individual nodes can serve as more than one type; for example, the same nodes often handle I/O and gateway functions. The "Terminology" sidebar defines some other terms used in this article.
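The aggregate-bandwidth idea behind such subsystems can be sketched in a few lines: a parallel file system typically stripes a file's blocks round-robin across the I/O nodes, so that a large transfer engages every node's disks at once. The sketch below is illustrative only (the function name and layout are assumptions, not the paper's or any particular file system's scheme):

```python
# Illustrative sketch (not from the article): round-robin striping of a
# file's blocks across dedicated I/O nodes. Because consecutive blocks
# land on different nodes, a large sequential read or write can proceed
# on all I/O nodes in parallel, so bandwidth scales with the node count.

def stripe_blocks(file_size, block_size, num_io_nodes):
    """Map each block index of a file to an I/O node, round-robin.

    Returns a dict: io_node -> sorted list of block indices it stores.
    """
    num_blocks = (file_size + block_size - 1) // block_size  # ceiling division
    layout = {}
    for block in range(num_blocks):
        layout.setdefault(block % num_io_nodes, []).append(block)
    return layout

# A 1 MiB file in 64 KiB blocks (16 blocks) striped over 4 I/O nodes:
layout = stripe_blocks(1 << 20, 64 << 10, 4)
for node in sorted(layout):
    print(f"I/O node {node}: blocks {layout[node]}")
```

Each of the 4 nodes ends up holding 4 of the 16 blocks, so a full-file transfer can draw on all four nodes' disk and network bandwidth simultaneously.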