Exploring MPI Collective I/O and File-per-process I/O for Checkpointing a Logical Inference Task

Ke Fan, Kristopher K. Micinski, Thomas Gilray, Sidharth Kumar
{"title":"Exploring MPI Collective I/O and File-per-process I/O for Checkpointing a Logical Inference Task","authors":"Ke Fan, Kristopher K. Micinski, Thomas Gilray, Sidharth Kumar","doi":"10.1109/IPDPSW52791.2021.00153","DOIUrl":null,"url":null,"abstract":"We present a scalable parallel I/O system for a logical-inferencing application built atop a deductive database. Deductive databases can make logical deductions (i.e. conclude additional facts), based on a set of program rules, derived from facts already in the database. Datalog is a language or family of languages commonly used to specify rules and queries for a deductive database. Applications built using Datalog can range from graph mining (such as computing transitive closure or k-cliques) to program analysis (control and data-flow analysis). In our previous papers, we presented the first implementation of a data-parallel Datalog built using MPI. In this paper, we present a parallel I/O system used to checkpoint and restart applications built on top of our Datalog system. State of the art Datalog implementations, such as Soufflé, only support serial I/O, mainly because the implementation itself does not support many-node parallel execution.Computing the transitive closure of a graph is one of the simplest logical-inferencing applications built using Datalog; we use it as a micro-benchmark to demonstrate the efficacy of our parallel I/O system. Internally, we use a nested B-tree data-structure to facilitate fast and efficient in-memory access to relational data. Our I/O system therefore involves two steps, converting the application data-layout (a nested B-tree) to a stream of bytes followed by the actual parallel I/O. We explore two popular I/O techniques POSIX I/O and MPI collective I/O. For extracting performance out of MPI Collective I/O we use adaptive striping, and for POSIX I/O we use file-per-process I/O. 
We demonstrate the scalability of our system at up to 4,096 processes on the Theta supercomputer at the Argonne National Laboratory.","PeriodicalId":170832,"journal":{"name":"2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW52791.2021.00153","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

We present a scalable parallel I/O system for a logical-inferencing application built atop a deductive database. Deductive databases can make logical deductions (i.e., conclude additional facts) based on a set of program rules, derived from facts already in the database. Datalog is a language, or family of languages, commonly used to specify rules and queries for a deductive database. Applications built using Datalog range from graph mining (such as computing transitive closure or k-cliques) to program analysis (control- and data-flow analysis). In our previous papers, we presented the first implementation of a data-parallel Datalog built using MPI. In this paper, we present a parallel I/O system used to checkpoint and restart applications built on top of our Datalog system. State-of-the-art Datalog implementations, such as Soufflé, support only serial I/O, mainly because the implementation itself does not support many-node parallel execution. Computing the transitive closure of a graph is one of the simplest logical-inferencing applications built using Datalog; we use it as a micro-benchmark to demonstrate the efficacy of our parallel I/O system. Internally, we use a nested B-tree data structure to facilitate fast and efficient in-memory access to relational data. Our I/O system therefore involves two steps: converting the application data layout (a nested B-tree) to a stream of bytes, followed by the actual parallel I/O. We explore two popular I/O techniques: POSIX I/O and MPI collective I/O. To extract performance from MPI collective I/O we use adaptive striping, and for POSIX I/O we use file-per-process I/O. We demonstrate the scalability of our system at up to 4,096 processes on the Theta supercomputer at Argonne National Laboratory.
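The transitive-closure micro-benchmark mentioned above can be sketched as a two-rule Datalog program evaluated semi-naively. The snippet below is a minimal single-process illustration in plain Python; the function name and the in-memory set representation are illustrative assumptions, not the paper's implementation, which distributes relations across MPI ranks using nested B-trees.

```python
def transitive_closure(edges):
    """Compute the transitive closure of a directed graph.

    Evaluates the Datalog program
        path(x, y) :- edge(x, y).
        path(x, z) :- path(x, y), edge(y, z).
    semi-naively: each iteration joins only the facts derived in the
    previous iteration (the "delta") against the edge relation, so
    already-known facts are not rederived.
    """
    path = set(edges)   # base case: path(x, y) :- edge(x, y).
    delta = set(edges)  # facts newly derived in the previous iteration
    while delta:
        # Join the delta against edge: path(x, z) :- path(x, y), edge(y, z).
        new = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = new - path  # keep only genuinely new facts
        path |= delta       # fixpoint is reached when no new facts appear
    return path

# Example: the chain 1 -> 2 -> 3 -> 4
print(sorted(transitive_closure({(1, 2), (2, 3), (3, 4)})))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Semi-naive evaluation is the standard fixpoint strategy for Datalog engines; in the parallel setting each rank would perform this join on its local partition and exchange newly derived tuples with the ranks that own them.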