Fast data-dependence profiling through prior static analysis
Mohammad Norouzi, Nicolas Morew, Qamar Ilias, Lukas Rothenberger, Ali Jannesari, Felix Wolf
Parallel Computing, Volume 119, Article 103063 (published 2024-01-11). DOI: 10.1016/j.parco.2024.103063
https://www.sciencedirect.com/science/article/pii/S0167819124000012
Citations: 0
Abstract
Data-dependence profiling is a program-analysis technique for detecting parallelism opportunities in sequential programs. It captures the data dependences that actually occur during program execution, filtering out parallelism-preventing dependences that purely static methods assume only because they lack critical runtime information, such as the values of pointers and array indices. Profiling, however, suffers from high runtime overhead. In our earlier work, we accelerated data-dependence profiling by excluding polyhedral loops, which certain compilers can handle statically, and by eliminating scalar variables that create statically identifiable data dependences. In this paper, we combine the two methods and integrate them into DiscoPoP, a data-dependence profiler and parallelism discovery tool. Additionally, we detect reduction patterns statically and unify the three static analyses within the DiscoPoP framework, significantly diminishing the profiling overhead for a wider range of programs. We have evaluated our unified approach on 49 benchmarks from three benchmark suites and two computer simulation applications. The evaluation results show that our approach reports fewer false positive and false negative data dependences than the original data-dependence profiler and reduces the profiling time by at least 43%, with a median reduction of 76% across all programs. Also, we identify 40% of the reduction cases statically and eliminate the associated profiling overhead for these cases.
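To illustrate the distinction the abstract draws, consider the following minimal sketch in C (hypothetical examples, not taken from the paper). In the first loop, whether a loop-carried dependence exists depends on the runtime contents of idx, so a purely static analyzer must conservatively assume one, while a profiler observes the actual index values; the second loop is a reduction whose fixed update pattern (sum += ...) can be recognized statically, which is the kind of case the paper's static reduction detection removes from profiling.

```c
#include <stdio.h>

/* Case 1: a static analyzer must assume a loop-carried dependence,
 * because out[idx[i]] could alias out[idx[j]] for i != j. A profiler
 * sees the actual indices at runtime; if idx never repeats a value,
 * the iterations are independent and the loop is parallelizable. */
void scatter(double *out, const double *in, const int *idx, int n) {
    for (int i = 0; i < n; i++)
        out[idx[i]] = in[i] * 2.0;
}

/* Case 2: a reduction. The loop-carried dependence on sum follows a
 * fixed associative update scheme, so it can be detected statically
 * and excluded from runtime profiling. */
double sum_array(const double *a, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

int main(void) {
    double in[4] = {1.0, 2.0, 3.0, 4.0}, out[4];
    int idx[4] = {3, 2, 1, 0};  /* a permutation: no two iterations write the same element */
    scatter(out, in, idx, 4);
    printf("sum = %f\n", sum_array(out, 4));
    return 0;
}
```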
About the journal:
Parallel Computing is an international journal presenting the practical use of parallel computer systems, including high-performance architecture, system software, programming systems and tools, and applications. Within this context, the journal covers all aspects of high-end parallel computing, from single homogeneous or heterogeneous computing nodes to large-scale multi-node systems.
Parallel Computing features original research work and review articles, as well as novel or illustrative accounts of application experience with (and techniques for) the use of parallel computers. We also welcome studies that reproduce prior publications, whether they confirm or disprove the published results.
Particular technical areas of interest include, but are not limited to:
-System software for parallel computer systems, including programming languages (new languages as well as compilation techniques), operating systems (including middleware), and resource management (scheduling and load-balancing).
-Enabling software, including debuggers, performance tools, and system and numeric libraries.
-General hardware (architecture) concepts, new technologies enabling the realization of such new concepts, and details of commercially available systems.
-Software engineering and productivity as they relate to parallel computing.
-Applications (including scientific computing, deep learning, and machine learning) or tool case studies demonstrating novel ways to achieve parallelism.
-Performance measurement results on state-of-the-art systems.
-Approaches to effectively utilize large-scale parallel computing, including new algorithms or algorithm analysis with demonstrated relevance to real applications on existing or next-generation parallel computer architectures.
-Parallel I/O systems, both hardware and software.
-Networking technology supporting high-speed computing, demonstrating the impact of high-speed computation on parallel applications.