AnnotationGym: A Generic Framework for Automatic Source Code Annotation

IF 3.6; Zone 3 (Computer Science); Q2 in COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Access Pub Date : 2025-09-03 DOI:10.1109/ACCESS.2025.3605852

Hafsah Shahzad;Ahmed Sanaullah;Sanjay Arora;Ulrich Drepper;Martin C. Herbordt

IEEE Access, vol. 13, pp. 155321-155339. Full text: https://ieeexplore.ieee.org/document/11148243/


Abstract

A common approach to code optimization is to insert compiler hints into the source code as annotations. Two major challenges in using annotations effectively are their complexity and their lack of portability: significant developer expertise is required, and the supported annotations, as well as their syntax and use, can vary substantially. Moreover, there is currently no tool that can output performant annotated code for different back-ends. To address these challenges, we present AnnotationGym, an easy-to-use, open-source, generic infrastructure that supplements or replaces the developer in annotating source code. It demonstrates a novel application of AI methods to code annotation. Beyond improving code performance, the flexibility of AnnotationGym enables easy comparison of performance and optimization strategies across compilers and target architectures, and thus provides an extensible platform for further progress in this field. AnnotationGym automatically extracts structured information about the target code and compiler to generate a list of possible annotations. AI-based optimization algorithms then traverse this space to determine the best set of annotations for the developer's goals. To demonstrate its effectiveness, we run AnnotationGym on popular, representative workloads from the Polybench suite, targeting various compilers (GCC, AMD HLS, Intel HLS), optimization algorithms (Reinforcement Learning, Bayesian Optimization), and architectures (CPU, FPGA). We also test our approach on FPGA codes that are derived from, e.g., the Rodinia and OpenDwarfs benchmarks and hand-optimized using standard best practices. An interesting finding is that the best overall performance obtained by AnnotationGym was generally obtained with unoptimized codes.
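The two steps the abstract describes (generate a list of possible annotations, then search that space for the best set) can be illustrated with a minimal sketch. This is not AnnotationGym's code or API. The loop names, `CHOICES`, `render`, `toy_cost`, and `random_search` are hypothetical, the cost function is a made-up stand-in for compiling and measuring a real program, and plain random search is swapped in for the Reinforcement Learning and Bayesian Optimization used in the paper. The pragma strings are common spellings for the three compilers named in the abstract, included only to show how one abstract choice is written differently per back-end.

```python
import random

# Back-end specific spellings of the same abstract choices (illustrative subset).
PRAGMAS = {
    "gcc": {"unroll": "#pragma GCC unroll {n}", "hint": "#pragma GCC ivdep"},
    "amd_hls": {"unroll": "#pragma HLS unroll factor={n}",
                "hint": "#pragma HLS pipeline II=1"},
    "intel_hls": {"unroll": "#pragma unroll {n}", "hint": "#pragma ii 1"},
}

# Back-end independent choices available at every loop: (kind, unroll factor).
CHOICES = [("none", 0), ("hint", 0), ("unroll", 2), ("unroll", 4), ("unroll", 8)]


def render(backend, choice):
    """Turn an abstract choice into the pragma text for one back-end."""
    kind, n = choice
    if kind == "none":
        return ""
    return PRAGMAS[backend][kind].format(n=n)


def toy_cost(assignment):
    """Hypothetical cost model (lower is better).

    A real setup would compile the annotated code and measure runtime or
    resource use; the numbers here are invented for illustration only.
    """
    cost = 100.0
    for loop, (kind, n) in assignment.items():
        if kind == "unroll":
            # Pretend unrolling helps the inner loop and hurts the outer one.
            cost += -4.0 * n if loop == "inner" else 1.5 * n
        elif kind == "hint":
            cost -= 10.0 if loop == "inner" else 0.0
    return cost


def random_search(loops, trials=200, seed=0):
    """Sample random annotation sets and keep the cheapest one seen."""
    rng = random.Random(seed)
    best = {loop: ("none", 0) for loop in loops}  # unannotated baseline
    best_cost = toy_cost(best)
    for _ in range(trials):
        trial = {loop: rng.choice(CHOICES) for loop in loops}
        cost = toy_cost(trial)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best, best_cost


if __name__ == "__main__":
    best, cost = random_search(["outer", "inner"])
    print("toy cost:", cost)
    for backend in PRAGMAS:
        for loop, choice in best.items():
            print(backend, loop, repr(render(backend, choice)))
```

Keeping the search space back-end independent and rendering pragma text only at the end is one possible design. It lets the same search result be printed for each compiler, which mirrors the portability problem the abstract raises.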


Source journal

IEEE Access (COMPUTER SCIENCE, INFORMATION SYSTEMS; ENGINEERING, ELECTRICAL & ELECTRONIC)

CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks

Journal description: IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest. IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on: Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals. Practical articles discussing new experiments or measurement techniques, interesting solutions to engineering. Development of new or improved fabrication or manufacturing techniques. Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.