Jotai: A methodology for the generation of executable C benchmarks

Cecília Conde Kind, Michael Canesche, Fernando Magno Quintão Pereira

Journal of Computer Languages, Volume 85, Article 101368 (2025). DOI: 10.1016/j.cola.2025.101368
Abstract
This paper presents a methodology for automatically generating well-defined executable benchmarks in C. The generation process is fully automatic: C files are extracted from open-source repositories and split into compilation units. A type reconstructor infers all the types and declarations required to ensure that each function compiles. The generation of inputs is guided by constraints specified via a domain-specific language. This DSL refines the types of functions, for instance, by relating integer arguments to the lengths of buffers. Off-the-shelf tools such as AddressSanitizer and Kcc filter out programs with undefined behavior. To demonstrate applicability, this paper analyzes the dynamic behavior of different collections of benchmarks, some with up to 30 thousand samples, to support several observations: (i) the speedup of optimizations does not follow a normal distribution, a property assumed by statistical tests such as the T-test and the Z-test; (ii) there is a strong correlation between the number of instructions fetched and the running time on x86 and ARM processors; hence, the former (a non-varying quantity) can be used as a proxy for the latter (a varying quantity) in the autotuning of compilation tasks. The apparatus to generate benchmarks is publicly available. A collection of 18 thousand programs thus produced is also available as a CompilerGym dataset.
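To make the pipeline concrete, the sketch below shows what a generated benchmark might look like for an extracted function whose signature pairs an integer argument with a buffer. This is a minimal illustration, not the paper's actual output format: the function `sum`, the driver structure, and the specific constraint (that `n` equals the buffer's length) are assumptions introduced here for exposition.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical extracted compilation unit.  A DSL constraint of
 * the kind the abstract describes would refine this signature so
 * that `n` is interpreted as the length of `buf`. */
int sum(int n, int *buf) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += buf[i];
    return total;
}

/* Generated driver: builds an input satisfying n == length(buf),
 * so every access in sum() stays in bounds and the call is
 * well defined. */
int main(void) {
    int n = 16;                     /* constrained buffer length */
    int *buf = malloc(n * sizeof(int));
    if (!buf)
        return 1;
    for (int i = 0; i < n; i++)
        buf[i] = i;                 /* deterministic input values */
    printf("%d\n", sum(n, buf));
    free(buf);
    return 0;
}
```

Compiling such a driver with a sanitizer enabled (e.g., `cc -fsanitize=address driver.c`) and running the binary is one way the generator can discard samples: any out-of-bounds access or similar undefined behavior produces a sanitizer report, and the offending program is filtered out of the collection.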