B. Guimarães, José Wesley de S. Magalhães, A. F. Silva, F. Pereira
{"title":"Synthesis of Benchmarks for the C Programming Language by Mining Software Repositories","authors":"B. Guimarães, José Wesley de S. Magalhães, A. F. Silva, F. Pereira","doi":"10.1145/3355378.3355380","DOIUrl":null,"url":null,"abstract":"Compilers are usually distributed with a test framework. This framework supports the task of tuning optimizations and static analyses. As an example, clang has a test suite that, in March 2019, counted 259 benchmarks. Although in principle a large collection, this number is small once we consider the needs of the automatic tuning techniques that became fashionable recently. To mitigate the problems caused by such lack of benchmarks, this paper introduces a technique that allows the automatic construction of compilable programs out of open-source repositories. Our approach has made it possible to build, in less than 24 hours, a collection with over 500 thousand functions that clang can compile. In this paper, we show that such abundance of data gives us precise information about the behavior of compiler optimizations, and lets us create accurate prediction models. This collection of benchmarks is today freely available to the open-source community.","PeriodicalId":429937,"journal":{"name":"Proceedings of the XXIII Brazilian Symposium on Programming Languages","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the XXIII Brazilian Symposium on Programming Languages","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3355378.3355380","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Compilers are usually distributed with a test framework. This framework supports the task of tuning optimizations and static analyses. As an example, clang has a test suite that, in March 2019, counted 259 benchmarks. Although in principle a large collection, this number is small once we consider the needs of the automatic tuning techniques that became fashionable recently. To mitigate the problems caused by such lack of benchmarks, this paper introduces a technique that allows the automatic construction of compilable programs out of open-source repositories. Our approach has made it possible to build, in less than 24 hours, a collection with over 500 thousand functions that clang can compile. In this paper, we show that such abundance of data gives us precise information about the behavior of compiler optimizations, and lets us create accurate prediction models. This collection of benchmarks is today freely available to the open-source community.