OpenLS-DGF: An Adaptive Open-Source Dataset Generation Framework for Machine-Learning Tasks in Logic Synthesis
Authors: Liwei Ni; Rui Wang; Miao Liu; Xingyu Meng; Xiaoze Lin; Junfeng Liu; Guojie Luo; Zhufei Chu; Weikang Qian; Xiaoyan Yang; Biwei Xie; Xingquan Li; Huawei Li
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 10, pp. 3830-3843
Publication date: 2025-03-27
DOI: 10.1109/TCAD.2025.3555506
IEEE Xplore: https://ieeexplore.ieee.org/document/10943238/
This article introduces OpenLS-DGF, an adaptive dataset generation framework for logic synthesis, designed to support machine-learning (ML) applications within the logic synthesis process. Previous dataset generation flows were tailored to specific tasks or lacked integrated ML capabilities. In contrast, OpenLS-DGF supports a variety of ML tasks by encapsulating the three fundamental steps of logic synthesis: 1) Boolean representation; 2) logic optimization; and 3) technology mapping. It preserves the original information in both Verilog and ML-friendly GraphML formats. The Verilog files make the flow semi-customizable: researchers can insert additional steps and incrementally refine the generated dataset. Furthermore, OpenLS-DGF includes an adaptive circuit engine that facilitates final dataset management and downstream tasks. The generated OpenLS-D-v1 dataset comprises 46 combinational designs from established benchmarks, totaling over 966,000 Boolean circuits. OpenLS-D-v1 also supports the integration of new data features, which makes it adaptable to new tasks. This article demonstrates the versatility of OpenLS-D-v1 through four distinct downstream tasks: circuit classification, circuit ranking, quality of results (QoR) prediction, and probability prediction. Each task is chosen to represent an essential step of logic synthesis, and the experimental results indicate that the dataset generated by OpenLS-DGF is both diverse and applicable across these tasks. The source code and datasets are available at https://github.com/Logic-Factory/ACE/blob/master/OpenLS-DGF.
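To make the "ML-friendly GraphML" point concrete, the following is a minimal sketch of how a circuit stored as GraphML might be turned into simple graph features for a downstream task such as QoR prediction. It uses only the Python standard library. The toy circuit, the node attribute name `type`, its values (`PI`, `AND`, `NOT`, `PO`), and the helper names `load_circuit` and `summarize` are illustrative assumptions; the abstract does not describe the actual OpenLS-D-v1 schema or the API of the adaptive circuit engine.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical GraphML for y = NOT(a AND b). The "type" attribute and its
# values are assumptions for illustration, not the OpenLS-D-v1 schema.
GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="type" attr.type="string"/>
  <graph id="toy" edgedefault="directed">
    <node id="a"><data key="d0">PI</data></node>
    <node id="b"><data key="d0">PI</data></node>
    <node id="g1"><data key="d0">AND</data></node>
    <node id="g2"><data key="d0">NOT</data></node>
    <node id="y"><data key="d0">PO</data></node>
    <edge source="a" target="g1"/>
    <edge source="b" target="g1"/>
    <edge source="g1" target="g2"/>
    <edge source="g2" target="y"/>
  </graph>
</graphml>
"""

# GraphML elements live in this XML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def load_circuit(graphml_text):
    """Parse a GraphML string into ({node_id: type}, [(source, target), ...])."""
    root = ET.fromstring(graphml_text)
    # Map GraphML key ids (e.g. "d0") to their human-readable attribute names.
    key_names = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", NS)}
    node_types = {}
    for node in root.findall("g:graph/g:node", NS):
        attrs = {key_names.get(d.get("key")): d.text for d in node.findall("g:data", NS)}
        node_types[node.get("id")] = attrs.get("type", "UNKNOWN")
    edges = [(e.get("source"), e.get("target"))
             for e in root.findall("g:graph/g:edge", NS)]
    return node_types, edges


def summarize(node_types, edges):
    """Compute node-type counts, edge count, and logic depth of an acyclic circuit."""
    fanin = {n: [] for n in node_types}
    for src, dst in edges:
        fanin[dst].append(src)

    level = {}

    def depth(n):
        # Level of a node = 1 + max level of its fan-ins; inputs are level 0.
        # Recursive for brevity; a large circuit would need an iterative pass.
        if n not in level:
            level[n] = 1 + max((depth(p) for p in fanin[n]), default=-1)
        return level[n]

    return {
        "type_counts": dict(Counter(node_types.values())),
        "num_edges": len(edges),
        "depth": max(depth(n) for n in node_types),
    }


node_types, edges = load_circuit(GRAPHML)
features = summarize(node_types, edges)
print(features)
```

Features like these could serve as a baseline input to a classifier or regressor. A graph neural network would more likely consume the node and edge lists directly, which is the main reason a graph format is convenient for ML pipelines.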
About the journal:
The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.