arXiv - CS - Programming Languages最新文献_第10页

Empowering LLMs for Verilog Generation through Multi-Level Summarization 通过多层次总结为 Verilog 生成赋予 LLM 能力

arXiv - CS - Programming Languages Pub Date : 2024-07-15 DOI: arxiv-2407.10424

Yang Zhao, Di Huang, Chongxiao Li, Pengwei Jin, Ziyuan Nan, Tianyun Ma, Lei Qi, Yansong Pan, Zhenxing Zhang, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen

{"title":"Empowering LLMs for Verilog Generation through Multi-Level Summarization","authors":"Yang Zhao, Di Huang, Chongxiao Li, Pengwei Jin, Ziyuan Nan, Tianyun Ma, Lei Qi, Yansong Pan, Zhenxing Zhang, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen","doi":"arxiv-2407.10424","DOIUrl":"https://doi.org/arxiv-2407.10424","url":null,"abstract":"The increasing complexity and high costs associated with modern processor\u0000design have led to a surge in demand for processor design automation.\u0000Instruction-tuned large language models (LLMs) have demonstrated remarkable\u0000performance in automatically generating code for general-purpose programming\u0000languages like Python. However, these methods fail on hardware description\u0000languages (HDLs) like Verilog due to the scarcity of high-quality instruction\u0000tuning data, as even advanced LLMs like GPT-3.5 exhibit limited performance on\u0000Verilog generation. Regarding this issue, we observe that (1) Verilog code\u0000collected from the real world has higher quality than those generated by LLMs.\u0000(2) LLMs like GPT-3.5 excel in summarizing Verilog code rather than generating\u0000it. Based on these observations, this paper introduces CodeV, a series of\u0000open-source instruction-tuned Verilog generation LLMs. Instead of generating\u0000descriptions first and then getting the corresponding code from advanced LLMs,\u0000we prompt the LLM with Verilog code and let the LLM generate the corresponding\u0000natural language description by multi-level summarization. Experimental results\u0000show that CodeV relatively surpasses the previous open-source SOTA by 14.4%\u0000(BetterV in VerilogEval) and 11.3% (RTLCoder in RTLLM) respectively, and also\u0000relatively outperforms previous commercial SOTA GPT-4 by 22.1% in VerilogEval.","PeriodicalId":501197,"journal":{"name":"arXiv - CS - Programming Languages","volume":"39 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141718893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Curriculum Learning for Small Code Language Models 小代码语言模型的课程学习

arXiv - CS - Programming Languages Pub Date : 2024-07-14 DOI: arxiv-2407.10194

Marwa Naïr, Kamel Yamani, Lynda Said Lhadj, Riyadh Baghdadi

{"title":"Curriculum Learning for Small Code Language Models","authors":"Marwa Naïr, Kamel Yamani, Lynda Said Lhadj, Riyadh Baghdadi","doi":"arxiv-2407.10194","DOIUrl":"https://doi.org/arxiv-2407.10194","url":null,"abstract":"Code language models have emerged as useful tools for various programming\u0000tasks, yet they often struggle when it comes to complex ones. In this paper, we\u0000explore the potential of curriculum learning in enhancing the performance of\u0000these models. While prior research has suggested that curriculum learning does\u0000not necessarily help in improving the performance of language models, our\u0000results surprisingly show that this may not be the case for code language\u0000models. We demonstrate that a well-designed curriculum learning approach\u0000significantly improves the accuracy of small decoder-only code language models\u0000on the task of code execution, while its effect on code completion is less\u0000significant. To explore the potential of curriculum learning, we train multiple\u0000GPT models with 1 million parameters each to predict the next token and\u0000evaluate them on code completion and execution tasks. Our contributions include\u0000proposing a novel code difficulty assessment metric by combining software code\u0000measures, investigating the effectiveness of Curriculum Learning for code\u0000language models, and introducing a Novel Curriculum Learning schedule that\u0000enhances the performance of small decoder-only language models in code\u0000execution tasks. The results of this paper open the door for more research on\u0000the use of curriculum learning for code language models.","PeriodicalId":501197,"journal":{"name":"arXiv - CS - Programming Languages","volume":"37 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141718892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Defining Name Accessibility using Scope Graphs (Extended Edition) 使用范围图定义名称的可访问性（扩展版）

arXiv - CS - Programming Languages Pub Date : 2024-07-12 DOI: arxiv-2407.09320

Aron Zwaan, Casper Bach Poulsen

{"title":"Defining Name Accessibility using Scope Graphs (Extended Edition)","authors":"Aron Zwaan, Casper Bach Poulsen","doi":"arxiv-2407.09320","DOIUrl":"https://doi.org/arxiv-2407.09320","url":null,"abstract":"Many programming languages allow programmers to regulate accessibility; i.e.,\u0000annotating a declaration with keywords such as export and private to indicate\u0000where it can be accessed. Despite the importance of name accessibility for,\u0000e.g., compilers, editor auto-completion and tooling, and automated\u0000refactorings, few existing type systems provide a formal account of name\u0000accessibility. We present a declarative, executable, and language-parametric model for name\u0000accessibility, which provides a formal specification of name accessibility in\u0000Java, C#, C++, Rust, and Eiffel. We achieve this by defining name accessibility\u0000as a predicate on resolution paths through scope graphs. Since scope graphs are\u0000a language-independent model of name resolution, our model provides a uniform\u0000approach to defining different accessibility policies for different languages. Our model is implemented in Statix, a logic language for executable type\u0000system specification using scope graphs. We evaluate its correctness on a test\u0000suite that compares it with the C#, Java, and Rust compilers, and show we can\u0000synthesize access modifiers in programs with holes accurately.","PeriodicalId":501197,"journal":{"name":"arXiv - CS - Programming Languages","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141722173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Refinements for Multiparty Message-Passing Protocols: Specification-agnostic theory and implementation 多方消息传递协议的改进：与规范无关的理论与实现

arXiv - CS - Programming Languages Pub Date : 2024-07-12 DOI: arxiv-2407.09106

Vassor Martin, Yoshida Nobuko

{"title":"Refinements for Multiparty Message-Passing Protocols: Specification-agnostic theory and implementation","authors":"Vassor Martin, Yoshida Nobuko","doi":"arxiv-2407.09106","DOIUrl":"https://doi.org/arxiv-2407.09106","url":null,"abstract":"Multiparty message-passing protocols are notoriously difficult to design, due\u0000to interaction mismatches that lead to errors such as deadlocks. Existing\u0000protocol specification formats have been developed to prevent such errors (e.g.\u0000multiparty session types (MPST)). In order to further constrain protocols,\u0000specifications can be extended with refinements, i.e. logical predicates to\u0000control the behaviour of the protocol based on previous values exchanged.\u0000Unfortunately, existing refinement theories and implementations are tightly\u0000coupled with specification formats. This paper proposes a framework for\u0000multiparty message-passing protocols with refinements and its implementation in\u0000Rust. Our work decouples correctness of refinements from the underlying model\u0000of computation, which results in a specification-agnostic framework. Our\u0000contributions are threefold. First, we introduce a trace system which\u0000characterises valid refined traces, i.e. a sequence of sending and receiving\u0000actions correct with respect to refinements. Second, we give a correct model of\u0000computation named refined communicating system (RCS), which is an extension of\u0000communicating automata systems with refinements. We prove that RCS only produce\u0000valid refined traces. We show how to generate RCS from mainstream protocol\u0000specification formats, such as refined multiparty session types (RMPST) or\u0000refined choreography automata. Third, we illustrate the flexibility of the\u0000framework by developing both a static analysis technique and an improved model\u0000of computation for dynamic refinement evaluation. Finally, we provide a Rust\u0000toolchain for decentralised RMPST, evaluate our implementation with a set of\u0000benchmarks from the literature, and observe that refinement overhead is\u0000negligible.","PeriodicalId":501197,"journal":{"name":"arXiv - CS - Programming Languages","volume":"37 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141718894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Higher-Order Specificationsfor Deductive Synthesis of Programs with Pointers (Extended version) 带指针程序演绎合成的高阶规范（扩展版）

arXiv - CS - Programming Languages Pub Date : 2024-07-12 DOI: arxiv-2407.09143

David Young, Ziyi Yang, Ilya Sergey, Alex Potanin

{"title":"Higher-Order Specificationsfor Deductive Synthesis of Programs with Pointers (Extended version)","authors":"David Young, Ziyi Yang, Ilya Sergey, Alex Potanin","doi":"arxiv-2407.09143","DOIUrl":"https://doi.org/arxiv-2407.09143","url":null,"abstract":"Synthetic Separation Logic (SSL) is a formalism that powers SuSLik, the\u0000state-of-the-art approach for the deductive synthesis of provably-correct\u0000programs in C-like languages that manipulate Heap-based linked data structures.\u0000Despite its expressivity, SSL suffers from two shortcomings that hinder its\u0000utility. First, its main specification component, inductive predicates, only\u0000admits emph{first-order} definitions of data structure shapes, which leads to\u0000the proliferation of ``boiler-plate'' predicates for specifying common\u0000patterns. Second, SSL requires emph{concrete} definitions of data structures\u0000to synthesise programs that manipulate them, which results in the need to\u0000change a specification for a synthesis task every time changes are introduced\u0000into the layout of the involved structures. We propose to significantly lift the level of abstraction used in writing\u0000Separation Logic specifications for synthesis -- both simplifying the approach\u0000and making the specifications more usable and easy to read and follow. We avoid\u0000the need to repetitively re-state low-level representation details throughout\u0000the specifications -- allowing the reuse of different implementations of the\u0000same data structure by abstracting away the details of a specific layout used\u0000in memory. Our novel textit{high-level front-end language} called Pika\u0000significantly improves the expressiveness of SuSLik. We implemented a layout-agnostic synthesiser from Pika to SuSLik enabling\u0000push-button synthesis of C programs with in-place memory updates, along with\u0000the accompanying full proofs that they meet Separation Logic-style\u0000specifications, from high-level specifications that resemble ordinary\u0000functional programs. Our experiments show that our tool can produce C code that\u0000is comparable in its performance characteristics and is sometimes faster than\u0000Haskell.","PeriodicalId":501197,"journal":{"name":"arXiv - CS - Programming Languages","volume":"77 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141722174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Pragmatics of Formally Verified Yet Efficient Static Analysis, in particular for Formally Verified Compilers 形式验证但高效静态分析的实用性，尤其是针对形式验证编译器的静态分析

arXiv - CS - Programming Languages Pub Date : 2024-07-11 DOI: arxiv-2407.08258

David MonniauxVERIMAG - IMAG

引用次数: 0

Functional Programming in Learning Electromagnetic Theory 学习电磁理论中的函数式编程

arXiv - CS - Programming Languages Pub Date : 2024-07-10 DOI: arxiv-2407.08090

Scott N. WalckLebanon Valley College

引用次数: 0

Mover Logic: A Concurrent Program Logic for Reduction and Rely-Guarantee Reasoning (Extended Version) Mover Logic：用于还原和可靠保证推理的并行程序逻辑（扩展版）

arXiv - CS - Programming Languages Pub Date : 2024-07-10 DOI: arxiv-2407.08070

Cormac Flanagan, Stephen N. Freund

{"title":"Mover Logic: A Concurrent Program Logic for Reduction and Rely-Guarantee Reasoning (Extended Version)","authors":"Cormac Flanagan, Stephen N. Freund","doi":"arxiv-2407.08070","DOIUrl":"https://doi.org/arxiv-2407.08070","url":null,"abstract":"Rely-guarantee (RG) logic uses thread interference specifications (relies and\u0000guarantees) to reason about the correctness of multithreaded software.\u0000Unfortunately, RG logic requires each function postcondition to be \"stabilized\"\u0000or specialized to the behavior of other threads, making it difficult to write\u0000function specifications that are reusable at multiple call sites. This paper presents mover logic, which extends RG logic to address this\u0000problem via the notion of atomic functions. Atomic functions behave as if they\u0000execute serially without interference from concurrent threads, and so they can\u0000be assigned more general and reusable specifications that avoid the\u0000stabilization requirement of RG logic. Several practical verifiers (Calvin-R,\u0000QED, CIVL, Armada, Anchor, etc.) have demonstrated the modularity benefits of\u0000atomic function specifications. However, the complexity of these systems and\u0000their correctness proofs makes it challenging to understand and extend these\u0000systems. Mover logic formalizes the central ideas reduction in a declarative\u0000program logic that may provide foundation for future work in this area.","PeriodicalId":501197,"journal":{"name":"arXiv - CS - Programming Languages","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141609516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A Systematic Mapping Study on Teaching of Security Concepts in Programming Courses 程序设计课程中安全概念教学的系统映射研究

arXiv - CS - Programming Languages Pub Date : 2024-07-10 DOI: arxiv-2407.07511

Alina Torbunova, Adnan Ashraf, Ivan Porres

{"title":"A Systematic Mapping Study on Teaching of Security Concepts in Programming Courses","authors":"Alina Torbunova, Adnan Ashraf, Ivan Porres","doi":"arxiv-2407.07511","DOIUrl":"https://doi.org/arxiv-2407.07511","url":null,"abstract":"Context: To effectively defend against ever-evolving cybersecurity threats,\u0000software systems should be made as secure as possible. To achieve this,\u0000software developers should understand potential vulnerabilities and apply\u0000secure coding practices. To prepare these skilled professionals, it is\u0000important that cybersecurity concepts are included in programming courses\u0000taught at universities. Objective: To present a comprehensive and unbiased\u0000literature review on teaching of cybersecurity concepts in programming courses\u0000taught at universities. Method: We perform a Systematic Mapping Study. We\u0000present six research questions, define our selection criteria, and develop a\u0000classification scheme. Results and Conclusions: We select 24 publications. Our\u0000results show a wide range of research contributions. We also outline guidelines\u0000and identify opportunities for future studies. The guidelines include coverage\u0000of security knowledge categories and evaluation of contributions. We suggest\u0000that future studies should cover security issues, negative impacts, and\u0000countermeasures, as well as apply evaluation techniques that examine students'\u0000knowledge. The opportunities for future studies are related to advanced\u0000courses, security knowledge frameworks, and programming environments.\u0000Furthermore, there is a need of a holistic security framework that covers the\u0000security concepts identified in this study and is suitable for education.","PeriodicalId":501197,"journal":{"name":"arXiv - CS - Programming Languages","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141587028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Programming Language Case Studies Can Be Deep 编程语言案例研究可以很深入

arXiv - CS - Programming Languages Pub Date : 2024-07-10 DOI: arxiv-2407.08091

Rose BohrerWorcester Polytechnic Institute

引用次数: 0