{"title":"Leveraging Reviewer Experience in Code Review Comment Generation","authors":"Hong Yi Lin, Patanamon Thongtanunam, Christoph Treude, Michael W. Godfrey, Chunhua Liu, Wachiraphan Charoenwet","doi":"arxiv-2409.10959","DOIUrl":"https://doi.org/arxiv-2409.10959","url":null,"abstract":"Modern code review is a ubiquitous software quality assurance process aimed at identifying potential issues within newly written code. Despite its effectiveness, the process demands large amounts of effort from the human reviewers involved. To help alleviate this workload, researchers have trained deep learning models to imitate human reviewers in providing natural language code reviews. Formally, this task is known as code review comment generation. Prior work has demonstrated improvements in this task by leveraging machine learning techniques and neural models, such as transfer learning and the transformer architecture. However, the quality of the model-generated reviews remains sub-optimal due to the quality of the open-source code review data used in model training. This is in part due to the data obtained from open-source projects where code reviews are conducted in a public forum, and reviewers possess varying levels of software development experience, potentially affecting the quality of their feedback. To accommodate this variation, we propose a suite of experience-aware training methods that utilise the reviewers' past authoring and reviewing experiences as signals for review quality. Specifically, we propose experience-aware loss functions (ELF), which use the reviewers' authoring and reviewing ownership of a project as weights in the model's loss function. Through this method, experienced reviewers' code reviews yield larger influence over the model's behaviour. Compared to the SOTA model, ELF was able to generate higher quality reviews in terms of accuracy, informativeness, and comment types generated. The key contribution of this work is the demonstration of how traditional software engineering concepts such as reviewer experience can be integrated into the design of AI-based automated code review models.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context-Dependent Interactable Graphical User Interface Element Detection for VR Applications","authors":"Shuqing Li, Binchang Li, Yepang Liu, Cuiyun Gao, Jianping Zhang, Shing-Chi Cheung, Michael R. Lyu","doi":"arxiv-2409.10811","DOIUrl":"https://doi.org/arxiv-2409.10811","url":null,"abstract":"In recent years, Virtual Reality (VR) has emerged as a transformative technology, offering users immersive and interactive experiences across diversified virtual environments. Users can interact with VR apps through interactable GUI elements (IGEs) on the stereoscopic three-dimensional (3D) graphical user interface (GUI). The accurate recognition of these IGEs is instrumental, serving as the foundation of many software engineering tasks, including automated testing and effective GUI search. The most recent IGE detection approaches for 2D mobile apps typically train a supervised object detection model based on a large-scale manually-labeled GUI dataset, usually with a pre-defined set of clickable GUI element categories like buttons and spinners. Such approaches can hardly be applied to IGE detection in VR apps, due to a multitude of challenges including complexities posed by open-vocabulary and heterogeneous IGE categories, intricacies of context-sensitive interactability, and the necessities of precise spatial perception and visual-semantic alignment for accurate IGE detection results. Thus, it is necessary to embark on the IGE research tailored to VR apps. In this paper, we propose the first zero-shot cOntext-sensitive inteRactable GUI ElemeNT dEtection framework for virtual Reality apps, named Orienter. By imitating human behaviors, Orienter observes and understands the semantic contexts of VR app scenes first, before performing the detection. The detection process is iterated within a feedback-directed validation and reflection loop. Specifically, Orienter contains three components, including (1) Semantic context comprehension, (2) Reflection-directed IGE candidate detection, and (3) Context-sensitive interactability classification. Extensive experiments on the dataset demonstrate that Orienter is more effective than the state-of-the-art GUI element detection approaches.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler","authors":"Nazim Bendib, Iheb Nassim Aouadj, Riyadh Baghdadi","doi":"arxiv-2409.11068","DOIUrl":"https://doi.org/arxiv-2409.11068","url":null,"abstract":"Code optimization is a crucial task aimed at enhancing code performance. However, this process is often tedious and complex, highlighting the necessity for automatic code optimization techniques. Reinforcement Learning (RL), a machine learning technique, has emerged as a promising approach for tackling such complex optimization problems. In this project, we introduce the first RL environment for the MLIR compiler, dedicated to facilitating MLIR compiler research, and enabling automatic code optimization using Multi-Action Reinforcement Learning. We also propose a novel formulation of the action space as a Cartesian product of simpler action subspaces, enabling more efficient and effective optimizations. Experimental results demonstrate that our proposed environment allows for an effective optimization of MLIR operations, and yields comparable performance to TensorFlow, surpassing it in multiple cases, highlighting the potential of RL-based optimization in compiler frameworks.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LLM-Agent-UMF: LLM-based Agent Unified Modeling Framework for Seamless Integration of Multi Active/Passive Core-Agents","authors":"Amine B. Hassouna, Hana Chaari, Ines Belhaj","doi":"arxiv-2409.11393","DOIUrl":"https://doi.org/arxiv-2409.11393","url":null,"abstract":"The integration of tools in LLM-based agents overcame the difficulties of standalone LLMs and traditional agents' limited capabilities. However, the conjunction of these technologies and the proposed enhancements in several state-of-the-art works followed a non-unified software architecture resulting in a lack of modularity. Indeed, they focused mainly on functionalities and overlooked the definition of the components' boundaries within the agent. This caused terminological and architectural ambiguities among researchers, which we addressed in this paper by proposing a unified framework that establishes a clear foundation for LLM-based agents' development from both functional and software architectural perspectives. Our framework, LLM-Agent-UMF (LLM-based Agent Unified Modeling Framework), clearly distinguishes between the different components of an agent, setting LLMs and tools apart from a newly introduced element: the core-agent, playing the role of the central coordinator of the agent which comprises five modules: planning, memory, profile, action, and security, the latter often neglected in previous works. Differences in the internal structure of core-agents led us to classify them into a taxonomy of passive and active types. Based on this, we proposed different multi-core agent architectures combining unique characteristics of various individual agents. For evaluation purposes, we applied this framework to a selection of state-of-the-art agents, thereby demonstrating its alignment with their functionalities and clarifying the overlooked architectural aspects. Moreover, we thoroughly assessed four of our proposed architectures by integrating distinctive agents into hybrid active/passive core-agents' systems. This analysis provided clear insights into potential improvements and highlighted the challenges involved in the combination of specific agents.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"118 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SuperCoder2.0: Technical Report on Exploring the feasibility of LLMs as Autonomous Programmer","authors":"Anmol Gautam, Kishore Kumar, Adarsh Jha, Mukunda NS, Ishaan Bhola","doi":"arxiv-2409.11190","DOIUrl":"https://doi.org/arxiv-2409.11190","url":null,"abstract":"We present SuperCoder2.0, an advanced autonomous system designed to enhance software development through artificial intelligence. The system combines an AI-native development approach with intelligent agents to enable fully autonomous coding. Key focus areas include a retry mechanism with error output traceback, comprehensive code rewriting and replacement using Abstract Syntax Tree (AST) parsing to minimize linting issues, code embedding technique for retrieval-augmented generation, and a focus on localizing methods for problem-solving rather than identifying specific line numbers. The methodology employs a three-step hierarchical search space reduction approach for code base navigation and bug localization: (1) utilizing Retrieval Augmented Generation (RAG) and a Repository File Level Map to identify candidate files, (2) narrowing down to the most relevant files using a File Level Schematic Map, and (3) extracting 'relevant locations' within these files. Code editing is performed through a two-part module comprising CodeGeneration and CodeEditing, which generates multiple solutions at different temperature values and replaces entire methods or classes to maintain code integrity. A feedback loop executes repository-level test cases to validate and refine solutions. Experiments conducted on the SWE-bench Lite dataset demonstrate SuperCoder2.0's effectiveness, achieving correct file localization in 84.33% of cases within the top 5 candidates and successfully resolving 34% of test instances. This performance places SuperCoder2.0 fourth globally on the SWE-bench leaderboard. The system's ability to handle diverse repositories and problem types highlights its potential as a versatile tool for autonomous software development. Future work will focus on refining the code editing process and exploring advanced embedding models for improved natural language to code mapping.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Control-flow Reconstruction Attacks on Business Process Models","authors":"Henrik Kirchmann, Stephan A. Fahrenkrog-Petersen, Felix Mannhardt, Matthias Weidlich","doi":"arxiv-2409.10986","DOIUrl":"https://doi.org/arxiv-2409.10986","url":null,"abstract":"Process models may be automatically generated from event logs that contain as-is data of a business process. While such models generalize over the control-flow of specific, recorded process executions, they are often also annotated with behavioural statistics, such as execution frequencies. Based thereon, once a model is published, certain insights about the original process executions may be reconstructed, so that an external party may extract confidential information about the business process. This work is the first to empirically investigate such reconstruction attempts based on process models. To this end, we propose different play-out strategies that reconstruct the control-flow from process trees, potentially exploiting frequency annotations. To assess the potential success of such reconstruction attacks on process models, and hence the risks imposed by publishing them, we compare the reconstructed process executions with those of the original log for several real-world datasets.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Empirical Study of Sensitive Information in Logs","authors":"Roozbeh Aghili, Heng Li, Foutse Khomh","doi":"arxiv-2409.11313","DOIUrl":"https://doi.org/arxiv-2409.11313","url":null,"abstract":"Software logs, generated during the runtime of software systems, are essential for various development and analysis activities, such as anomaly detection and failure diagnosis. However, the presence of sensitive information in these logs poses significant privacy concerns, particularly regarding Personally Identifiable Information (PII) and quasi-identifiers that could lead to re-identification risks. While general data privacy has been extensively studied, the specific domain of privacy in software logs remains underexplored, with inconsistent definitions of sensitivity and a lack of standardized guidelines for anonymization. To address this gap, this study offers a comprehensive analysis of privacy in software logs from multiple perspectives. We start by performing an analysis of 25 publicly available log datasets to identify potentially sensitive attributes. Based on the result of this step, we focus on three perspectives: privacy regulations, research literature, and industry practices. We first analyze key data privacy regulations, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), to understand the legal requirements concerning sensitive information in logs. Second, we conduct a systematic literature review to identify common privacy attributes and practices in log anonymization, revealing gaps in existing approaches. Finally, we survey 45 industry professionals to capture practical insights on log anonymization practices. Our findings shed light on various perspectives of log privacy and reveal industry challenges, such as technical and efficiency issues, while highlighting the need for standardized guidelines. By combining insights from regulatory, academic, and industry perspectives, our study aims to provide a clearer framework for identifying and protecting sensitive information in software logs.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VulnLLMEval: A Framework for Evaluating Large Language Models in Software Vulnerability Detection and Patching","authors":"Arastoo Zibaeirad, Marco Vieira","doi":"arxiv-2409.10756","DOIUrl":"https://doi.org/arxiv-2409.10756","url":null,"abstract":"Large Language Models (LLMs) have shown promise in tasks like code translation, prompting interest in their potential for automating software vulnerability detection (SVD) and patching (SVP). To further research in this area, establishing a benchmark is essential for evaluating the strengths and limitations of LLMs in these tasks. Despite their capabilities, questions remain regarding whether LLMs can accurately analyze complex vulnerabilities and generate appropriate patches. This paper introduces VulnLLMEval, a framework designed to assess the performance of LLMs in identifying and patching vulnerabilities in C code. Our study includes 307 real-world vulnerabilities extracted from the Linux kernel, creating a well-curated dataset that includes both vulnerable and patched code. This dataset, based on real-world code, provides a diverse and representative testbed for evaluating LLM performance in SVD and SVP tasks, offering a robust foundation for rigorous assessment. Our results reveal that LLMs often struggle with distinguishing between vulnerable and patched code. Furthermore, in SVP tasks, these models tend to oversimplify the code, producing solutions that may not be directly usable without further refinement.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models","authors":"Shaznin Sultana, Sadia Afreen, Nasir U. Eisty","doi":"arxiv-2409.10490","DOIUrl":"https://doi.org/arxiv-2409.10490","url":null,"abstract":"The growing trend of vulnerability issues in software development as a result of a large dependence on open-source projects has received considerable attention recently. This paper investigates the effectiveness of Large Language Models (LLMs) in identifying vulnerabilities within codebases, with a focus on the latest advancements in LLM technology. Through a comparative analysis, we assess the performance of emerging LLMs, specifically Llama, CodeLlama, Gemma, and CodeGemma, alongside established state-of-the-art models such as BERT, RoBERTa, and GPT-3. Our study aims to shed light on the capabilities of LLMs in vulnerability detection, contributing to the enhancement of software security practices across diverse open-source repositories. We observe that CodeGemma achieves the highest F1-score of 58 and a Recall of 87, amongst the recent additions of large language models to detect software security vulnerabilities.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Confidence in Assurance 2.0 Cases","authors":"Robin Bloomfield, John Rushby","doi":"arxiv-2409.10665","DOIUrl":"https://doi.org/arxiv-2409.10665","url":null,"abstract":"An assurance case should provide justifiable confidence in the truth of a claim about some critical property of a system or procedure, such as safety or security. We consider how confidence can be assessed in the rigorous approach we call Assurance 2.0. Our goal is indefeasible confidence and we approach it from four different perspectives: logical soundness, probabilistic assessment, dialectical examination, and residual risks.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}