{"title":"Do Code Summarization Models Process Too Much Information? Function Signature May Be All What Is Needed","authors":"Xi Ding, Rui Peng, Xiangping Chen, Yuan Huang, Jing Bian, Zibin Zheng","doi":"10.1145/3652156","DOIUrl":"https://doi.org/10.1145/3652156","url":null,"abstract":"<p>With the fast development of large software projects, automatic code summarization techniques, which summarize the main functionalities of a piece of code using natural languages as comments, play essential roles in helping developers understand and maintain large software projects. Many research efforts have been devoted to building automatic code summarization approaches. Typical code summarization approaches are based on deep learning models. They transform the task into a sequence-to-sequence task, which inputs source code and outputs summarizations in natural languages. All code summarization models impose different input size limits, such as 50 to 10,000, for the input source code. However, how the input size limit affects the performance of code summarization models still remains under-explored. In this paper, we first conduct an empirical study to investigate the impacts of different input size limits on the quality of generated code comments. To our surprise, experiments on multiple models and datasets reveal that setting a low input size limit, such as 20, does not necessarily reduce the quality of generated comments. </p><p>Based on this finding, we further propose to use function signatures instead of full source code to summarize the main functionalities first and then input the function signatures into code summarization models. Experiments and statistical results show that inputs with signatures are, on average, more than 2 percentage points better than inputs without signatures and thus demonstrate the effectiveness of involving function signatures in code summarization. We also invite programmers to do a questionnaire to evaluate the quality of code summaries generated by two inputs with different truncation levels. The results show that function signatures generate, on average, 9.2% more high-quality comments than full code.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"2 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140127520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced White-Box Heuristics for Search-Based Fuzzing of REST APIs","authors":"Andrea Arcuri, Man Zhang, Juan Pablo Galeotti","doi":"10.1145/3652157","DOIUrl":"https://doi.org/10.1145/3652157","url":null,"abstract":"<p>Due to its importance and widespread use in industry, automated testing of REST APIs has attracted major interest from the research community in the last few years. However, most of the work in the literature has been focused on black-box fuzzing. Although existing fuzzers have been used to automatically find many faults in existing APIs, there are still several open research challenges that hinder the achievement of better results (e.g., in terms of code coverage and fault finding). For example, under-specified schemas are a major issue for black-box fuzzers. Currently, <span>EvoMaster</span> is the only existing tool that supports white-box fuzzing of REST APIs. In this paper, we provide a series of novel white-box heuristics, including for example how to deal with under-specified constrains in API schemas, as well as under-specified schemas in SQL databases. Our novel techniques are implemented as an extension to our open-source, search-based fuzzer <span>EvoMaster</span>. An empirical study on 14 APIs from the EMB corpus, plus one industrial API, shows clear improvements of the results in some of these APIs.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"89 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140098151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reducing the Impact of Time Evolution on Source Code Authorship Attribution via Domain Adaptation","authors":"Zhen Li, Shasha Zhao, Chen Chen, Qian Chen","doi":"10.1145/3652151","DOIUrl":"https://doi.org/10.1145/3652151","url":null,"abstract":"<p>Source code authorship attribution is an important problem in practical applications such as plagiarism detection, software forensics, and copyright disputes. Recent studies show that existing methods for source code authorship attribution can be significantly affected by time evolution, leading to a decrease in attribution accuracy year by year. To alleviate the problem that Deep Learning (DL)-based source code authorship attribution degrading in accuracy due to time evolution, we propose a new framework called <underline>Time</underline> <underline>D</underline>omain <underline>A</underline>daptation (TimeDA) by adding new feature extractors to the original DL-based code attribution framework that enhances the learning ability of the original model on source domain features without requiring new or more source data. Moreover, we employ a centroid-based pseudo-labeling strategy using neighborhood clustering entropy for adaptive learning to improve the robustness of DL-based code authorship attribution. Experimental results show that TimeDA can significantly enhance the robustness of DL-based source code authorship attribution to time evolution, with an average improvement of 8.7% on the Java dataset and 5.2% on the C++ dataset. In addition, our TimeDA benefits from employing the centroid-based pseudo-labeling strategy, which significantly reduced the model training time by 87.3% compared to traditional unsupervised domain adaptive methods.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"89 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140098252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhenpeng Chen, Jie M. Zhang, Max Hort, Mark Harman, Federica Sarro
{"title":"Fairness Testing: A Comprehensive Survey and Analysis of Trends","authors":"Zhenpeng Chen, Jie M. Zhang, Max Hort, Mark Harman, Federica Sarro","doi":"10.1145/3652155","DOIUrl":"https://doi.org/10.1145/3652155","url":null,"abstract":"<p>Unfair behaviors of Machine Learning (ML) software have garnered increasing attention and concern among software engineers. To tackle this issue, extensive research has been dedicated to conducting fairness testing of ML software, and this paper offers a comprehensive survey of existing studies in this field. We collect 100 papers and organize them based on the testing workflow (i.e., how to test) and testing components (i.e., what to test). Furthermore, we analyze the research focus, trends, and promising directions in the realm of fairness testing. We also identify widely-adopted datasets and open-source tools for fairness testing.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"87 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140098256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generating Python Type Annotations from Type Inference: How Far Are We?","authors":"Yimeng Guo, Zhifei Chen, Lin Chen, Wenjie Xu, Yanhui Li, Yuming Zhou, Baowen Xu","doi":"10.1145/3652153","DOIUrl":"https://doi.org/10.1145/3652153","url":null,"abstract":"<p>In recent years, dynamic languages such as Python have become popular due to their flexibility and productivity. The lack of static typing makes programs face the challenges of fixing type errors, early bug detection, and code understanding. To alleviate these issues, PEP 484 introduced optional type annotations for Python in 2014, but unfortunately, a large number of programs are still not annotated by developers. Annotation generation tools can utilize type inference techniques. However, several important aspects of type annotation generation are overlooked by existing works, such as in-depth effectiveness analysis, potential improvement exploration, and practicality evaluation. And it is unclear how far we have been and how far we can go. </p><p>In this paper, we set out to comprehensively investigate the effectiveness of type inference tools for generating type annotations, applying three categories of state-of-the-art tools on a carefully-cleaned dataset. First, we use a comprehensive set of metrics and categories, finding that existing tools have different effectiveness and cannot achieve both high accuracy and high coverage. Then, we summarize six patterns to present the limitations in type annotation generation. Next, we implement a simple but effective tool to demonstrate that existing tools can be improved in practice. Finally, we conduct a controlled experiment showing that existing tools can reduce the time spent annotating types and determine more precise types, but cannot reduce subjective difficulty. Our findings point out the limitations and improvement directions in type annotation generation, which can inspire future work.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"51 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140098355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stefanie Betz, Birgit Penzenstadler, Leticia Duboc, Ruzanna Chitchyan, Sedef Akinli Kocak, Ian Brooks, Shola Oyedeji, Jari Porras, Norbert Seyff, Colin C. Venters
{"title":"Lessons Learned from Developing a Sustainability Awareness Framework for Software Engineering using Design Science","authors":"Stefanie Betz, Birgit Penzenstadler, Leticia Duboc, Ruzanna Chitchyan, Sedef Akinli Kocak, Ian Brooks, Shola Oyedeji, Jari Porras, Norbert Seyff, Colin C. Venters","doi":"10.1145/3649597","DOIUrl":"https://doi.org/10.1145/3649597","url":null,"abstract":"<p><b>[Context and Motivation]</b> To foster a sustainable society within a sustainable environment, we must dramatically reshape our work and consumption activities, most of which are facilitated through software. Yet, most software engineers hardly consider the effects on the sustainability of the IT products and services they deliver. This issue is exacerbated by a lack of methods and tools for this purpose. </p><p><b>[Question/Problem]</b> Despite the practical need for methods and tools that explicitly support consideration of the effects that IT products and services have on the sustainability of their intended environments, such methods and tools remain largely unavailable. Thus, urgent research is needed to understand how to design such tools for the IT community properly. </p><p><b>[Principal Ideas/Results]</b> In this paper, we describe our experience using design science to create the Sustainability Awareness Framework (SusAF), which supports software engineers in anticipating and mitigating the potential sustainability effects during system development. More specifically, we identify and present the challenges faced during this process. </p><p><b>[Contribution]</b> The challenges that we have faced and addressed in the development of the SusAF are likely to be relevant to others who aim to create methods and tools to integrate sustainability analysis into their IT Products and Service development. Thus, the lessons learned in SusAF development are shared for the benefit of researchers and other professionals who design tools for that end.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"25 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140069865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daan Hommersom, Antonino Sabetta, Bonaventura Coppola, Dario Di Nucci, Damian A. Tamburri
{"title":"Automated Mapping of Vulnerability Advisories onto their Fix Commits in Open Source Repositories","authors":"Daan Hommersom, Antonino Sabetta, Bonaventura Coppola, Dario Di Nucci, Damian A. Tamburri","doi":"10.1145/3649590","DOIUrl":"https://doi.org/10.1145/3649590","url":null,"abstract":"<p>The lack of comprehensive sources of accurate vulnerability data represents a critical obstacle to studying and understanding software vulnerabilities (and their corrections). In this paper, we present an approach that combines heuristics stemming from practical experience and machine-learning (ML)—specifically, natural language processing (NLP)—to address this problem. Our method consists of three phases. First, we construct an <i>advisory record</i>\u0000object containing key information about a vulnerability that is extracted from an advisory, such those found in the National Vulnerability Database (NVD). These advisories are expressed in natural language. Second, using heuristics, a subset of candidate fix commits is obtained from the source code repository of the affected project, by filtering out commits that can be identified as unrelated to the vulnerability at hand. Finally, for each of the remaining candidate commits, our method builds a numerical feature vector reflecting the characteristics of the commit that are relevant to predicting its match with the advisory at hand. Based on the values of these feature vectors, our method produces a ranked list of candidate fixing commits. The score attributed by the ML model to each feature is kept visible to the users, allowing them to easily interpret the predictions. </p><p>We implemented our approach and we evaluated it on an open data set, built by manual curation, that comprises 2,391 known fix commits corresponding to 1,248 public vulnerability advisories. When considering the top-10 commits in the ranked results, our implementation could successfully identify at least one fix commit for up to 84.03% of the vulnerabilities (with a fix commit on the first position for 65.06% of the vulnerabilities). Our evaluation shows that our method can reduce considerably the manual effort needed to search OSS repositories for the commits that fix known vulnerabilities.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"69 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140033768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Martin P. Robillard, Deeksha M. Arya, Neil A. Ernst, Jin L.C. Guo, Maxime Lamothe, Mathieu Nassif, Nicole Novielli, Alexander Serebrenik, Igor Steinmacher, Klaas-Jan Stol
{"title":"Communicating Study Design Trade-offs in Software Engineering","authors":"Martin P. Robillard, Deeksha M. Arya, Neil A. Ernst, Jin L.C. Guo, Maxime Lamothe, Mathieu Nassif, Nicole Novielli, Alexander Serebrenik, Igor Steinmacher, Klaas-Jan Stol","doi":"10.1145/3649598","DOIUrl":"https://doi.org/10.1145/3649598","url":null,"abstract":"<p>Reflecting on the limitations of a study is a crucial part of the research process. In software engineering studies, this reflection is typically conveyed through discussions of study limitations or threats to validity. In current practice, such discussions seldom provide sufficient insight to understand the rationale for decisions taken before and during the study, and their implications. We revisit the practice of discussing study limitations and threats to validity and identify its weaknesses. We propose to refocus this practice of self-reflection to a discussion centered on the notion of <i>trade-offs</i>. We argue that documenting trade-offs allows researchers to clarify how the benefits of their study design decisions outweigh the costs of possible alternatives. We present guidelines for reporting trade-offs in a way that promotes a fair and dispassionate assessment of researchers’ work.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"33 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140019832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fine-Grained Coverage-Based Fuzzing - RCR Report","authors":"Wei-Cheng Wu, Bernard Nongpoh, Marwan Nour, Michaël Marcozzi, Sébastien Bardin, Christophe Hauser","doi":"10.1145/3649592","DOIUrl":"https://doi.org/10.1145/3649592","url":null,"abstract":"<p>This is the RCR report of the artifact for the paper ”Fine-Grained Coverage-Based Fuzzing”. The attached zip file contains scripts and pre-build binary programs to reproduce the results presented in the main paper. The artifact is released on Zenodo with DOI: 10.5281/zenodo.7275184. We claim the artifact to be available, functional and reusable. The technology skills needed to review the artifact is know to use Linux/Unix terminal and basic understanding of Docker.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"256 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139977860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Precisely Extracting Complex Variable Values from Android Apps","authors":"Marc Miltenberger, Steven Arzt","doi":"10.1145/3649591","DOIUrl":"https://doi.org/10.1145/3649591","url":null,"abstract":"<p>Millions of users nowadays rely on their smartphones to process sensitive data through apps from various vendors and sources. Therefore, it is vital to assess these apps for security vulnerabilities and privacy violations. Information such as to which server an app connects through which protocol, and which algorithm it applies for encryption are usually encoded as variable values and arguments of API calls. However, extracting these values from an app is not trivial. The source code of an app is usually not available, and manual reverse engineering is cumbersome with binary sizes in the tens of megabytes. Current automated tools, on the other hand, cannot retrieve values that are computed at runtime through complex transformations. </p><p>In this paper, we present <span>ValDroid</span>, a novel static analysis tool for automatically extracting the set of possible values for a given variable at a given statement in the Dalvik byte code of an Android app. We evaluate <span>ValDroid</span>\u0000against existing approaches (JSA, Violist, DroidRA, Harvester, BlueSeal, StringHound, IC3, COAL) on benchmarks and 794 real-world apps. <span>ValDroid</span> greatly outperforms existing tools. It provides an average <i>F</i><sub>1</sub> score of more than 90%, while only requiring 0.1 seconds per value on average. For many data types including Network Connections and Dynamic Code Loading, its recall is more than twice the recall of the best existing approaches.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"5 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2024-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139978081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}