{"title":"An Empirical Study of Challenges in Converting Deep Learning Models","authors":"Moses Openja, Amin Nikanjam, Ahmed Haj Yahmed, Foutse Khomh, Zhengyong Jiang","doi":"10.1109/ICSME55016.2022.00010","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00010","url":null,"abstract":"There is an increase in deploying Deep Learning (DL)-based software systems in real-world applications. Usually, DL models are developed and trained using DL frameworks like TensorFlow and PyTorch. Each framework has its own internal mechanisms/formats to represent and train DL models (deep neural networks), and usually those formats cannot be recognized by other frameworks. Moreover, trained models are usually deployed in environments different from where they were developed. To solve the interoperability issue and make DL models compatible with different frameworks/environments, some exchange formats are introduced for DL models, like ONNX and CoreML. However, ONNX and CoreML were never empirically evaluated by the community to reveal their prediction accuracy, performance, and robustness after conversion. Poor accuracy or non-robust behavior of converted models may lead to poor quality of deployed DL-based software systems. We conduct, in this paper, the first empirical study to assess ONNX and CoreML for converting trained DL models. In our systematic approach, two popular DL frameworks, Keras and PyTorch, are used to train five widely used DL models on three popular datasets. The trained models are then converted to ONNX and CoreML and transferred to two runtime environments designated for such formats, to be evaluated. We investigate the prediction accuracy before and after conversion. Our results unveil that the prediction accuracy of converted models are at the same level of originals. The performance (time cost and memory consumption) of converted models are studied as well. The size of models are reduced after conversion, which can result in optimized DL-based software deployment. We also study the adversarial robustness of converted models to make sure about the robustness of deployed DL-based software. Leveraging the state-of-the-art adversarial attack approaches, converted models are generally assessed robust at the same level of originals. However, obtained results show that CoreML models are more vulnerable to adversarial attacks compared to ONNX. The general message of our findings is that DL developers should be cautious on the deployment of converted models that may 1) perform poorly while switching from one framework to another, 2) have challenges in robust deployment, or 3) run slowly, leading to poor quality of deployed DL-based software, including DL-based software maintenance tasks, like bug prediction.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"239 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134415573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chi Yu, Guang Yang, Xiang Chen, Ke Liu, Yanlin Zhou
{"title":"BashExplainer: Retrieval-Augmented Bash Code Comment Generation based on Fine-tuned CodeBERT","authors":"Chi Yu, Guang Yang, Xiang Chen, Ke Liu, Yanlin Zhou","doi":"10.1109/ICSME55016.2022.00016","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00016","url":null,"abstract":"Developers use shell commands for many tasks, such as file system management, network control, and process management. Bash is one of the most commonly used shells and plays an important role in Linux system development and maintenance. Due to the language flexibility of Bash code, developers who are not familiar with Bash often have difficulty understanding the purpose and functionality of Bash code. In this study, we study Bash code comment generation problem and proposed an automatic method BASHEXPLAINER based on two-stage training strategy. In the first stage, we train a Bash encoder by fine-tuning CodeBERT on our constructed Bash code corpus. In the second stage, we first retrieve the most similar code from the code repository for the target code based on semantic and lexical similarity. Then we use the trained Bash encoder to generate two vector representations. Finally, we fuse these two vector representations via the fusion layer and generate the code comment through the decoder. To show the competitiveness of our proposed method, we construct a high-quality corpus by combining the corpus shared in the previous NL2Bash study and the corpus shared in the NLC2CMD competition. This corpus contains 10,592 Bash codes and corresponding comments. Then we selected ten baselines from previous studies on automatic code comment generation, which cover information retrieval methods, deep learning methods, and hybrid methods. The experimental results show that in terms of the performance measures BLEU-3/4, METEOR, and ROUGR-L, BASHEXPLAINER can outperform all baselines by at least 8.75%, 9.29%, 4.77% and 3.86%. Then we design ablation experiments to show the component setting rationality of BASHEXPLAINER. Later, we conduct a human study to further show the competitiveness of BASHEXPLAINER. Finally, we develop a browser plug-in based on BASHEXPLAINER to facilitate the understanding of the Bash code for developers.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"1710 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134506439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
I. Irsan, Ting Zhang, Ferdian Thung, David Lo, Lingxiao Jiang
{"title":"AutoPRTitle: A Tool for Automatic Pull Request Title Generation","authors":"I. Irsan, Ting Zhang, Ferdian Thung, David Lo, Lingxiao Jiang","doi":"10.1109/ICSME55016.2022.00058","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00058","url":null,"abstract":"With the rise of the pull request mechanism in software development, the quality of pull requests has gained more attention. Prior works focus on improving the quality of pull request descriptions and several approaches have been proposed to automatically generate pull request descriptions. As an essential component of a pull request, pull request titles have not received a similar level of attention. To further facilitate automation in software development and to help developers draft high-quality pull request titles, we introduce AutoPRTitle. AutoPRTitle is specifically designed to generate pull request titles automatically. AutoPRTitle can generate a precise and succinct pull request title based on the pull request description, commit messages, and the associated issue titles. AutoPRTitle is built upon a state-of-the-art text summarization model, BART, which has been pre-trained on large-scale English corpora. We further fine-tuned BART in a pull request dataset containing high-quality pull request titles. We implemented AutoPRTitle as a stand-alone web application. We conducted two sets of evaluations: one concerning the model accuracy and the other concerning the tool usability. For model accuracy, BART outperforms the best baseline by 24.6%, 40.5%, and 23.3%, respectively. For tool usability, the evaluators consider our tool as easy-to-use and useful when creating a pull request title of good quality.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126828103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gunnar Kudrjavets, Jeff Thomas, Aditya Kumar, Nachiappan Nagappan, Ayushi Rastogi
{"title":"There Ain’t No Such Thing as a Free Custom Memory Allocator","authors":"Gunnar Kudrjavets, Jeff Thomas, Aditya Kumar, Nachiappan Nagappan, Ayushi Rastogi","doi":"10.1109/ICSME55016.2022.00079","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00079","url":null,"abstract":"Using custom memory allocators is an efficient performance optimization technique. However, dependency on a custom allocator can introduce several maintenance-related issues. We present lessons learned from the industry and provide critical guidance for using custom memory allocators and enumerate various challenges associated with integrating them. These recommendations are based on years of experience incorporating custom allocators into different industrial software projects.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125663399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ting Zhang, I. Irsan, Ferdian Thung, Donggyun Han, David Lo, Lingxiao Jiang
{"title":"Automatic Pull Request Title Generation","authors":"Ting Zhang, I. Irsan, Ferdian Thung, Donggyun Han, David Lo, Lingxiao Jiang","doi":"10.1109/ICSME55016.2022.00015","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00015","url":null,"abstract":"Pull Requests (PRs) are a mechanism on modern collaborative coding platforms, such as GitHub. PRs allow developers to tell others that their code changes are available for merging into another branch in a repository. A PR needs to be reviewed and approved by the core team of the repository before the changes are merged into the branch. Usually, reviewers need to identify a PR that is in line with their interests before providing a review. By default, PRs are arranged in a list view that shows the titles of PRs. Therefore, it is desirable to have a precise and concise title, which is beneficial for both reviewers and other developers. However, it is often the case that developers do not provide good titles; we find that many existing PR titles are either inappropriate in length (i.e., too short or too long) or fail to convey useful information, which may result in PR being ignored or rejected. Therefore, there is a need for automatic techniques to help developers draft high-quality titles.In this paper, we introduce the task of automatic generation of PR titles. We formulate the task as a one-sentence summarization task. To facilitate the research on this task, we construct a dataset that consists of 43,816 PRs from 495 GitHub repositories. We evaluated the state-of-the-art summarization approaches for the automatic PR title generation task. We leverage ROUGE metrics to automatically evaluate the summarization approaches and conduct a manual evaluation. The experimental results indicate that BART is the best technique for generating satisfactory PR titles with ROUGE-1, ROUGE-2, and ROUGE-L F1-scores of 47.22, 25.27, and 43.12, respectively. The manual evaluation also shows that the titles generated by BART are preferred.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128853493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dario Amoroso d'Aragona, Fabiano Pecorelli, Simone Romano, G. Scanniello, M. T. Baldassarre, Andrea Janes, Valentina Lenarduzzi
{"title":"CATTO: Just-in-time Test Case Selection and Execution","authors":"Dario Amoroso d'Aragona, Fabiano Pecorelli, Simone Romano, G. Scanniello, M. T. Baldassarre, Andrea Janes, Valentina Lenarduzzi","doi":"10.1109/ICSME55016.2022.00059","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00059","url":null,"abstract":"Regression testing wants to prevent that errors, which have already been corrected once, creep back into a system that has been updated. A naïve approach consists of re-running the entire test suite (TS) against the changed version of the software under test (SUT). However, this might result in a time-and resource-consuming process; e.g., when dealing with large and/or complex SUTs and TSs. To avoid this problem, Test Case Selection (TCS) approaches can be used. This kind of approaches build a temporary TS comprising only those test cases (TCs) that are relevant to the changes made to the SUT, so avoiding executing unnecessary TCs. In this paper, we introduce CATTO (Commit Adaptive Tool for Test suite Optimization), a tool implementing a TCS strategy for SUTs written in Java as well as a wrapper to allow developers to use CATTO within IntelliJ IDEA and to execute CATTO just-in-time before committing changes to the repository. We conducted a preliminary evaluation of CATTO on seven open-source Java projects to evaluate the reduction of the test-suite size, the loss of fault-revealing TCs, and the loss of fault-detection capability. The results suggest that CATTO can be of help to developers when performing TCS. The video demo and the documentation of the tool is available at: https://catto-tool.github.io/","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126151606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dylan T. Lee, Austin Z. Henley, B. Hinshaw, Rahul Pandita
{"title":"OpenCBS: An Open-Source COBOL Defects Benchmark Suite","authors":"Dylan T. Lee, Austin Z. Henley, B. Hinshaw, Rahul Pandita","doi":"10.1109/ICSME55016.2022.00030","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00030","url":null,"abstract":"As the current COBOL workforce retires, entry-level developers are left to keep complex legacy systems maintained and operational. This creates a massive gap in knowledge and ability as companies are having their veteran developers replaced with a new, inexperienced workforce. Additionally, the lack of COBOL and mainframe technology in the current academic curriculum further increases the learning curve for this new generation of developers. These issues are becoming even more pressing due to the business-critical nature of these systems, which makes migrating or replacing the mainframe and COBOL unlikely anytime soon. As a result, there is now a huge need for tools and resources to increase new developers’ code comprehension and ability to perform routine tasks such as debugging and defect location. Extensive work has been done in the software engineering field on the creation of such resources. However, the proprietary nature of COBOL and mainframe systems has restricted the amount of work and the number of open-source tools available for this domain. To address this issue, our work leverages the publicly available technical forum data to build an open-source collection of COBOL programs embodying issues/defects faced by COBOL developers. These programs were reconstructed and organized in a benchmark suite to facilitate the testing of developer tools. Our goal is to provide an open-source COBOL benchmark and testing suite that encourage community contribution and serve as a resource for researchers and tool-smiths in this domain.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133717160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gunnar Kudrjavets, Jeff Thomas, Nachiappan Nagappan, Ayushi Rastogi
{"title":"Is Kernel Code Different From Non-Kernel Code? A Case Study of BSD Family Operating Systems","authors":"Gunnar Kudrjavets, Jeff Thomas, Nachiappan Nagappan, Ayushi Rastogi","doi":"10.1109/ICSME55016.2022.00027","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00027","url":null,"abstract":"Studies on software evolution explore code churn and code velocity at the abstraction level of a company or an entire project. We argue that this approach misses the differences among abstractions layers and subsystems of large projects. We conduct a case study on four BSD family operating systems: DragonFlyBSD, FreeBSD, NetBSD, and OpenBSD, to investigate the evolution of code churn and code velocity across kernel and non-kernel code. We mine commits for characteristics such as annual growth rate, commit types, change type ratio, and size taxonomy, indicating code churn. Likewise, we investigate code velocity in terms of code review periods, i.e., time-to-first-response, time-to-accept, and time-to-merge.Our study provides evidence that software evolves differently at abstraction layers: kernel and non-kernel. The study finds similarities in the code base growth rate and distribution of commit types (neutral, additive, and subtractive) across BSD subsystems, however, (a) most commits contain either kernel or non-kernel code, (b) kernel commits are larger than non-kernel commits, and (c) code reviews for kernel code take longer than non-kernel code.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129928097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}