{"title":"WebEV: A Dataset on the Behavior of Testers for Web Application End to End Testing","authors":"M. Fuad, K. Sakib","doi":"10.1109/ICPC58990.2023.00022","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00022","url":null,"abstract":"Automated End-to-End (E2E) web testing is a key component in modern rapid development to validate system functionality. However, there are no resources supporting practitioners on how diverse scenarios are tested manually. This paper presents WebEV, a dataset containing E2E test cases from open-source popular projects. Projects are selected based on - i) Cypress-based automation, ii) popularity on GitHub and iii) executability of test cases. The dataset contains information regarding each test command along with the incurred state change representation. Snapshots of the application are used to retrieve - i) the current URL of the application, ii) the screenshot and HTML text of the entire page, and iii) the screenshot and HTML text of an operated UI element. This process is done both before and after each command execution to capture the perception of testers on each state transition, i.e., extract their thought process during testing. This dataset can assist the research community to model user web interaction, predicting the tester’s perception, and improving the state of automated testing approaches. Moreover, WebEV can be used to mine how automated approaches differ from real-life E2E test scenarios.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123871717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Naturalness in Source Code Summarization. How Significant is it?","authors":"C. Ferretti, Martina Saletta","doi":"10.1109/ICPC58990.2023.00027","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00027","url":null,"abstract":"Research in source code summarization, that is the description of the functionality of a program with short sentences expressed in natural language, is a topic of great interest in the software engineering community, since it can help in automatically generating software documentation, and in general can ease the effort of the developers in understanding the code they are working on. In this work, which is conceived as a negative results paper, we study the existing neural models designed for this purpose, pointing out their high sensitivity to the natural elements present in the source code (i.e. comments and identifiers) and the related drop in performance when such elements are ablated or masked. We then propose a novel source code summarization approach based on the aid of an intermediate pseudo-language, through which we are able to fine-tune the BRIO model for natural language on source code summarization, and to achieve results comparable to that obtained by the state-of-the-art source code competitors (e.g. PLBART and CodeBERT). We finally discuss about the limitations of these NLP-based approaches when transferred in the domain of source code processing, and we provide some insights for further research directions.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122694709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a Classification of Log Parsing Errors","authors":"Issam Sedki, A. Hamou-Lhadj, O. Mohamed, Naser Ezzati-Jivan","doi":"10.1109/ICPC58990.2023.00023","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00023","url":null,"abstract":"Log parsing is used to extract structures from unstructured log data. It is a key enabler for many software engineering tasks including debugging, fault diagnosis, and anomaly detection. In recent years, we have seen an increase in the number of log parsing techniques and tools. The accuracy of these tools varies significantly. To improve log parsing tools, we need to understand the type of parsing errors they make, which is the purpose of this early research track paper. We achieve this by examining errors of four leading log parsing tools when applied to the parsing of four log datasets generated from various systems. Based on this analysis, we suggest a preliminary classification of log parsing errors, which contains nine categories of errors. We believe that this classification is a good starting point for improving the accuracy of log parsing tools, and also defining better logging practices.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"529 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128254475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating a Language Workbench: from Working Memory Capacity to Comprehension to Acceptance","authors":"Giovanna Broccia, Alessio Ferrari, M. T. Beek, W. Cazzola, L. Favalli, Francesco Bertolotti","doi":"10.1109/ICPC58990.2023.00017","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00017","url":null,"abstract":"Language workbenches are tools that enable the definition, reuse and composition of programming languages and their ecosystem. This breed of frameworks aims to make the development of new languages easier and more affordable. Consequently, the comprehensibility of the language used in a language workbench (i.e., the meta-language) should be an important aspect to consider and evaluate. To the best of our knowledge, although the quantitative aspects of language workbenches are often discussed in the literature, the evaluation of comprehensibility is typically neglected.Neverlang is a language workbench that enables the definition of languages with a modular approach. This paper presents a preliminary study that intends to assess the comprehensibility of Neverlang programs, evaluated in terms of users’ effectiveness and efficiency in a code comprehension task. The study also investigates the relationship between Neverlang comprehensibility and the users’ working memory capacity. Furthermore, we intend to capture the relationship between Neverlang comprehensibility and users’ acceptance, in terms of perceived ease of use, perceived usefulness, and intention to use. Our preliminary results on 10 subjects suggest that the users’ working memory capacity may be related to the ability to comprehend Neverlang programs. On the other hand, effectiveness and efficiency do not appear to be associated with an increase in users’ acceptance variables.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114226709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting Lightweight Compiler Provenance Recovery on ARM Binaries","authors":"Jason Kim, Daniel Genkin, Kevin Leach","doi":"10.1109/ICPC58990.2023.00044","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00044","url":null,"abstract":"A binary’s behavior is greatly influenced by how the compiler builds its source code. Although most compiler configuration details are abstracted away during compilation, recovering them is useful for reverse engineering and program comprehension tasks on unknown binaries, such as code similarity detection. We observe that previous work has thoroughly explored this on x86-64 binaries. However, there has been limited investigation of ARM binaries, which are increasingly prevalent.In this paper, we extend previous work with a shallow-learning model that efficiently and accurately recovers compiler configuration properties for ARM binaries. We apply opcode and register-derived features, that have previously been effective on x86-64 binaries, to ARM binaries. Furthermore, we compare this work with Pizzolotto et al., a recent architecture-agnostic model that uses deep learning, whose dataset and code are available.We observe that the lightweight features are reproducible on ARM binaries. We achieve over 99% accuracy, on par with state-of-the-art deep learning approaches, while achieving a 583-times speedup during training and 3,826-times speedup during inference. Finally, we also discuss findings of overfitting that was previously undetected in prior work.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122571997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Properly Offer Options to Improve the Practicality of Software Document Completion Tools","authors":"Zhipeng Cai, Songqiang Chen, Xiaoyuan Xie","doi":"10.1109/ICPC58990.2023.00038","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00038","url":null,"abstract":"With the great progress in deep learning and natural language processing, many completion tools are proposed to help practitioners efficiently fill in various fields in software document. However, most of these tools offer their users only one option and this option generally requires much revision to meet a satisfactory quality, which hurts much practicality of the completion tools. By finding that the beam search model of such tools often generates a much better output at relatively high confidence and considering the interactive use of such tools, we advise such tools to offer multiple high-confidence model outputs for more chances of offering a good option. And we further suggest these tools offer dissimilar outputs to expand the chance of including a better output in a few options. To evaluate our whole idea, we design a clustering-based initial method to help these tools properly offer some dissimilar model outputs as options. We adopt this method to improve nine completion tools for three software document fields. Results show it can help all the nine tools offer an option that needs less revision from users and thus effectively improve the practicality of tools.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128667210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting Deep Learning for Variable Type Recovery","authors":"Kevin Cao, Kevin Leach","doi":"10.1109/ICPC58990.2023.00042","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00042","url":null,"abstract":"Compiled binary executables are often the only available artifact in reverse engineering, malware analysis, and software systems maintenance. Unfortunately, the lack of semantic information like variable types makes comprehending binaries difficult. In efforts to improve the comprehensibility of binaries, researchers have recently used machine learning techniques to predict semantic information contained in the original source code. Chen et al. implemented DIRTY, a Transformer-based Encoder-Decoder architecture capable of augmenting decompiled code with variable names and types by leveraging decompiler output tokens and variable size information. Chen et al. were able to demonstrate a substantial increase in name and type extraction accuracy on Hex-Rays decompiler outputs compared to existing static analysis and AI-based techniques. We extend the original DIRTY results by re-training the DIRTY model on a dataset produced by the open-source Ghidra decompiler. Although Chen et al. concluded that Ghidra was not a suitable decompiler candidate due to its difficulty in parsing and incorporating DWARF symbols during analysis, we demonstrate that straightforward parsing of variable data generated by Ghidra results in similar retyping performance. We hope this work inspires further interest and adoption of the Ghidra decompiler for use in research projects.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128540639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Label Smoothing Improves Neural Source Code Summarization","authors":"S. Haque, Aakash Bansal, Collin McMillan","doi":"10.1109/ICPC58990.2023.00025","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00025","url":null,"abstract":"Label smoothing is a regularization technique for neural networks. Normally neural models are trained to an output distribution that is a vector with a single 1 for the correct prediction, and 0 for all other elements. Label smoothing converts the correct prediction location to something slightly less than 1, then distributes the remainder to the other elements such that they are slightly greater than 0. A conceptual explanation behind label smoothing is that it helps prevent a neural model from becoming \"overconfident\" by forcing it to consider alternatives, even if only slightly. Label smoothing has been shown to help several areas of language generation, yet typically requires considerable tuning and testing to achieve the optimal results. This tuning and testing has not been reported for neural source code summarization – a growing research area in software engineering that seeks to generate natural language descriptions of source code behavior. In this paper, we demonstrate the effect of label smoothing on several baselines in neural code summarization, and conduct an experiment to find good parameters for label smoothing and make recommendations for its use.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129451633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ChameleonIDE: Untangling Type Errors Through Interactive Visualization and Exploration","authors":"Shuai Fu, Tim Dwyer, Peter J. Stuckey, Jackson Wain, Jesse Linossier","doi":"10.1109/ICPC58990.2023.00029","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00029","url":null,"abstract":"Dynamically typed programming languages are popular in education and the software industry. While presenting a low barrier to entry, they suffer from runtime type errors and longer-term problems in code quality and maintainability. Statically typed languages, while showing strength in these aspects, lack in learnability and ease of use. In particular, fixing type errors poses challenges to both novice users and experts. Further, compiler type error messages are presented in a static way that is biased toward the first occurrence of the error in the program code. To help users resolve such type errors we introduce ChameleonIDE, a type debugging tool that presents type errors to the user in an unbiased way, allowing them to explore the full context of where the errors could occur. Programmers can interactively verify the steps of reasoning against their intention. Through three studies involving actual programmers, we showed that ChameleonIDE is more effective in fixing type errors than traditional text-based error messages. This difference is more significant in harder tasks. Further, programmers actively using ChameleonIDE’s interactive features are shown to be more efficient in fixing type errors than passively reading the type error output.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"158 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124426251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implant Global and Local Hierarchy Information to Sequence based Code Representation Models","authors":"Kechi Zhang, Zhuo Li, Zhi Jin, Ge Li","doi":"10.1109/ICPC58990.2023.00030","DOIUrl":"https://doi.org/10.1109/ICPC58990.2023.00030","url":null,"abstract":"Source code representation with deep learning techniques is an important research field. There have been many studies that learn sequential or structural information for code representation. But sequence-based models and non-sequence-models both have their limitations. Researchers attempt to incorporate structural information to sequence-based models, but they only mine part of token-level hierarchical structure information. In this paper, we analyze how the complete hierarchical structure influences the tokens in code sequences and abstract this influence as a property of code tokens called hierarchical embedding. The hierarchical embedding is further divided into statement-level global hierarchy and token-level local hierarchy. Furthermore, we propose the Hierarchy Transformer (HiT), a simple but effective sequence model to incorporate the complete hierarchical embeddings of source code into a Transformer model. We demonstrate the effectiveness of hierarchical embedding on learning code structure with an experiment on variable scope detection task. Further evaluation shows that HiT outperforms SOTA baseline models and show stable training efficiency on three source code-related tasks involving classification and generation tasks across 8 different datasets.","PeriodicalId":376593,"journal":{"name":"2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)","volume":"240 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126814414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}