2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER): Latest Publications, Page 6

Where Should I Look at? Recommending Lines that Reviewers Should Pay Attention To

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.1109/saner53432.2022.00121

Yang Hong, C. Tantithamthavorn, Patanamon Thongtanunam

Abstract: Code review is an effective quality assurance practice, yet it can be time-consuming because reviewers have to carefully review all newly added lines in a patch. Our analysis shows that, at the median, patch authors waited 15–64 hours to receive initial feedback from reviewers, which accounts for 16%–26% of the whole review time of a patch. Importantly, we also found that large patches tend to receive initial feedback from reviewers more slowly than smaller patches. Hence, it would be beneficial to reduce reviewers' effort with an approach that pinpoints the lines they should pay attention to. In this paper, we propose RevSpot, a machine learning-based approach to predict problematic lines (i.e., lines that will receive a comment and lines that will be revised). Through a case study of three open-source projects (i.e., OpenStack Nova, OpenStack Ironic, and Qt Base), RevSpot can accurately predict lines that will receive comments and lines that will be revised (with a Top-10 Accuracy of 81% and 93%, which is 56% and 15% better than the baseline approach), and these correctly predicted problematic lines are related to logic defects, which could impact the functionality of the system. Based on these findings, RevSpot could help reviewers reduce their reviewing effort by reviewing a smaller set of lines, thereby increasing code review speed and reviewers' productivity.
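The Top-10 Accuracy figures in this abstract can be made concrete with a small sketch. It assumes one common definition, which the abstract itself does not spell out: a patch counts as a hit when at least one of its k highest-ranked lines is truly problematic. The function name `top_k_accuracy`, the patch identifiers, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def top_k_accuracy(
    ranked_lines: Dict[str, List[int]],
    problematic_lines: Dict[str, Set[int]],
    k: int = 10,
) -> float:
    """Share of patches with at least one truly problematic line in the top k.

    Assumed definition for illustration; the abstract does not define the metric.
    """
    if not ranked_lines:
        return 0.0
    hits = 0
    for patch_id, ranking in ranked_lines.items():
        truth = problematic_lines.get(patch_id, set())
        # A patch is a hit if any of its k highest-ranked lines is in the truth set.
        if any(line in truth for line in ranking[:k]):
            hits += 1
    return hits / len(ranked_lines)


# Hypothetical toy data: patch id -> line numbers ordered by predicted risk.
ranked = {"patch-a": [12, 3, 40], "patch-b": [7, 8, 9], "patch-c": [1, 2]}
# Hypothetical ground truth: lines that were commented on or revised.
truth = {"patch-a": {3}, "patch-b": {100}, "patch-c": {2, 5}}

score = top_k_accuracy(ranked, truth, k=2)
print(f"Top-2 accuracy: {score:.2f}")
```

With k=2, two of the three toy patches have a problematic line among their top-ranked lines, so the score is two thirds.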

Citations: 6

Type Profiling to the Rescue: Test Amplification in Python and Smalltalk

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.1109/saner53432.2022.00136

S. Demeyer, Mehrdad Abdi, Ebert Schoofs

Citations: 0

Identifying Software Engineering Challenges in Software SMEs: A Case Study in Thailand

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.1109/saner53432.2022.00036

Chaiyong Ragkhitwetsagul, J. Krinke, Morakot Choetkiertikul, T. Sunetnanta, Federica Sarro

Abstract: Small and medium-sized software enterprises (SSMEs) are a vital part of emerging markets. Due to their size, they are not capable of adopting advanced software engineering techniques or automated software engineering tools in the same way that large and ultra-large companies are. We study the software engineering challenges in SSMEs in Thailand, an emerging market in software development, using semi-structured interviews with four SSMEs. After performing a thematic analysis of the interview transcripts, we found a number of common challenges, such as lack of testing, code-related issues, and inaccurate effort estimation. We observed that, in order to introduce advanced automated software engineering tools and techniques, SSMEs need to adopt contemporary best practices in software engineering such as automated testing, continuous integration, and automated code review. Moreover, we suggest that software engineering research engage with SSMEs to enable them to improve their knowledge and adopt more advanced software engineering practices.

Citations: 1

Learning Program Semantics with Code Representations: An Empirical Study

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.48550/arXiv.2203.11790

J. Siow, Shangqing Liu, Xiaofei Xie, Guozhu Meng, Yang Liu

Abstract: Learning program semantics is fundamental to various code intelligence tasks, e.g., vulnerability detection and clone detection. A considerable number of existing works propose diverse approaches to learning program semantics for different tasks, and these works have achieved state-of-the-art performance. However, a comprehensive and systematic study evaluating different program representation techniques across diverse tasks is still missing. From this starting point, in this paper we conduct an empirical study to evaluate different program representation techniques. Specifically, we categorize current mainstream code representation techniques into four categories (i.e., feature-based, sequence-based, tree-based, and graph-based program representation techniques) and evaluate their performance on three diverse and popular code intelligence tasks (i.e., code classification, vulnerability detection, and clone detection) on publicly released benchmarks. We further design three research questions (RQs) and conduct a comprehensive analysis to investigate the performance. From the extensive experimental results, we conclude that (1) the graph-based representation is superior to the other selected techniques across these tasks; (2) compared with the node type information used in tree-based and graph-based representations, the node textual information is more critical to learning program semantics; and (3) different tasks require task-specific semantics to achieve their highest performance; however, combining various program semantics from different dimensions, such as control dependency and data dependency, can still produce promising results.
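The distinction this abstract draws between node type information and node textual information can be illustrated on a tiny syntax tree. The sketch below uses Python's standard `ast` module purely as a stand-in; it is not the authors' pipeline, and the snippet being parsed is a hypothetical example.

```python
import ast

# Hypothetical snippet to analyze.
source = "def add(a, b):\n    return a + b\n"
tree = ast.parse(source)

# Node *type* information: the syntactic category of every node in the tree.
node_types = [type(node).__name__ for node in ast.walk(tree)]

# Node *textual* information: the concrete identifiers attached to nodes.
node_texts = []
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        node_texts.append(node.name)  # function name
    elif isinstance(node, ast.arg):
        node_texts.append(node.arg)   # parameter name
    elif isinstance(node, ast.Name):
        node_texts.append(node.id)    # variable reference

print(sorted(set(node_types)))
print(sorted(set(node_texts)))
```

Two functions with the same shape but different identifiers (say `add` versus `concat`) would yield identical `node_types` and differ only in `node_texts`, which gives one intuition for why textual information can carry semantics that types alone miss.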

Citations: 22

HERMES: Using Commit-Issue Linking to Detect Vulnerability-Fixing Commits

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.1109/saner53432.2022.00018

Giang Nguyen-Truong, Hong Jin Kang, D. Lo, Abhishek Sharma, A. Santosa, Asankhaya Sharma, Ming Yi Ang

Abstract: Software projects today rely on many third-party libraries and are therefore exposed to vulnerabilities in these libraries. When a library vulnerability is fixed, users are notified and advised to upgrade to a new version of the library. However, not all vulnerabilities are publicly disclosed, and users may not be aware of vulnerabilities that may affect their applications. Due to these challenges, there is a need for techniques that can identify and alert users to silent fixes in libraries, i.e., commits that fix bugs with security implications that are not officially disclosed. We propose a machine learning approach to automatically identify vulnerability-fixing commits. Existing techniques consider only data within a commit, such as its commit message, which does not always contain sufficiently discriminative information. To address this limitation, our approach incorporates the rich source of information from issue trackers. When a commit does not link to an issue, we use a commit-issue link recovery technique to infer the potential missing link. Our experiments are promising: incorporating information from issue trackers boosts the performance of a vulnerability-fixing commit classifier, improving over the strongest baseline by 11.1% on the entire dataset, which includes commits that do not link to an issue. On a subset of the data in which all commits explicitly link to an issue, our approach improves over the baseline by 12.5%.
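The idea of inferring a missing commit-issue link can be sketched with a deliberately simple stand-in. The abstract does not say which link recovery technique HERMES uses, so the Jaccard token overlap below, the `threshold` value, and the issue identifiers are all assumptions made for illustration only.

```python
import re
from typing import Dict, Optional, Set


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recover_link(
    commit_message: str, issues: Dict[str, str], threshold: float = 0.2
) -> Optional[str]:
    """Return the id of the most similar issue, or None if none is similar enough.

    Stand-in heuristic, not the technique used in the paper.
    """
    commit_tokens = tokens(commit_message)
    best_id, best_score = None, 0.0
    for issue_id, issue_text in issues.items():
        score = jaccard(commit_tokens, tokens(issue_text))
        if score > best_score:
            best_id, best_score = issue_id, score
    return best_id if best_score >= threshold else None


# Hypothetical issue tracker contents.
issues = {
    "ISSUE-1": "Crash: buffer overflow when parsing long header",
    "ISSUE-2": "Update documentation for build steps",
}
linked = recover_link("Fix buffer overflow in header parser", issues)
print(linked)
```

Once a link is recovered, the text of the linked issue could be passed to a classifier as additional features alongside the commit message, which is the role the abstract assigns to issue tracker information.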

Citations: 9

Towards a Fine-grained Analysis of Cognitive Load During Program Comprehension

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.1109/saner53432.2022.00092

Thierry Sorg, Amine Abbad Andaloussi, Barbara Weber

Citations: 2

Blockchain-Oriented Software Variant Forks: A Preliminary Study

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.48550/arXiv.2204.11083

Henrique Rocha, John Businge

Citations: 3

PANDORA: Continuous Mining Software Repository and Dataset Generation

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.1109/saner53432.2022.00041

H. Nguyen, Francesco Lomio, Fabiano Pecorelli, Valentina Lenarduzzi

Abstract: Mining software repository activities involve analyzing a huge amount of data gathered from different sources. Different tools have been developed for collecting and aggregating data from repositories, but they do not easily allow researchers to develop new extractors or to integrate data collected from other platforms, in particular from platforms that delete data periodically. Moreover, mining software repository studies are commonly performed on old versions of software projects, and their results are rarely updated periodically. Because such studies are not continuously updated, practitioners often do not trust results from empirical studies. To overcome these issues, in this paper we present Pandora, a tool that automatically and continuously mines data from different existing tools and online platforms and enables researchers to run mining software repository studies and continuously update their results. To evaluate the applicability of our tool, we have so far analyzed 365 projects (developed in different languages), continuously collecting data from December 2020 to May 2021 and running an example study investigating the build stability of SonarQube rules. Link to dashboard: http://sqa.rd.tuni.fi/superset/dashboard/1 Link to source code: https://github.com/clowee/PANDORA Link to 5-minute video: https://youtu.be/CuVO9YGJ59I

Citations: 6

Towards Attribute Grammar Mining by Symbolic Execution

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.1109/saner53432.2022.00100

M. Moser, J. Pichler, A. Pointner

Abstract: The specification of program inputs is a requirement for many software engineering tasks, but it often does not exist or is out of date. To tackle this problem, software engineers may apply program analysis techniques to extract parts of a specification from the source code that processes the program input. Today there are analysis techniques for the extraction of constraints (mathematical formulas) for individual program inputs (e.g., function parameters) as well as emerging techniques for inferring context-free grammars that specify the syntax of program input strings. However, each of these techniques focuses on a single aspect of the specification (e.g., constraints or grammars) and neglects the other. We propose to integrate such analysis techniques by extending existing approaches for mining input grammars with the extraction of constraints. Constraints are integrated with a grammar in the form of attributes and context constraints on grammar symbols, resulting in an attribute grammar as the specification format. To achieve this goal, we choose dynamic symbolic execution (DSE) as the analysis method; it is already an established technique for the extraction of constraints and is beneficial for grammar mining as well (e.g., through automatic input generation). Thus, DSE not only covers both aspects but, as a single analysis method, should also facilitate their integration. In this paper, we describe the basic idea of the proposed integration and report the first results on DSE-based grammar extraction.
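What it means to attach attributes and context constraints to grammar symbols can be shown on a small, invented input format. The sketch below assumes a hypothetical length-prefixed record, `record ::= length ':' payload`, with the context constraint `len(payload) == length`; neither the format nor the function name comes from the paper, and the parser is written by hand rather than mined by symbolic execution.

```python
import re
from typing import Optional, Tuple

# Context-free part (hypothetical format): record ::= length ':' payload
RECORD = re.compile(r"(?P<length>[0-9]+):(?P<payload>[a-z]*)")


def parse_record(text: str) -> Optional[Tuple[int, str]]:
    """Accept a record only if it satisfies both syntax and context constraint."""
    match = RECORD.fullmatch(text)
    if match is None:
        return None  # rejected by the context-free syntax
    # Attribute: the numeric value carried by the 'length' symbol.
    length = int(match.group("length"))
    payload = match.group("payload")
    # Context constraint relating attributes of two grammar symbols.
    if len(payload) != length:
        return None
    return length, payload


print(parse_record("3:abc"))  # syntactically valid, constraint holds
print(parse_record("4:abc"))  # syntactically valid, constraint violated
print(parse_record("abc"))    # syntactically invalid
```

The input `4:abc` is the interesting case: a context-free grammar alone would accept it, and only the attribute constraint rules it out, which is the gap an attribute grammar specification is meant to close.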

Citations: 2

A Preliminary Analysis of GPL-Related License Violations in Docker Images

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Pub Date : 2022-03-01 DOI: 10.1109/saner53432.2022.00059

Yunosuke Higashi, Katsunori Fukui, Yutaro Kashiwa, M. Ohira

Abstract: Background: In recent years, the use of container virtualization technology has been spreading rapidly to speed up software release and operation. In general, a containerized application image (e.g., a Docker image) consists of multiple reused OSS packages. To reuse OSS, it is necessary to comply with the OSS licenses. Although there have been many studies on OSS license detection and license compatibility among OSS packages, to the best of our knowledge no study has tackled incompatible license problems among OSS packages in a container image. Aims: In this paper, we conduct a preliminary analysis to clarify the extent to which Docker images contain OSS license incompatibility problems. Method: We analyze 776 Docker images published on GitHub to determine whether license incompatibilities among OSS packages exist. Results: The analysis showed that a total of 2,167 software packages were used in the 776 Docker images. The majority of the software packages (71.3%) are compatible with the GPL family, but a non-negligible number of software packages (28.7%) are not. The analysis also showed that 457 (58.9%) of the 776 images had GPL-related incompatibility problems. Conclusions: Unlike traditional software development, in which software packages to be reused are explicitly combined, Dockerfile creators who build and distribute Docker images might be less aware of the risks related to compatibility between OSS licenses. Our results are useful as information to improve the awareness of Dockerfile creators, and they also indicate the necessity of future studies to detect and prevent the inclusion of license-incompatible OSS packages in container images.
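A per-image check in the spirit of this analysis, together with the arithmetic behind the reported 58.9%, might look like the sketch below. The rule used here (flag an image that contains both a GPL-family package and a package under a license assumed to be GPL-incompatible) is a simplification chosen for illustration; the abstract does not give the authors' exact criteria, and the license sets and image contents are hypothetical placeholders, not legal guidance.

```python
from typing import Dict

# Hypothetical, deliberately tiny license sets (SPDX-style identifiers).
GPL_FAMILY = {"GPL-2.0-only", "GPL-3.0-only"}
ASSUMED_GPL_INCOMPATIBLE = {"CDDL-1.0"}  # assumption for illustration only


def has_gpl_conflict(package_licenses: Dict[str, str]) -> bool:
    """True if an image mixes GPL-family and assumed-incompatible licenses."""
    licenses = set(package_licenses.values())
    return bool(licenses & GPL_FAMILY) and bool(licenses & ASSUMED_GPL_INCOMPATIBLE)


# Hypothetical images: image name -> {package name: license}.
images = {
    "image-a": {"pkg1": "GPL-2.0-only", "pkg2": "CDDL-1.0"},
    "image-b": {"pkg1": "MIT", "pkg2": "GPL-3.0-only"},
    "image-c": {"pkg1": "CDDL-1.0"},
}
flagged = [name for name, pkgs in images.items() if has_gpl_conflict(pkgs)]
print(flagged)

# Arithmetic check of the share reported in the abstract: 457 of 776 images.
share_reported = round(457 / 776 * 100, 1)
print(share_reported)
```

The last two lines confirm that 457 out of 776 rounds to the 58.9% stated in the abstract.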

Citations: 0