2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER): Latest Publications

Web Element Identification by Combining NLP and Heuristic Search for Web Testing
Hiroyuki Kirinuki, S. Matsumoto, Yoshiki Higo, S. Kusumoto
Abstract: End-to-end test automation is critical in modern web application development. However, the test automation techniques used in industry make test scripts hard to implement and maintain. In particular, it is difficult to determine and maintain the locators that test scripts need to identify web elements on web pages, because locators depend on the metadata of web elements and the structure of each web page. One effective way to solve this locator problem is to allow test cases written in natural language to be executed without test scripts. In this study, we propose a technique to identify the web elements that should be operated on a web page by interpreting natural-language-like test cases. The test cases are written in a domain-specific language that is independent of the metadata of web elements and the structural information of web pages. We leverage natural language processing techniques to understand the semantics of web elements. We also create heuristic search algorithms to explore web pages and find promising test procedures. To evaluate the proposed technique, we applied it to test cases for two open-source web applications. The experimental results show that our technique successfully identified about 94% of the web elements to be operated in the test cases. Our approach also succeeded in identifying all the web elements that were operated in 68% of the test cases.
DOI: https://doi.org/10.1109/saner53432.2022.00123
Published: 2022-03-01
Citations: 2
BinMLM: Binary Authorship Verification with Flow-aware Mixture-of-Shared Language Model
Qi Song, Yongzheng Zhang, Linshu Ouyang, Yige Chen
Abstract: Binary authorship analysis is a significant problem in many software engineering applications. In this paper, we formulate a binary authorship verification task to accurately reflect the real-world working process of software forensic experts. It aims to determine whether an anonymous binary was developed by a specific programmer, given a small set of support samples; the actual developer may not belong to the known candidate set but may come from the wild. We propose an effective binary authorship verification framework, BinMLM. BinMLM trains an RNN language model on consecutive opcode traces extracted from the control-flow graph (CFG) to characterize the candidate developers' programming styles. We build a mixture-of-shared architecture with multiple shared encoders and author-specific gate layers, which can learn the developers' combination preferences of universal programming patterns and alleviate the problem of low training resources. Through an optimization pipeline of external pre-training, joint training, and fine-tuning, our framework can eliminate additional noise and accurately distill developers' unique styles. Extensive experiments show that BinMLM achieves promising results on Google Code Jam (GCJ) and Codeforces datasets with different numbers of programmers and supporting samples. It significantly outperforms the baselines built on the state-of-the-art feature set (4.73% to 19.46% improvement) and remains robust in multi-author collaboration scenarios. Furthermore, BinMLM can perform organization-level verification on a real-world APT malware dataset, which can provide valuable auxiliary information for exploring the group behind an APT attack.
DOI: https://doi.org/10.48550/arXiv.2203.04472
Published: 2022-03-01
Citations: 2
VCMatch: A Ranking-based Approach for Automatic Security Patches Localization for OSS Vulnerabilities
Shichao Wang, Yun Zhang, Lingfeng Bao, Xin Xia, Ming-hui Wu
Abstract: Nowadays, vulnerabilities in open source software (OSS) are constantly emerging, posing a great threat to application security. Security patches are crucial in reducing the risk of OSS vulnerabilities. However, many of the vulnerabilities disclosed by CVE/NVD are not accompanied by security patches. Previous research has shown that the auxiliary information in CVE/NVD can aid in matching a vulnerability to the appropriate commits. State-of-the-art research proposed a rank-based approach built on multiple dimensions of features extracted from the auxiliary information in CVE/NVD. However, this approach ignores the semantic features in the vulnerability descriptions and commit messages, leaving room for improvement. In this paper, we propose a novel ranking-based approach, VCMATCH (Vulnerability-Commit Match). In addition to extracting the shallow statistical features between the vulnerability and the patch commit, VCMATCH extracts the deep semantic features of the vulnerability descriptions and commit messages. VCMATCH also applies three classification models (XGBoost, LightGBM, and CNN) and uses a voting-based rank fusion method to combine the results of the three models into a better result. We evaluate VCMATCH with 1,669 CVEs from 10 OSS projects. The experimental results show that VCMATCH can effectively identify security patches for OSS vulnerabilities in terms of Recall@K and Manual Effort@K, and that it outperforms the state-of-the-art model by a statistically significant margin.
DOI: https://doi.org/10.1109/saner53432.2022.00076
Published: 2022-03-01
Citations: 2
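The idea of fusing the rankings of several models by voting, mentioned in the VCMatch abstract, can be sketched roughly as follows. The abstract does not state the exact fusion rule, so this sketch swaps in a plain Borda count as an assumption; the function name `borda_fuse` and the commit identifiers are hypothetical.

```python
from collections import defaultdict


def borda_fuse(rankings):
    """Fuse several best-first rankings of the same candidates with a Borda count.

    NOTE: Borda count is an illustrative assumption, not the fusion rule
    used by VCMatch (the abstract does not specify it).
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            # The top-ranked candidate earns n points, the next n - 1, and so on.
            scores[candidate] += n - position
    # Highest total score first; ties are broken by name for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical rankings of three candidate commits produced by three models.
model_rankings = [
    ["c1", "c2", "c3"],
    ["c2", "c1", "c3"],
    ["c1", "c3", "c2"],
]
fused = borda_fuse(model_rankings)
print(fused)
```

A candidate that most models place near the top ends up first in the fused list, even when one model disagrees.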
Automatically Generating Code Comment Using Heterogeneous Graph Neural Networks
Dun Jin, Peiyu Liu, Zhenfang Zhu
Abstract: Code summarization aims to generate readable summaries that describe the functionality of pieces of source code. Its main purpose is to help software developers understand the code and save time. However, since programming languages are highly structured, it is challenging to generate high-quality code summaries. For this reason, this paper proposes a new approach named CCHG to automatically generate code comments. Unlike recent models that use additional information such as Abstract Syntax Trees as input, our proposed method uses only the original code as input. We believe that programming languages resemble natural languages: each line of code is equivalent to a sentence and carries an independent meaning. Therefore, we split the entire code snippet into several sentence-level code segments. Together with token-level code, this gives two types of code that need to be processed, so we propose heterogeneous graph networks to process the sentence-level and token-level code. Even though we do not introduce additional structural knowledge, the experimental results show that our model achieves considerable performance, which indicates that it can learn both structural and sequence information from code snippets.
DOI: https://doi.org/10.1109/saner53432.2022.00125
Published: 2022-03-01
Citations: 0
Common Programming Mistakes Leading to Information Disclosure: A Preliminary Study
Gowri Pandian Sundarapandi, Raiyan Hossain, Chandana Jasrai, Kazi Zakia Sultana, Zadia Codabux
Abstract: It is vital to engineer robust and secure software. Many security strategies and techniques have been proposed. However, technological growth increases security concerns and demands persistent software security analysis. The objective of our study is to analyze vulnerable code components in real-world software repositories and to mine the frequent programming mistakes by developers that result in information disclosure. Finding common programming mistakes during the implementation phase is a primary step towards building secure software. We investigate the published vulnerabilities in two open-source applications: Apache Tomcat and Android. We focus on the information disclosure vulnerabilities reported as security advisories and analyze the code to extract the causes of each vulnerability. We found that improper or missing bounds checking is the most frequent programming mistake that can potentially cause information leakage. Our findings can help make developers aware of the common programming mistakes that lead to the disclosure of sensitive information, so that they can avoid these mistakes or, if such mistakes are already present in the code, handle them during the implementation phase. Moreover, our results can be incorporated into tools such as static analyzers to help detect information disclosure instances more accurately prior to software delivery.
DOI: https://doi.org/10.1109/saner53432.2022.00091
Published: 2022-03-01
Citations: 0
Can Solana be the Solution to the Blockchain Scalability Problem?
G. A. Pierro, R. Tonelli
Abstract: Solana is a public blockchain platform, launched in April 2018, that aims to increase scalability compared to other blockchains without compromising decentralization and security. It supports smart contracts and the creation of decentralized applications (DApps). The study aims to collect data from the Solana blockchain and verify some of its properties, such as its transaction throughput, i.e. the rate at which valid transactions are committed into a block by the Solana blockchain per second (transactions per second, TPS). The data were collected over a period of two months (14 October - 15 December) and made public in a GitHub repository. The results of our data analysis show that the average transaction throughput is about 2812 TPS and that the fees paid by users to have their transactions confirmed are on average much lower than the fees users pay on other blockchains that support the same functions, such as smart contracts and the creation of DApps. The paper sheds light on the mechanisms of the Solana blockchain, which, according to its founders, promises to solve the scalability problem without sacrificing decentralization and security.
DOI: https://doi.org/10.1109/saner53432.2022.00144
Published: 2022-03-01
Citations: 9
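The throughput measure defined in the Solana abstract (committed transactions per second) can be illustrated with a minimal sketch. The sample values below are made up purely for illustration and are not data from the study; the function name `average_tps` and the `(transactions, seconds)` sample layout are assumptions.

```python
def average_tps(samples):
    """Average throughput over samples of (committed_transactions, interval_seconds)."""
    total_transactions = sum(tx for tx, _ in samples)
    total_seconds = sum(seconds for _, seconds in samples)
    if total_seconds <= 0:
        raise ValueError("samples must cover a positive time span")
    # Weighting by interval length: total transactions over total elapsed time.
    return total_transactions / total_seconds


# Hypothetical samples: three 60-second windows with made-up transaction counts.
samples = [(1200, 60), (1500, 60), (900, 60)]
tps = average_tps(samples)
print(f"{tps:.1f} TPS")
```

Dividing total transactions by total time, rather than averaging per-window rates, keeps the result correct when the sampling windows differ in length.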
Investigating the Effectiveness of Clustering for Story Point Estimation
Vali Tawosi, A. Al-Subaihin, Federica Sarro
Abstract: Automated techniques to estimate Story Points (SP) for user stories in agile software development came to the fore a decade ago. Yet, the accuracy of state-of-the-art estimation techniques has room for improvement. In this paper, we present a new approach for SP estimation, based on analysing textual features of software issues by employing latent Dirichlet allocation (LDA) and clustering. We first use LDA to represent issue reports in a new space of generated topics. We then use hierarchical clustering to agglomerate issues into clusters based on their topic similarities. Next, we build estimation models using the issues in each cluster. Finally, we find the cluster closest to an incoming issue and use the model from that cluster to estimate its SP. Our approach is evaluated on a dataset of 26 open source projects with a total of 31,960 issues and compared against both baselines and state-of-the-art SP estimation techniques. The results show that the estimation performance of our proposed approach is as good as the state-of-the-art. However, none of these approaches is statistically significantly better than more naive estimators in all cases, which does not justify their additional complexity. We therefore encourage future work to develop alternative strategies for story point estimation. The experimental data and scripts we used in this work are publicly available to allow for replication and extension.
DOI: https://doi.org/10.1109/saner53432.2022.00101
Published: 2022-03-01
Citations: 5
ProPy: Prolog-based Fault Localization Tool for Python
Janneke Morin, Krishnendu Ghosh
Abstract: Fault localization involves determining the root location of a program fault. The quicker and more accurately a fault is located, the easier it is to address. Hence, the software development cycle benefits from improvements in the efficiency, effectiveness, and speed of the fault localization process. Fault localization techniques developed through research often involve running a set of test cases on the program in question, then comparing their expected and actual results. By examining which statements of the program were executed in successful versus unsuccessful test cases, these techniques extract insight into the “suspiciousness” of areas of the program. Many of the tools that execute these techniques are written in imperative programming languages that are not powerful enough to support recursively-called functions such as transitive closure at scale. Our research explores the utility of declarative programming languages in the fault localization problem, as they are known to support efficient, built-in functions for such recursive actions. Specifically, we build upon existing work by combining the use of the declarative language Prolog with the analysis of communities detected in Python control flow graphs. The source code is available at https://github.com/jannekemorin/new-leaf.
DOI: https://doi.org/10.1109/saner53432.2022.00137
Published: 2022-03-01
Citations: 0
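The general idea in the ProPy abstract of scoring statements by how often they run in failing versus passing tests can be sketched with the well-known Tarantula formula. This is not ProPy's Prolog-based method; Tarantula is swapped in as a familiar example, and the coverage data and the function name `tarantula` are hypothetical.

```python
def tarantula(coverage, outcomes):
    """Compute Tarantula suspiciousness per statement.

    coverage: test name -> set of executed statement ids (hypothetical data)
    outcomes: test name -> True if the test passed, False if it failed
    """
    total_passed = sum(1 for ok in outcomes.values() if ok)
    total_failed = len(outcomes) - total_passed
    statements = set().union(*coverage.values())
    scores = {}
    for stmt in statements:
        passed = sum(1 for t, cov in coverage.items() if stmt in cov and outcomes[t])
        failed = sum(1 for t, cov in coverage.items() if stmt in cov and not outcomes[t])
        fail_ratio = failed / total_failed if total_failed else 0.0
        pass_ratio = passed / total_passed if total_passed else 0.0
        denominator = fail_ratio + pass_ratio
        # Statements executed mostly by failing tests score close to 1.
        scores[stmt] = fail_ratio / denominator if denominator else 0.0
    return scores


# Hypothetical spectrum: statement 3 is executed only by the failing test.
coverage = {"t1": {1, 2, 3}, "t2": {1, 2}, "t3": {1, 2}}
outcomes = {"t1": False, "t2": True, "t3": True}
scores = tarantula(coverage, outcomes)
print(sorted(scores.items(), key=lambda item: -item[1]))
```

Statements with higher scores would be inspected first when looking for the fault.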
Can We Automatically Fix Bugs by Learning Edit Operations?
Aidan Connor, Aaron Harris, Nathan Cooper, D. Poshyvanyk
Abstract: There has been much work done in the area of automated program repair, specifically through using machine learning methods to correct buggy code. Although those efforts have attained some degree of success, there is still considerable room for growth in the accuracy of the results produced by such tools. In that vein, we implement Hephaestus, a novel method to improve the accuracy of automated bug repair through learning to apply edit operations. Hephaestus leverages neural machine translation and attempts to produce the edit operations needed to transform a given buggy code segment into a fixed version. We examine the effects of using various forms of edit operations in the completion of this task. Our study found that none of the models that learned from edit operations were as effective at repairing bugs as models that learned from fixed code segments directly. This provides evidence that learning edit operations does not offer an advantage over the standard approach of translating directly from buggy code to fixed code. We conduct an analysis of this lowered efficiency and explore why the complexity of the edit-operation-based models may be suboptimal. Interestingly, even though our Hephaestus model exhibited lower translation accuracy than the baseline, Hephaestus was able to perform successful bug repair. This success, albeit small, leaves the door open for other researchers to innovate unique solutions in the realm of automatic bug repair.
DOI: https://doi.org/10.1109/saner53432.2022.00096
Published: 2022-03-01
Citations: 5
A Replication Study on Predicting Metamorphic Relations at Unit Testing Level
Alejandra Duque-Torres, Dietmar Pfahl, R. Ramler, Claus Klammer
Abstract: Metamorphic Testing (MT) addresses the test oracle problem by examining the relations between inputs and outputs of test executions. Such relations are known as Metamorphic Relations (MRs). In current practice, identifying and selecting suitable MRs is usually a challenging manual task, requiring a thorough grasp of the system under test (SUT) and its application domain. Thus, Kanewala et al. proposed the Predicting Metamorphic Relations (PMR) approach to automatically suggest MRs from a list of six pre-defined MRs for testing newly developed methods. PMR is based on a classification model trained on features extracted from the control-flow graph (CFG) of 100 Java methods. In our replication study, we explore the generalizability of PMR. First, since not all details necessary for a replication are provided, we rebuild the entire preprocessing and training pipeline and repeat the original study in a close replication to verify the reported results and establish the basis for further experiments. Second, we perform a conceptual replication to explore whether the PMR model trained on CFGs from Java methods in the first step can be reused for functionally identical methods implemented in Python and C++. Finally, we retrain the model on the CFGs from the Python and C++ methods to investigate the dependence on programming language and implementation details. We were able to successfully replicate the original study, achieving comparable results for the Java method set. However, the prediction performance of the Java-based classifiers decreases significantly when they are applied to functionally equivalent Python and C++ methods, despite using only CFG features to abstract from language details. Since the performance improved again when the classifiers were retrained on the CFGs of the methods written in Python and C++, we conclude that the PMR approach can be generalized, but only when classifiers are developed starting from code artefacts in the programming language used.
DOI: https://doi.org/10.1109/SANER53432.2022.00088
Published: 2022-03-01
Citations: 1
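To make the notion of a metamorphic relation in the abstract above concrete, here is a minimal sketch that checks two relations for a simple `mean` function. The abstract does not name the six pre-defined MRs, so the permutation and addition relations below are illustrative assumptions, as are all function names.

```python
import random


def mean(values):
    """Method under test (a stand-in; any numeric aggregate would do)."""
    return sum(values) / len(values)


def check_permutation_mr(func, values, rng):
    """MR: reordering the inputs must not change the output."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    return abs(func(values) - func(shuffled)) < 1e-9


def check_addition_mr(func, values, constant):
    """MR: adding a constant to every input shifts the output by that constant."""
    shifted = [v + constant for v in values]
    return abs(func(shifted) - (func(values) + constant)) < 1e-9


rng = random.Random(0)  # fixed seed so the check is repeatable
data = [3.0, 1.5, 4.0, 1.0]
permutation_ok = check_permutation_mr(mean, data, rng)
addition_ok = check_addition_mr(mean, data, 2.0)
print(permutation_ok, addition_ok)
```

Neither check needs to know the correct mean of `data`, which is how metamorphic testing sidesteps the test oracle problem.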