2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM): Latest Publications

[Research Paper] Untangling Composite Commits Using Program Slicing
Ward Muylaert, Coen De Roover
{"title":"[Research Paper] Untangling Composite Commits Using Program Slicing","authors":"Ward Muylaert, Coen De Roover","doi":"10.1109/SCAM.2018.00030","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00030","url":null,"abstract":"Composite commits are a common mistake in the use of version control software. A composite commit groups many unrelated tasks, rendering the commit difficult for developers to understand, revert, or integrate and for empirical researchers to analyse. We propose an algorithmic foundation for tool support to identify such composite commits. Our algorithm computes both a program dependence graph and the changes to the abstract syntax tree for the files that have been changed in a commit. Our algorithm then groups these fine-grained changes according to the slices through the dependence graph they belong to. To evaluate our technique, we analyse and refine an established dataset of Java commits, the results of which we also make available. We find that our algorithm can determine whether or not a commit is composite. For the majority of commits, this analysis takes but a few seconds. The parts of a commit that our algorithm identifies do not map directly to the commit's tasks. The parts tend to be smaller, but stay within their respective tasks.","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"244 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114605187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
[Research Paper] Automatic Detection of Sources and Sinks in Arbitrary Java Libraries
Darius Sas, Marco Bessi, F. Fontana
{"title":"[Research Paper] Automatic Detection of Sources and Sinks in Arbitrary Java Libraries","authors":"Darius Sas, Marco Bessi, F. Fontana","doi":"10.1109/SCAM.2018.00019","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00019","url":null,"abstract":"In the last decade, data security has become a primary concern for an increasing amount of companies around the world. Protecting the customer's privacy is now at the core of many businesses operating in any kind of market. Thus, the demand for new technologies to safeguard user data and prevent data breaches has increased accordingly. In this work, we investigate a machine learning-based approach to automatically extract sources and sinks from arbitrary Java libraries. Our method exploits several different features based on semantic, syntactic, intra-procedural dataflow and class-hierarchy traits embedded into the bytecode to distinguish sources and sinks. The performed experiments show that, under certain conditions and after some preprocessing, sources and sinks across different libraries share common characteristics that allow a machine learning model to distinguish them from the other library methods. The prototype model achieved remarkable results of 86% accuracy and 81% F-measure on our validation set of roughly 600 methods.","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134173226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
[Research Paper] Fine-Grained Model Slicing for Rebel
R. Eilers, Jurriaan Hage, I. Prasetya, Joost Bosman
{"title":"[Research Paper] Fine-Grained Model Slicing for Rebel","authors":"R. Eilers, Jurriaan Hage, I. Prasetya, Joost Bosman","doi":"10.1109/SCAM.2018.00035","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00035","url":null,"abstract":"In this paper, we apply fine-grained slicing techniques to the models generated from the Rebel modeling language before passing them on to an SMT solver. We show that our slicing techniques have a significant positive effect on performance, allowing us to verify larger problem instances and with higher path bounds than with unsliced models. For small and shallow instances, however, the overhead of slicing dominates verification time, and slicing should not be resorted to.","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128766068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
[Research Paper] Obfuscating Java Programs by Translating Selected Portions of Bytecode to Native Libraries
Davide Pizzolotto, M. Ceccato
{"title":"[Research Paper] Obfuscating Java Programs by Translating Selected Portions of Bytecode to Native Libraries","authors":"Davide Pizzolotto, M. Ceccato","doi":"10.1109/SCAM.2018.00012","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00012","url":null,"abstract":"Code obfuscation is a popular approach to turn program comprehension and analysis harder, with the aim of mitigating threats related to malicious reverse engineering and code tampering. However, programming languages that compile to high level bytecode (e.g., Java) can be obfuscated only to a limited extent. In fact, high level bytecode still contains high level relevant information that an attacker might exploit. In order to enable more resilient obfuscations, part of these programs might be implemented with programming languages (e.g., C) that compile to low level machine-dependent code. In fact, machine code contains and leaks less high level information and it enables more resilient obfuscations. In this paper, we present an approach to automatically translate critical sections of high level Java bytecode to C code, so that more effective obfuscations can be resorted to. Moreover, a developer can still work with a single programming language, i.e., Java.","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129557689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
[Research Paper] On the Use of Machine Learning Techniques Towards the Design of Cloud Based Automatic Code Clone Validation Tools
Golam Mostaeen, Jeffrey Svajlenko, B. Roy, C. Roy, Kevin A. Schneider
{"title":"[Research Paper] On the Use of Machine Learning Techniques Towards the Design of Cloud Based Automatic Code Clone Validation Tools","authors":"Golam Mostaeen, Jeffrey Svajlenko, B. Roy, C. Roy, Kevin A. Schneider","doi":"10.1109/SCAM.2018.00025","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00025","url":null,"abstract":"A code clone is a pair of code fragments, within or between software systems that are similar. Since code clones often negatively impact the maintainability of a software system, a great many numbers of code clone detection techniques and tools have been proposed and studied over the last decade. To detect all possible similar source code patterns in general, the clone detection tools work on syntax level (such as texts, tokens, AST and so on) while lacking user-specific preferences. This often means the reported clones must be manually validated prior to any analysis in order to filter out the true positive clones from task or user-specific considerations. This manual clone validation effort is very time-consuming and often error-prone, in particular for large-scale clone detection. In this paper, we propose a machine learning based approach for automating the validation process. In an experiment with clones detected by several clone detectors in several different software systems, we found our approach has an accuracy of up to 87.4% when compared against the manual validation by multiple expert judges. The proposed method shows promising results in several comparative studies with the existing related approaches for automatic code clone validation. 
We also present our experimental results in terms of different code clone detection tools, machine learning algorithms and open source software systems.","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124496404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
[Engineering Paper] Analyzing the Evolution of Preprocessor-Based Variability: A Tale of a Thousand and One Scripts
Sandro Schulze, W. Fenske
{"title":"[Engineering Paper] Analyzing the Evolution of Preprocessor-Based Variability: A Tale of a Thousand and One Scripts","authors":"Sandro Schulze, W. Fenske","doi":"10.1109/SCAM.2018.00013","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00013","url":null,"abstract":"Highly configurable software systems allow the efficient and reliable development of similar software variants based on a common code base. The C preprocessor CPP, which uses source code annotations that enable conditional compilation, is a simple yet powerful text-based tool for implementing such systems. However, since annotations interfere with the actual source code, the CPP has often been accused of being a source of errors and increased maintenance effort. In our research, we have been curious about whether high-level patterns of CPP misuse (i.e., code smells) can be identified, how they evolve, and whether they really hinder maintenance. To support this research, we started a simple tool which over the years evolved into a powerful toolchain. This evolution was possible because our toolchain is not monolithic, but is composed of many small tools connected by scripts and communicating via files. Moreover, we reused existing tools whenever possible and developed our own solutions only as a last resort. In this paper, we report our experiences of building this toolchain. In particular, we present design decisions we made and lessons learned, both positive and negative ones. We hope that this not only stimulates discussion and (in the best case) attracts more researchers in using our tools. 
Rather, we also want to encourage others to put emphasis on building tools instead of considering them \"yet another research prototype\".","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124731240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
[Engineering Paper] Graal: The Quest for Source Code Knowledge
Valerio Cosentino, Santiago Dueñas, Ahmed Zerouali, G. Robles, Jesus M. Gonzalez-Barahona
{"title":"[Engineering Paper] Graal: The Quest for Source Code Knowledge","authors":"Valerio Cosentino, Santiago Dueñas, Ahmed Zerouali, G. Robles, Jesus M. Gonzalez-Barahona","doi":"10.1109/SCAM.2018.00021","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00021","url":null,"abstract":"Source code analysis tools are designed to analyze code artifacts with different intents, which span from improving the quality and security of the software to easing refactoring and reverse engineering activities. However, most tools do not come with features to periodically schedule their analysis or to be executed on a battery of repositories, and lack support to combine their results with other analysis tools. Thus, researchers and practitioners are often forced to develop ad-hoc scripts to meet their needs. This comes at the risk of obtaining wrong results (because of the lack of testing) and of hindering replication by other research teams. In addition, the resulting scripts are often not meant to be customized nor designed for incrementality, scalability and extensibility. In this paper we present Graal, which empowers users with a customizable, scalable and incremental approach to conduct source code analysis and enables relating the obtained results with other software project data. 
Graal leverages on and extends the functionalities of GrimoireLab, a strong free software tool developed by Bitergia, a company devoted to offer commercial software development analytics, and part of the CHAOSS project of the Linux Foundation.","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124132466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
[Research Paper] The Case for Adaptive Change Recommendation
Sydney Pugh, D. Binkley, L. Moonen
{"title":"[Research Paper] The Case for Adaptive Change Recommendation","authors":"Sydney Pugh, D. Binkley, L. Moonen","doi":"10.1109/SCAM.2018.00022","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00022","url":null,"abstract":"As the complexity of a software system grows, it becomes increasingly difficult for developers to be aware of all the dependencies that exist between artifacts (e.g., files or methods) of the system. Change impact analysis helps to overcome this problem, as it recommends to a developer relevant source-code artifacts related to her current changes. Association rule mining has shown promise in determining change impact by uncovering relevant patterns in the system's change history. State-of-the-art change impact mining algorithms typically make use of a change history of tens of thousands of transactions. For efficiency, targeted association rule mining focuses on only those transactions potentially relevant to answering a particular query. However, even targeted algorithms must consider the complete set of relevant transactions in the history. This paper presents ATARI, a new adaptive approach to association rule mining that considers a dynamic selection of the relevant transactions. It can be viewed as a further constrained version of targeted association rule mining, in which as few as a single transaction might be considered when determining change impact. Our investigation of adaptive change impact mining empirically studies seven algorithm variants. We show that adaptive algorithms are viable, can be just as applicable as the state-of-the-art complete-history algorithms, and even outperform them for certain queries. 
However, more important than the direct comparison, our investigation lays necessary groundwork for the future study of adaptive techniques and their application to challenges such as the on-the-fly style of impact analysis that is needed at the GitHub-scale.","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126371633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
[Engineering Paper] SCC: Automatic Classification of Code Snippets
Kamel Alreshedy, Dhanush Dharmaretnam, D. Germán, Venkatesh Srinivasan, T. Gulliver
{"title":"[Engineering Paper] SCC: Automatic Classification of Code Snippets","authors":"Kamel Alreshedy, Dhanush Dharmaretnam, D. Germán, Venkatesh Srinivasan, T. Gulliver","doi":"10.1109/SCAM.2018.00031","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00031","url":null,"abstract":"Determining the programming language of a source code file has been considered in the research community; it has been shown that Machine Learning (ML) and Natural Language Processing (NLP) algorithms can be effective in identifying the programming language of source code files. However, determining the programming language of a code snippet or a few lines of source code is still a challenging task. Online forums such as Stack Overflow and code repositories such as GitHub contain a large number of code snippets. In this paper, we describe Source Code Classification (SCC), a classifier that can identify the programming language of code snippets written in 21 different programming languages. A Multinomial Naive Bayes (MNB) classifier is employed which is trained using Stack Overflow posts. It is shown to achieve an accuracy of 75% which is higher than that with Programming Languages Identification (PLI-a proprietary online classifier of snippets) whose accuracy is only 55.5%. The average score for precision, recall and the F1 score with the proposed tool are 0.76, 0.75 and 0.75, respectively. 
In addition, it can distinguish between code snippets from a family of programming languages such as C, C++ and C#, and can also identify the programming language version such as C# 3.0, C# 4.0 and C# 5.0.","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134073255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
[Research Paper] Automatic Checking of Regular Expressions
E. Larson
{"title":"[Research Paper] Automatic Checking of Regular Expressions","authors":"E. Larson","doi":"10.1109/SCAM.2018.00034","DOIUrl":"https://doi.org/10.1109/SCAM.2018.00034","url":null,"abstract":"Regular expressions are extensively used to process strings. The regular expression language is concise which makes it easy for developers to use but also makes it easy for developers to make mistakes. Since regular expressions are compiled at run-time, the regular expression compiler does not give any feedback on potential errors. This paper describes ACRE - Automatic Checking of Regular Expressions. ACRE takes a regular expression as input and performs 11 different checks on the regular expression. The checks are based on common mistakes. Among the checks are checks for incorrect use of character sets (enclosed by []), wildcards (represented by .), and line anchors (^ and $). ACRE has found errors in 283 out of 826 regular expressions. Each of the 11 checks found at least seven errors. The number of false reports is moderate: 46 of the regular expressions contained a false report. ACRE is simple to use: the user enters a regular expression and presses the check button. Any violations are reported back to the user with the incorrect portion of the regular expression highlighted. 
For 9 of the 11 checks, an example accepted string is generated that further illustrates the error.","PeriodicalId":127335,"journal":{"name":"2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127488690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3