2010 10th IEEE Working Conference on Source Code Analysis and Manipulation: Latest Articles

Language-Independent Clone Detection Applied to Plagiarism Detection
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1109/SCAM.2010.19
Romain Brixtel, Mathieu Fontaine, Boris Lesner, Cyril Bazin, R. Robbes
Abstract: Clone detection is usually applied in the context of detecting small- to medium-scale fragments of duplicated code in large software systems. In this paper, we address the problem of clone detection applied to plagiarism detection in the context of source code assignments done by computer science students. Plagiarism detection comes with a set of constraints distinct from those of usual clone detection approaches, which influenced the design of the approach we present in this paper. For instance, the source code can be heavily changed at a superficial level (in an attempt to look genuine), yet be functionally very similar. Since assignments turned in by computer science students can be in a variety of languages, we work at the syntactic level and do not consider the source-code semantics. Consequently, the approach we propose is endogenous and makes no assumption about the programming language being analysed. It is based on an alignment method using the parallel principle at local resolution (character level) to compute similarities between documents. We tested our framework on hundreds of real source files, involving a wide array of programming languages (Java, C, Python, PHP, Haskell, bash). Our approach allowed us to discover previously undetected frauds, and to empirically evaluate its accuracy and robustness.
Citations: 61
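The abstract above describes computing document similarity by character-level alignment without any language-specific parsing. As a rough illustration of that general idea only, the sketch below uses the classic Smith-Waterman local alignment as a stand-in; it is not the authors' algorithm, and the function names, scoring weights, and normalisation are assumptions made for this example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score between strings a and b.

    The weights (match, mismatch, gap) are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for ca in a:
        cur = [0]  # column 0 is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def similarity(a, b, match=2):
    """Normalise the local score by the best score the shorter text could reach."""
    if not a or not b:
        return 0.0
    return local_alignment_score(a, b, match=match) / (match * min(len(a), len(b)))
```

Because the comparison works on raw characters, the same function can be applied to files in any language; identical texts score 1.0 and texts with no characters in common score 0.0.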
How Good is Static Analysis at Finding Concurrency Bugs?
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1109/SCAM.2010.26
Devin Kester, Martin Mwebesa, J. S. Bradbury
Abstract: Detecting bugs in concurrent software is challenging due to the many different thread interleavings. Dynamic analysis and testing solutions to bug detection are often costly as they need to provide coverage of the interleaving space in addition to traditional black box or white box coverage. An alternative to dynamic analysis detection of concurrency bugs is the use of static analysis. This paper examines the use of three static analysis tools (FindBugs, JLint and Chord) in order to assess each tool's ability to find concurrency bugs and to identify the percentage of spurious results produced. The empirical data presented is based on an experiment involving 12 concurrent Java programs.
Citations: 19
MemSafe: Ensuring the Spatial and Temporal Memory Safety of C at Runtime
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1002/spe.2105
Matthew S. Simpson, R. Barua
Abstract: Memory access violations are a leading source of unreliability in C programs. As evidence of this problem, a variety of methods exist that retrofit C with software checks to detect memory errors at runtime. However, these methods generally suffer from one or more drawbacks including the inability to detect all errors, the use of incompatible metadata, the need for manual code modifications, and high runtime overheads. In this paper, we present a compiler analysis and transformation for ensuring the memory safety of C called MemSafe. MemSafe makes several novel contributions that improve upon previous work and lower the cost of safety. These include (1) a method for modeling temporal errors as spatial errors, (2) a metadata representation that combines features of both object- and pointer-based approaches, and (3) a dataflow representation that simplifies optimizations for removing unneeded checks. MemSafe is capable of detecting real errors with lower overheads than previous efforts. Experimental results show that MemSafe detects all memory errors in 6 programs with known violations and ensures complete safety with an average overhead of 87% on 30 large programs widely used in evaluating error detection tools.
Citations: 87
Parallel Reachability and Escape Analyses
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1109/SCAM.2010.10
Marcus Edvinsson, Jonas Lundberg, Welf Löwe
Abstract: Static program analysis usually consists of a number of steps, each producing partial results. For example, the points-to analysis step, calculating object references in a program, usually just provides the input for larger client analyses like reachability and escape analyses. All these analyses are computationally intense and it is therefore vital to create parallel approaches that make use of the processing power that comes from multiple cores in modern desktop computers. The present paper presents two parallel approaches to increase the efficiency of reachability analysis and escape analysis, based on a parallel points-to analysis. The experiments show that the two parallel approaches achieve a speed-up of 1.5 for reachability analysis and 3.8 for escape analysis on 8 cores for a benchmark suite of Java programs.
Citations: 3
Deriving Coupling Metrics from Call Graphs
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1109/SCAM.2010.25
Simon Allier, S. Vaucher, Bruno Dufour, H. Sahraoui
Abstract: Coupling metrics play an important role in empirical software engineering research as well as in industrial measurement programs. The existing coupling metrics have usually been defined in a way that they can be computed from a static analysis of the source code. However, modern programs extensively use dynamic language features such as polymorphism and dynamic class loading that are difficult to capture by static analysis. Consequently, the derived metric values might not accurately reflect the state of a program. In this paper, we express existing definitions of coupling metrics using call graphs. We then compare the results of four different call graph construction algorithms with standard tool implementations of these metrics in an empirical study. Our results show important variations in coupling between standard and call graph-based calculations due to the support of dynamic features.
Citations: 18
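To make the idea of deriving a coupling metric from a call graph concrete, the sketch below counts, for each class, how many other classes it is connected to by at least one call edge. This is a CBO-like count chosen for illustration; the abstract does not say which metric definitions the paper uses, and the `Class.method` naming scheme, the function name, and the example edges are all assumptions.

```python
from collections import defaultdict


def coupling_from_call_graph(edges):
    """Count coupled classes per class from (caller, callee) method edges.

    Methods are named 'Class.method' (an assumed convention). Calls within
    one class do not contribute to coupling.
    """
    coupled = defaultdict(set)
    for caller, callee in edges:
        a = caller.split(".", 1)[0]
        b = callee.split(".", 1)[0]
        if a != b:
            coupled[a].add(b)  # coupling is treated as symmetric here
            coupled[b].add(a)
    return {cls: len(others) for cls, others in coupled.items()}


# Hypothetical polymorphic call site: a declared-type view sees one target,
# while a more precise call graph resolves two concrete receivers.
declared_edges = [("Canvas.paint", "Shape.draw")]
resolved_edges = [("Canvas.paint", "Circle.draw"), ("Canvas.paint", "Square.draw")]
```

With these hypothetical edges, `Canvas` is coupled to one class under the declared-type view and to two under the resolved view, which is the kind of variation between calculations that the abstract reports.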
Recovering the Memory Behavior of Executable Programs
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1109/SCAM.2010.18
A. Ketterlin, P. Clauss
Abstract: This paper deals with the binary analysis of executable programs, with the goal of understanding how they access memory. It explains how to statically build a formal model of all memory accesses. Starting with a control-flow graph of each procedure, well-known techniques are used to structure this graph into a hierarchy of loops in all cases. The paper shows that much more information can be extracted by performing a complete data-flow analysis over machine registers after the program has been put in static single assignment (SSA) form. By using the SSA form, registers used in addressing memory can be symbolically expressed in terms of other, previously set registers. By including the loop structures in the analysis, loop indices and trip counts can also often be expressed symbolically. The whole process produces a formal model made of loops where memory accesses are linear expressions of loop counters and registers. The paper provides a quantitative evaluation of the results when applied to several dozen SPEC benchmark programs. Because static analysis is often incomplete, the paper ends by describing a lightweight instrumentation strategy that collects at run time enough information to complete the program's symbolic description.
Citations: 2
Reconstruction of Composite Types for Decompilation
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1109/SCAM.2010.24
K. Troshina, Yegor Derevenets, A. Chernov
Abstract: Decompilation is reconstruction of a program in a high-level language from a program in a low-level language. This paper presents a method for automatic reconstruction of composite types (structures, arrays and combinations of them) in a high-level program during decompilation. Assembly code is obtained by disassembling a binary code or traces collected by a simulator. The proposed method is based on expressing memory access operations as (base, offset) pairs, then building equivalence classes for the bases used in the program and accumulating offsets for each equivalence class. For strictly conforming C programs our approach is substantiated by the C language semantics as defined in the international standard. However, experimental results have revealed that it is applicable to real-world programs also. Experimental results are obtained for a number of open-source programs as well as for traces collected from them. The method is an essential part of the tool for program decompilation TyDec being developed by the authors. Decompiler TyDec can be used as a standalone tool or as a plug-in for Interactive Trace Explorer TrEx being developed at the Institute for System Programming, Russian Academy of Sciences.
Citations: 18
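The abstract above outlines three steps: express each memory access as a base and an offset, group equivalent bases into classes, and accumulate the offsets seen for each class. A minimal sketch of that outline follows, using a union-find structure for the equivalence classes. It is an assumption-laden illustration rather than TyDec's implementation: the input format, the rule that a register copy makes two bases equivalent, and the register names are all hypothetical.

```python
class UnionFind:
    """Disjoint-set structure used to group equivalent base registers."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def accumulate_offsets(accesses, copies):
    """Collect the offsets accessed through each class of equivalent bases.

    accesses: iterable of (base, offset) pairs, one per memory access.
    copies:   iterable of (dst, src) pairs assumed to make two bases equivalent.
    Returns a dict mapping a class representative to its sorted offsets.
    """
    uf = UnionFind()
    for dst, src in copies:
        uf.union(dst, src)
    offsets = {}
    for base, off in accesses:
        offsets.setdefault(uf.find(base), set()).add(off)
    return {root: sorted(offs) for root, offs in offsets.items()}


# Hypothetical trace: ebx is a copy of eax, so their accesses share one class.
example = accumulate_offsets(
    accesses=[("eax", 0), ("ebx", 4), ("eax", 8), ("ecx", 0)],
    copies=[("ebx", "eax")],
)
```

In this hypothetical trace the offsets 0, 4 and 8 end up in one class, which a decompiler could then read as candidate field positions of a single structure.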
Subclass Instantiation Distribution
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1109/SCAM.2010.12
Amy Wheeler, D. Binkley
Abstract: During execution, an object-oriented program typically creates a large number of objects. This research considers the distribution of those objects that share a common superclass. If this distribution is uniform then all subclasses are equally likely to be instantiated. However, if not, then the lack of uniformity can be exploited by giving preferential treatment to the dominant class (or classes). For example, a tester might spend greater testing resources on the dominant class while an engineer refactoring the code might begin with a more dominant class. An experiment designed to investigate the distribution of subclass instantiations was performed using eight Java programs containing almost half a million lines of code and just over three thousand classes. The results show that outside a few infrequent instances, most distributions are heavily skewed.
Citations: 0
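The notion of a skewed subclass distribution from the abstract above can be made concrete with a small sketch: given a log of which subclass each created object belongs to, compute each subclass's share and compare the largest share with the uniform baseline. The function names, the log format, and the sample class names are assumptions for illustration; the paper's own measurement procedure is not described in the abstract.

```python
from collections import Counter


def instantiation_shares(instantiated_classes):
    """Fraction of instantiations per subclass, most frequent first."""
    counts = Counter(instantiated_classes)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.most_common()}


def dominant_share(instantiated_classes):
    """Share held by the most frequently instantiated subclass (0.0 if empty)."""
    shares = instantiation_shares(instantiated_classes)
    return max(shares.values()) if shares else 0.0


# Hypothetical log for three subclasses of one superclass.
log = ["ArrayList"] * 8 + ["LinkedList"] + ["Vector"]
uniform_baseline = 1 / len(set(log))  # share each subclass would hold if uniform
```

A dominant share well above the uniform baseline indicates the kind of skew the abstract suggests exploiting when prioritising testing or refactoring effort.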
AMBIDEXTER: Practical Ambiguity Detection
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1109/SCAM.2010.21
Bas Basten, T. Storm
Abstract: Ambiguity detection tools try to statically track down ambiguities in context-free grammars. Current ambiguity detection tools, however, either are too slow for large realistic cases, or produce incomprehensible ambiguity reports. AmbiDexter is the ambiguity tool to have your cake and eat it too.
Citations: 10
Estimating the Optimal Number of Latent Concepts in Source Code Analysis
2010 10th IEEE Working Conference on Source Code Analysis and Manipulation | Pub Date: 2010-09-12 | DOI: 10.1109/SCAM.2010.22
Scott Grant, J. Cordy
Abstract: The optimal number of latent topics required to model the most accurate latent substructure for a source code corpus is an open question in source code analysis. Most estimates about the number of latent topics that exist in a software corpus are based on the assumption that the data is similar to natural language, but there is little empirical evidence to support this. In order to help determine the appropriate number of topics needed to accurately represent the source code, we generate a series of Latent Dirichlet Allocation models with varying topic counts. We use a heuristic to evaluate the ability of the model to identify related source code blocks, and demonstrate the consequences of choosing too few or too many latent topics.
Citations: 69