2010 10th IEEE Working Conference on Source Code Analysis and Manipulation: Latest Publications

Language-Independent Clone Detection Applied to Plagiarism Detection

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1109/SCAM.2010.19

Romain Brixtel, Mathieu Fontaine, Boris Lesner, Cyril Bazin, R. Robbes

Abstract: Clone detection is usually applied in the context of detecting small- to medium-scale fragments of duplicated code in large software systems. In this paper, we address the problem of clone detection applied to plagiarism detection in the context of source code assignments done by computer science students. Plagiarism detection comes with a set of constraints distinct from those of usual clone detection approaches, which influenced the design of the approach we present in this paper. For instance, the source code can be heavily changed at a superficial level (in an attempt to look genuine), yet be functionally very similar. Since assignments turned in by computer science students can be in a variety of languages, we work at the syntactic level and do not consider the source-code semantics. Consequently, the approach we propose is endogenous and makes no assumption about the programming language being analysed. It is based on an alignment method using the parallel principle at local resolution (character level) to compute similarities between documents. We tested our framework on hundreds of real source files, involving a wide array of programming languages (Java, C, Python, PHP, Haskell, bash). Our approach allowed us to discover previously undetected frauds, and to empirically evaluate its accuracy and robustness.
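
The abstract above describes comparing documents at the character level without any knowledge of the programming language. It does not spell out the alignment algorithm, so the following is only a minimal sketch of that general idea: it substitutes Python's standard `difflib.SequenceMatcher` for the authors' alignment method, and the function name `char_similarity` and the sample snippets are hypothetical.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] computed over raw characters.

    This is a stand-in for the paper's alignment method, not a
    reimplementation of it.
    """
    # autojunk=False disables the heuristic that skips very frequent characters.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Illustrative inputs: the second snippet only renames identifiers.
original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def somme(vals):\n    acc = 0\n    for v in vals:\n        acc += v\n    return acc\n"
unrelated = "print('hello world')\n"

score_renamed = char_similarity(original, renamed)
score_unrelated = char_similarity(original, unrelated)
```

Because the comparison never tokenizes or parses its input, the same function applies unchanged to Java, C, Haskell, or shell sources; the trade-off is that it sees only surface text and no semantics.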

Citations: 61

How Good is Static Analysis at Finding Concurrency Bugs?

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1109/SCAM.2010.26

Devin Kester, Martin Mwebesa, J. S. Bradbury

Citations: 19

MemSafe: Ensuring the Spatial and Temporal Memory Safety of C at Runtime

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1002/spe.2105

Matthew S. Simpson, R. Barua

Abstract: Memory access violations are a leading source of unreliability in C programs. As evidence of this problem, a variety of methods exist that retrofit C with software checks to detect memory errors at runtime. However, these methods generally suffer from one or more drawbacks including the inability to detect all errors, the use of incompatible metadata, the need for manual code modifications, and high runtime overheads. In this paper, we present a compiler analysis and transformation for ensuring the memory safety of C called MemSafe. MemSafe makes several novel contributions that improve upon previous work and lower the cost of safety. These include (1) a method for modeling temporal errors as spatial errors, (2) a metadata representation that combines features of both object- and pointer-based approaches, and (3) a dataflow representation that simplifies optimizations for removing unneeded checks. MemSafe is capable of detecting real errors with lower overheads than previous efforts. Experimental results show that MemSafe detects all memory errors in 6 programs with known violations and ensures complete safety with an average overhead of 87% on 30 large programs widely used in evaluating error detection tools.
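
Contribution (1) above, modeling temporal errors as spatial errors, can be illustrated with a toy simulation. The sketch below is an assumption about one way such a scheme can work: freeing an object replaces its bounds with an empty range, so an ordinary bounds check also rejects use-after-free. The `Bounds` and `CheckedHeap` names, the address values, and the metadata layout are hypothetical and do not reproduce MemSafe's actual representation.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    """Hypothetical metadata: valid addresses satisfy base <= addr < bound."""
    base: int
    bound: int


class CheckedHeap:
    """Toy allocator used only to illustrate the idea; not MemSafe's design."""

    def __init__(self):
        self._next = 0x1000  # arbitrary starting address
        self._meta = {}      # maps pointer value -> Bounds

    def malloc(self, size):
        ptr = self._next
        self._next += size
        self._meta[ptr] = Bounds(ptr, ptr + size)
        return ptr

    def free(self, ptr):
        # Express the temporal event spatially: the valid range becomes empty.
        self._meta[ptr] = Bounds(0, 0)

    def check(self, ptr, offset, width=1):
        """One spatial check that also fails for accesses through freed pointers."""
        b = self._meta[ptr]
        addr = ptr + offset
        return b.base <= addr and addr + width <= b.bound


heap = CheckedHeap()
p = heap.malloc(8)
in_bounds = heap.check(p, 7)       # last valid byte
out_of_bounds = heap.check(p, 8)   # one past the end
heap.free(p)
after_free = heap.check(p, 0)      # dangling access, caught by the same check
```

The point of the design is that a single kind of check covers both error classes, which leaves one uniform target for optimizations that remove redundant checks.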

Citations: 87

Parallel Reachability and Escape Analyses

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1109/SCAM.2010.10

Marcus Edvinsson, Jonas Lundberg, Welf Löwe

Citations: 3

Deriving Coupling Metrics from Call Graphs

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1109/SCAM.2010.25

Simon Allier, S. Vaucher, Bruno Dufour, H. Sahraoui

Citations: 18

Recovering the Memory Behavior of Executable Programs

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1109/SCAM.2010.18

A. Ketterlin, P. Clauss

Abstract: This paper deals with the binary analysis of executable programs, with the goal of understanding how they access memory. It explains how to statically build a formal model of all memory accesses. Starting with a control-flow graph of each procedure, well-known techniques are used to structure this graph into a hierarchy of loops in all cases. The paper shows that much more information can be extracted by performing a complete data-flow analysis over machine registers after the program has been put in static single assignment (SSA) form. By using the SSA form, registers used in addressing memory can be symbolically expressed in terms of other, previously set registers. By including the loop structures in the analysis, loop indices and trip counts can also often be expressed symbolically. The whole process produces a formal model made of loops where memory accesses are linear expressions of loop counters and registers. The paper provides a quantitative evaluation of the results when applied to several dozen SPEC benchmark programs. Because static analysis is often incomplete, the paper ends by describing a lightweight instrumentation strategy that collects at run time enough information to complete the program's symbolic description.
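
To make the shape of the resulting model concrete, here is a small sketch of a memory access written as a linear expression of loop counters, as the abstract describes. It shows only what such a model entry might look like once recovered; it performs no binary analysis. The `AffineAccess` type, the `addresses` helper, and the numeric values are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class AffineAccess:
    """Hypothetical model entry: address = base + sum(coeffs[k] * counter[k])."""
    base: int
    coeffs: tuple


def addresses(access, trip_counts):
    """Enumerate addresses touched by a loop nest with the given trip counts."""
    for counters in product(*(range(n) for n in trip_counts)):
        yield access.base + sum(c * i for c, i in zip(access.coeffs, counters))


# Illustrative values: a row-major 2x3 array of 4-byte elements at 0x1000.
# The outer counter advances by one row (12 bytes), the inner by one element.
row_major = AffineAccess(base=0x1000, coeffs=(12, 4))
touched = list(addresses(row_major, trip_counts=(2, 3)))
```

In this example the two-level loop nest visits six consecutive 4-byte slots, which is the kind of regularity a symbolic model with known trip counts makes explicit.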

Citations: 2

Reconstruction of Composite Types for Decompilation

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1109/SCAM.2010.24

K. Troshina, Yegor Derevenets, A. Chernov

Abstract: Decompilation is the reconstruction of a program in a high-level language from a program in a low-level language. This paper presents a method for automatic reconstruction of composite types (structures, arrays and combinations of them) in a high-level program during decompilation. Assembly code is obtained by disassembling binary code or traces collected by a simulator. The proposed method is based on expressing memory access operations as (base, offset) pairs, then building equivalence classes for the bases used in the program and accumulating offsets for each equivalence class. For strictly conforming C programs, our approach is substantiated by the C language semantics as defined in the international standard. However, experimental results have revealed that it is also applicable to real-world programs. Experimental results are obtained for a number of open-source programs as well as for traces collected from them. The method is an essential part of TyDec, the program decompilation tool being developed by the authors. The TyDec decompiler can be used as a standalone tool or as a plug-in for the Interactive Trace Explorer TrEx, being developed at the Institute for System Programming, Russian Academy of Sciences.
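
The core bookkeeping described above (memory accesses as base and offset pairs, equivalence classes over bases, offsets accumulated per class) can be sketched with a union-find structure. This is an illustration under assumptions, not TyDec's implementation: the abstract does not say how equivalence between bases is established, so the `same` method, the `BaseClasses` name, and the register names in the usage lines are hypothetical.

```python
from collections import defaultdict


class BaseClasses:
    """Union-find over base names; each class accumulates the offsets seen."""

    def __init__(self):
        self._parent = {}
        self._offsets = defaultdict(set)

    def find(self, base):
        """Return the representative of the class containing base."""
        self._parent.setdefault(base, base)
        while self._parent[base] != base:
            # Path halving keeps the trees shallow.
            self._parent[base] = self._parent[self._parent[base]]
            base = self._parent[base]
        return base

    def access(self, base, offset):
        """Record a memory access expressed as a (base, offset) pair."""
        self._offsets[self.find(base)].add(offset)

    def same(self, a, b):
        """Merge two bases assumed to denote the same object (e.g. after a copy)."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self._parent[rb] = ra
            self._offsets[ra] |= self._offsets.pop(rb, set())

    def layout(self, base):
        """Sorted offsets of base's class: candidate field offsets of a structure."""
        return sorted(self._offsets[self.find(base)])


bc = BaseClasses()
bc.access("eax", 0)
bc.access("eax", 4)
bc.access("ebx", 8)
bc.same("eax", "ebx")       # assumed evidence that both registers hold one pointer
fields = bc.layout("ebx")
```

After the merge, the offsets observed through either register describe one candidate structure, which a later pass could turn into field declarations.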

Citations: 18

Subclass Instantiation Distribution

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1109/SCAM.2010.12

Amy Wheeler, D. Binkley

Citations: 0

AMBIDEXTER: Practical Ambiguity Detection

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1109/SCAM.2010.21

Bas Basten, T. Storm

Citations: 10

Estimating the Optimal Number of Latent Concepts in Source Code Analysis

2010 10th IEEE Working Conference on Source Code Analysis and Manipulation Pub Date : 2010-09-12 DOI: 10.1109/SCAM.2010.22

Scott Grant, J. Cordy

Citations: 69