G. Robles, J. Moreno-León, Efthimia Aivaloglou, F. Hermans
{"title":"Software clones in scratch projects: on the presence of copy-and-paste in computational thinking learning","authors":"G. Robles, J. Moreno-León, Efthimia Aivaloglou, F. Hermans","doi":"10.1109/IWSC.2017.7880506","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880506","url":null,"abstract":"Computer programming is being introduced in schools worldwide as part of a movement that promotes Computational Thinking (CT) skills among young learners. In general, learners use visual, block-based programming languages to acquire these skills, with Scratch being one of the most popular ones. Similar to professional developers, learners also copy and paste their code, resulting in duplication. In this paper we present the findings of correlating the assessment of the CT skills of learners with the presence of software clones in over 230,000 projects obtained from the Scratch platform. Specifically, we investigate i) if software cloning is a widespread practice in Scratch projects, ii) if the presence of code cloning is independent of the programming mastery of learners, iii) if code cloning can be found more frequently in Scratch projects that require specific skills (such as parallelism or logical thinking), and iv) if learners who have the skills to avoid software cloning really do so. The results show that i) software cloning can be commonly found in Scratch projects, that ii) it becomes more frequent as learners work on projects that require advanced skills, that iii) no CT dimension is to be found more related to the absence of software clones than others, and iv) that learners, even if they potentially know how to avoid cloning, still copy and paste frequently. 
The insights from this paper could be used by educators and learners to determine when it is pedagogically more effective to address software cloning, by educational programming platform developers to adapt their systems, and by learning assessment tools to provide better evaluations.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128397443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing program dependency graph based clone detection using approximate subgraph matching","authors":"C. M. Kamalpriya, Paramvir Singh","doi":"10.1109/IWSC.2017.7880511","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880511","url":null,"abstract":"Software code clone detection techniques and tools play a major role in improving the software quality as well as saving maintenance cost and effort. Program Dependency Graph (PDG) based clone detection techniques have a key advantage over other techniques as they are capable of detecting non-contiguous code clones in addition to contiguous clones. We propose further enhancement to current state of the art PDG-based detection to identify all possible (exact and approximate) clone relations from the obtained clone pair (PDG-based) results using Approximate Subgraph Matching (ASM). We obtain clone results of our proposed technique on three subject software systems, and validate the results on eclipse-ant from Bellon’s benchmark. We also present a new ASM-based distance measure to represent the similarity between software code clones.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134472753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kyohei Uemura, A. Mori, Kenji Fujiwara, Eunjong Choi, Hajimu Iida
{"title":"Detecting and analyzing code clones in HDL","authors":"Kyohei Uemura, A. Mori, Kenji Fujiwara, Eunjong Choi, Hajimu Iida","doi":"10.1109/IWSC.2017.7880501","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880501","url":null,"abstract":"In this paper, we study code clones in hardware description languages (HDLs) in comparison with general programming languages. For this purpose, we have developed a method for detecting code clones in Verilog HDL. A key idea of the proposed method is to convert the Verilog HDL code into the pseudo C++ code, which is then processed by an existing code clone detector for C++. We conducted an experiment on 10 open source hardware products described in Verilog HDL, where we succeeded in detecting nearly 1,800 clone sets with approximately 80% precision. We compared code clones in Verilog HDL with those in Java/C based on the metrics to identify the differences among languages. We identified patterns on how code clones are created in Verilog HDL, which include cases for increasing stability and capability of parallel processing of the circuit.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125886252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking dependence clones","authors":"Tim A. D. Henderson, Andy Podgurski","doi":"10.1109/IWSC.2017.7880512","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880512","url":null,"abstract":"Semantic code clones are regions of duplicated code that may appear dissimilar but compute similar functions. Since in general it is algorithmically undecidable whether two or more programs compute the same function, locating all semantic code clones is infeasible. One way to dodge the undecidability issue and find potential semantic clones, using only static information, is to search for recurring subgraphs of a program dependence graph (PDG). PDGs represent control and data dependence relationships between statements or operations in a program. PDG-based clone detection techniques, unlike syntactically-based techniques, do not distinguish between code fragments that differ only because of dependence-preserving statement re-orderings, which also preserve semantics. Consequently, they detect clones that are difficult to find by other means. Despite this very desirable property, work on PDG-based clone detection has largely stalled, apparently because of concerns about the scalability of the approach. We argue, however, that the time has come to reconsider PDG-based clone detection, as a part of a holistic strategy for clone management. We present evidence that its scalability problems are not as severe as previously thought. 
This suggests the possibility of developing integrated clone management systems that fuse information from multiple clone detection methods, including PDG-based ones.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129402535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web-service for finding cloned files using b-bit minwise hashing","authors":"Kaoru Ito, T. Ishio, Katsuro Inoue","doi":"10.1109/IWSC.2017.7880504","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880504","url":null,"abstract":"Source code reuse is a common practice in software development. Since industrial developers may accidentally reuse source files developed by open source software, clone detection tools are used to detect open source files in their closed source project. To run clone detection, developers need a database of existing open source software. While a web-service providing clone detection using a centralized database is likely useful, industrial developers are not allowed to submit their source code to a public server on the Internet. To solve this problem, we employ the b-bit minwise hashing technique, which makes it possible to estimate the similarity of documents using only their hash values. Using the method, we implemented a file-clone detection web service; it takes as input a hash value of a source file and returns a list of similar source files in existing open source software. Our hash comparison method is efficient, although an estimated similarity may have a margin of error.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127838492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
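The idea in the abstract above, comparing files through compact hash values instead of their source text, can be illustrated with a minimal sketch. This is not the authors' implementation: the shingle size, the seeded `blake2b` hash family, the function names, and the simplified estimator (which assumes sparse sets, so that two unrelated b-bit values collide with probability 1/2^b) are all assumptions made for illustration.

```python
import hashlib

def tokens_to_shingles(text, k=3):
    """Split text on whitespace and return the set of k-token shingles."""
    toks = text.split()
    return {" ".join(toks[i:i + k]) for i in range(max(1, len(toks) - k + 1))}

def _hash(shingle, seed):
    """Seeded 64-bit hash; one seed stands in for one random permutation."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def bbit_signature(shingles, num_hashes=256, b=1):
    """Keep only the lowest b bits of each of the num_hashes minimum hash values."""
    mask = (1 << b) - 1
    return [min(_hash(s, seed) for s in shingles) & mask
            for seed in range(num_hashes)]

def estimate_jaccard(sig_a, sig_b, b=1):
    """Estimate Jaccard similarity from two b-bit signatures.

    Simplified estimator (assumption: sparse sets): the observed match rate
    is corrected for the 1/2^b chance that unrelated b-bit values agree.
    """
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    p = matches / len(sig_a)
    c = 1 / (1 << b)
    return max(0.0, (p - c) / (1 - c))

if __name__ == "__main__":
    a = bbit_signature(tokens_to_shingles("int add ( int a , int b ) { return a + b ; }"))
    b_ = bbit_signature(tokens_to_shingles("int add ( int x , int y ) { return x + y ; }"))
    print(estimate_jaccard(a, b_))
```

Only the signatures need to leave the developer's machine in a setup like this, which is the property the abstract relies on; the estimate carries sampling error that shrinks as `num_hashes` grows.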
Anfernee Goon, Yuhao Wu, M. Matsushita, Katsuro Inoue
{"title":"Evolution of code clone ratios throughout development history of open-source C and C++ programs","authors":"Anfernee Goon, Yuhao Wu, M. Matsushita, Katsuro Inoue","doi":"10.1109/IWSC.2017.7880509","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880509","url":null,"abstract":"A code clone is a fragment of code which is duplicated throughout the source code of a project. Code clones have been shown to make a project less maintainable because all code clones will share potential bugs and problems. Unlike other code clone research, this study analyzes the code clone ratios over the entire development lifetime of three open-source projects written in C/C++ to understand code clone growth in software over development and potential developer habits which could affect this growth. The study utilizes CCFinderX and Git to detect clone metrics across development history. The results from each project show very low, stable ratios across development history, with the code clone ratios only fluctuating greatly during the beginning of development mostly and very little refactoring occurring. This study goes further into the potential cause of low ratios and different fluctuations at different periods of development.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126909340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Refactoring patterns study in code clones during software evolution","authors":"Jaweria Kanwal, Katsuro Inoue, O. Maqbool","doi":"10.1109/IWSC.2017.7880508","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880508","url":null,"abstract":"To investigate how code clones are handled by developers when they perform refactorings during software releases, we performed a longitudinal study on different versions of five Java systems. Our results show that a small proportion of code clones are refactored during the releases and code clones of the same clone class are refactored consistently.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"283 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131853273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A technique to detect multi-grained code clones","authors":"Yusuke Yuki, Yoshiki Higo, S. Kusumoto","doi":"10.1109/IWSC.2017.7880510","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880510","url":null,"abstract":"It is said that the presence of code clones makes software maintenance more difficult. For such a reason, it is important to understand how code clones are distributed in source code. A variety of code clone detection tools has been developed before now. Recently, some researchers have detected code clones from a large set of source code to find library candidates or overlooked bugs. In general, the smaller the granularity of the detection target is, the longer the detection time. On the other hand, the larger the granularity of the detection target is, the fewer detectable code clones are. In this paper, we propose a technique that detects in order from coarse code clones to fine-grained ones. In the coarse-to-fine-grained detections, code fragments detected as code clones at a certain granularity are excluded from detection targets of more fine-grained detections. Our proposed technique can detect code clones faster than fine-grained detection techniques. Besides, it can detect more code clones than coarse detection techniques.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125230854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
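The coarse-to-fine ordering described in the abstract above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' tool: it uses only two granularities (file and method), treats exact textual equality of already-normalized text as the clone criterion, and the input model (a hypothetical `files` dictionary mapping file names to method bodies) and function names are invented for the example.

```python
from collections import defaultdict

def group_clones(units):
    """Group units (name -> normalized text) with identical text into clone sets."""
    buckets = defaultdict(list)
    for name, text in units.items():
        buckets[text].append(name)
    return [sorted(names) for names in buckets.values() if len(names) >= 2]

def coarse_to_fine(files):
    """Detect file-level clones first, then method-level clones in the remainder.

    files: dict mapping file name -> dict mapping method name -> body text
    (a hypothetical input model chosen for this sketch).
    """
    # Coarse pass: a file is represented by its method bodies in name order.
    file_units = {
        f: "\n".join(body for _, body in sorted(methods.items()))
        for f, methods in files.items()
    }
    file_clones = group_clones(file_units)
    cloned_files = {f for group in file_clones for f in group}

    # Fine pass: fragments already reported at the coarse level are excluded,
    # which is what keeps the finer (more expensive) pass small.
    method_units = {
        f"{f}::{m}": body
        for f, methods in files.items()
        if f not in cloned_files
        for m, body in methods.items()
    }
    method_clones = group_clones(method_units)
    return file_clones, method_clones

if __name__ == "__main__":
    demo = {
        "a.c": {"f": "x", "g": "y"},
        "b.c": {"f": "x", "g": "y"},
        "c.c": {"f": "return 1", "h": "p"},
        "d.c": {"f": "return 1", "k": "q"},
    }
    print(coarse_to_fine(demo))
```

One consequence of excluding whole cloned files in this sketch is that a method shared between a cloned file and an unrelated file is not reported at the fine level; whether and how the original technique handles that case is not stated in the abstract.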
{"title":"Rearranging the order of program statements for code clone detection","authors":"Yusuke Sabi, Yoshiki Higo, S. Kusumoto","doi":"10.1109/IWSC.2017.7880503","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880503","url":null,"abstract":"A code clone is a code fragment identical or similar to another code fragment in source code. Some code clones are considered a factor in bug replication and make it more difficult to maintain software. Various code clone detection tools have been proposed so far. However, in most algorithms adopted by existing clone detection tools, if program statements are reordered, they are not detected as code clones. In this research, we examined how clone detection results change by rearranging the order of program statements. We performed preprocessing to rearrange the order of program statements using a program dependency graph (PDG). We compared clone detection results with and without preprocessing. As a result, by rearranging the order of program statements, the number of detected code clones is almost the same in most projects. We classified newly detected or disappeared clones manually. From our experimental results, we show that there is no newly detected clone whose statements are reordered and that there are four disappeared clones whose statements are reordered. We think three out of the four clones occurred by copy-and-paste operations. 
Therefore, we conclude that rearranging the order of program statements is not effective to detect reordered code clones.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130898303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Does cloned code increase maintenance effort?","authors":"Manishankar Mondal, C. Roy, Kevin A. Schneider","doi":"10.1109/IWSC.2017.7880507","DOIUrl":"https://doi.org/10.1109/IWSC.2017.7880507","url":null,"abstract":"In spite of a number of in-depth investigations regarding the impact of clones in the maintenance phase, there is no concrete answer to the long-lived research question, “Does the presence of code clones increase maintenance effort?”. Existing studies have measured different change related metrics for cloned and non-cloned regions, however, no study calculates the maintenance effort spent for these code regions. In this paper, we perform an in-depth empirical study in order to compare the maintenance efforts required for cloned and non-cloned code. For the purpose of our study we implement a prototype tool which is capable of estimating the effort spent by a developer for changing a particular method. It can also predict effort that might need to be spent for making some changes to a particular method. Our estimation and prediction involve automatic extraction and analysis of the entire evolution history of a candidate software system. We applied our tool on hundreds of revisions of six open source subject systems written in three different programming languages for calculating the efforts spent for cloned and non-cloned code. According to our experimental results: (i) cloned code requires more effort in the maintenance phase than non-cloned code, and (ii) Type 2 and Type 3 clones require more effort compared to the efforts required by Type 1 clones. 
According to our findings, we should prioritize Type 2 and Type 3 clones when making clone management decisions.","PeriodicalId":222231,"journal":{"name":"2017 IEEE 11th International Workshop on Software Clones (IWSC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117008466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}