{"title":"Automatically identifying focal methods under test in unit test cases","authors":"Mohammad Ghafari, C. Ghezzi, K. Rubinov","doi":"10.1109/SCAM.2015.7335402","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335402","url":null,"abstract":"Modern iterative and incremental software development relies on continuous testing. The knowledge of test-to-code traceability links facilitates test-driven development and improves software evolution. Previous research identified traceability links between test cases and classes under test. Though this information is helpful, a finer granularity technique can provide more useful information beyond the knowledge of the class under test. In this paper, we focus on Java classes that instantiate stateful objects and propose an automated technique for precise detection of the focal methods under test in unit test cases. Focal methods represent the core of a test scenario inside a unit test case. Their main purpose is to affect an object's state that is then checked by other inspector methods whose purpose is ancillary and needs to be identified as such. Distinguishing focal from other (non-focal) methods is hard to accomplish manually. We propose an approach to detect focal methods under test automatically. 
An experimental assessment with real-world software shows that our approach identifies focal methods under test in more than 85% of cases, providing a ground for precise automatic recovery of test-to-code traceability links.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117345500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FaultBuster: An automatic code smell refactoring toolset","authors":"Gábor Szoke, Csaba Nagy, Lajos Jeno Fülöp, R. Ferenc, T. Gyimóthy","doi":"10.1109/SCAM.2015.7335422","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335422","url":null,"abstract":"One solution to prevent the quality erosion of a software product is to maintain its quality by continuous refactoring. However, refactoring is not always easy. Developers need to identify the piece of code that should be improved and decide how to rewrite it. Furthermore, refactoring can also be risky; that is, the modified code needs to be re-tested, so developers can see if they broke something. Many IDEs offer a range of refactorings to support so-called automatic refactoring, but tools which are really able to automatically refactor code smells are still under research. In this paper we introduce FaultBuster, a refactoring toolset which is able to support automatic refactoring: identifying the problematic code parts via static code analysis, running automatic algorithms to fix selected code smells, and executing integrated testing tools. At the heart of the toolset lies a refactoring framework to control the analysis and the execution of automatic algorithms. FaultBuster provides IDE plugins to interact with developers via popular IDEs (Eclipse, NetBeans and IntelliJ IDEA). 
All the tools were developed and tested in a 2-year project with 6 software development companies where thousands of code smells were identified and fixed in 5 systems having altogether over 5 million lines of code.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115106190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using changeset descriptions as a data source to assist feature location","authors":"Muslim Chochlov, M. English, J. Buckley","doi":"10.1109/SCAM.2015.7335401","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335401","url":null,"abstract":"Feature location attempts to assist developers in discovering functionality in source code. Many textual feature location techniques utilize information retrieval and rely on comments and identifiers of source code to describe software entities. An interesting alternative would be to employ the changeset descriptions of the code altered in that changeset as a data source to describe such software entities. To investigate this we implement a technique utilizing changeset descriptions and conduct an empirical study to observe this technique's overall performance. Moreover, we study how the granularity (i.e. file or method level of software entities) and changeset range inclusion (i.e. most recent or all historical changesets) affect such an approach. The results of a preliminary study with the Rhino and Mylyn Tasks systems suggest that the approach could lead to a potentially efficient feature location technique. 
They also suggest that it is advantageous in terms of the effort to configure the technique at method level granularity and that older changesets from older systems may reduce the effectiveness of the technique.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133735864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Navigating source code with words","authors":"Dawn J Lawrie, D. Binkley","doi":"10.1109/SCAM.2015.7335403","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335403","url":null,"abstract":"The hierarchical method of organizing information has proven beneficial in learning in part because it maps well onto the human brain's memory. Exploiting this organizational strategy may help engineers cope with large software systems. In fact such a strategy is already present in source code and is manifested in the class hierarchies of object-oriented programs. However, an engineer faced with fixing a bug or any similar need to locate the implementation of a particular feature in the code is less interested in the syntactic organization of the code and more interested in its conceptual organization. Therefore, a conceptual hierarchy would bring clear benefit. Fortunately, such a view can be extracted automatically from the source code. The hierarchy-generating tool HierIT performs this task using an information-theoretic approach to identify “content-bearing” words and associate them hierarchically. The resulting hierarchy enables an engineer to better understand the concepts contained in a software system. An experiment was conducted to quantitatively and qualitatively investigate the value that such hierarchies bring. The quantitative evaluation first considers the Expected Mutual Information Measure (EMIM) between the set of topic words and natural language extracted from the source code. It then considers the Best Case Tree Walk (BCTW), which captures how “expensive” it is to find interesting documents. 
Finally, the hierarchies are considered qualitatively by investigating their perceived usefulness in a case study involving three engineers.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133824684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting function purity in JavaScript","authors":"Jens Nicolay, Carlos Noguera, Coen De Roover, W. Meuter","doi":"10.1109/SCAM.2015.7335406","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335406","url":null,"abstract":"We present an approach to detect function purity in JavaScript. A function is pure if none of its applications cause observable side-effects. The approach is based on a pushdown flow analysis that besides traditional control and value flow also keeps track of write effects. To increase the precision of our purity analysis, we combine it with an intraprocedural analysis to determine freshness of variables and object references. We formalize the core aspects of our analysis, and discuss our implementation used to analyze several common JavaScript benchmarks. Experiments show that our technique is capable of detecting function purity, even in the presence of higher-order functions, dynamic property expressions, and prototypal inheritance.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123465388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ORBS and the limits of static slicing","authors":"D. Binkley, N. Gold, M. Harman, Syed S. Islam, J. Krinke, S. Yoo","doi":"10.1109/SCAM.2015.7335396","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335396","url":null,"abstract":"Observation-based slicing is a recently-introduced, language-independent slicing technique based on the dependencies observable from program behaviour. Due to the well-known limits of dynamic analysis, we may only compute an under-approximation of the true observation-based slice. However, because the observation-based slice captures all possible dependence that can be observed, even such approximations can yield insight into the limitations of static slicing. For example, a static slice, S, that is strictly smaller than the corresponding observation-based slice is potentially unsafe. We present the results of three sets of experiments on 12 different programs, including benchmarks and larger programs, which investigate the relationship between static and observation-based slicing. We show that, in extreme cases, observation-based slices can find the true minimal static slice, where static techniques cannot. For more typical cases, our results illustrate the potential for observation-based slicing to highlight limitations in static slicers. 
Finally, we report on the sensitivity of observation-based slicing to test quality.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125765855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can the use of types and query expansion help improve large-scale code search?","authors":"Otávio Augusto Lazzarini Lemos, A. C. D. Paula, Hitesh Sajnani, C. Lopes","doi":"10.1109/SCAM.2015.7335400","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335400","url":null,"abstract":"With the open source code movement, code search with the intent of reuse has become increasingly popular. So much so that researchers have been calling it the new facet of software reuse. Although code search differs from general-purpose document search in essential ways, most tools still rely mainly on keywords matched against source code text. Recently, researchers have proposed more sophisticated ways to perform code search, such as including interface definitions in the queries (e.g., return and parameter types of the desired function, along with keywords; called here Interface-Driven Code Search - IDCS). However, to the best of our knowledge, there are few empirical studies that compare traditional keyword-based code search (KBCS) with more advanced approaches such as IDCS. In this paper we describe an experiment that compares the effectiveness of KBCS with IDCS in the task of large-scale code search of auxiliary functions implemented in Java. We also measure the impact of query expansion based on types and WordNet on both approaches. Our experiment involved 36 subjects that produced real-world queries for 16 different auxiliary functions and a repository with more than 2,000,000 Java methods. Results show that the use of types can improve recall and the number of relevant functions returned (#RFR) when combined with query expansion (~30% improvement in recall, and ~43% improvement in #RFR). 
However, a more detailed analysis suggests that in some situations it is best to use keywords only, in particular when these are sufficient to semantically define the desired function.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122324821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"When code smells twice as much: Metric-based detection of variability-aware code smells","authors":"W. Fenske, Sandro Schulze, Daniel Meyer, G. Saake","doi":"10.1109/SCAM.2015.7335413","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335413","url":null,"abstract":"Code smells are established, widely used characterizations of shortcomings in design and implementation of software systems. As such, they have been subject to intensive research regarding their detection and impact on understandability and changeability of source code. However, current methods do not support highly configurable software systems, that is, systems that can be customized to fit a wide range of requirements or platforms. Such systems commonly owe their configurability to conditional compilation based on C preprocessor annotations (a.k.a. #ifdefs). Since annotations directly interact with the host language (e.g., C), they may have adverse effects on understandability and changeability of source code, referred to as variability-aware code smells. In this paper, we propose a metric-based method that integrates source code and C preprocessor annotations to detect such smells. We evaluate our method for one specific smell on five open-source systems of medium size, thus, demonstrating its general applicability. 
Moreover, we manually reviewed 100 instances of the smell and provide a qualitative analysis of its potential impact as well as common causes for the occurrence.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134393990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The use of C++ exception handling constructs: A comprehensive study","authors":"R. Bonifácio, Fausto Carvalho, G. N. Ramos, U. Kulesza, Roberta Coelho","doi":"10.1109/SCAM.2015.7335398","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335398","url":null,"abstract":"Exception handling (EH) is a well-known mechanism that aims at improving software reliability in a modular way - allowing a better separation between the code that deals with exceptional conditions and the code that deals with the normal control flow of a program. Although the exception handling mechanism was conceived almost 40 years ago, formulating a reasonable design of exception handling code is still considered a challenge, which might hinder its widespread use. This paper reports the results of an empirical study that uses a mixed-method approach to investigate the adoption of the exception handling mechanism in C++. Firstly, we carried out a static analysis investigation to understand how developers employ the exception handling construct of C++, considering 65 open-source systems (which comprise 34 million lines of C++ code overall). Then, to better understand the findings from the static analysis phase, we conducted a survey involving 145 C++ developers who have contributed to the subject systems. Some of the findings consistently detected during this mixed-method study reveal that, for several projects, the use of exception handling constructs is scarce and developers favor the use of other strategies to deal with exceptional conditions. 
In addition, the survey respondents consider that incompatibility with existing C code and libraries, extra performance costs (in terms of response time and size of the compiled code), and lack of expertise to design an exception handling strategy are among the reasons for avoiding the use of exception handling constructs.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117281178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The impact of cross-distribution bug duplicates, empirical study on Debian and Ubuntu","authors":"Vincent Boisselle, Bram Adams","doi":"10.1109/SCAM.2015.7335409","DOIUrl":"https://doi.org/10.1109/SCAM.2015.7335409","url":null,"abstract":"Although open source distributions like Debian and Ubuntu are closely related, sometimes a bug reported in the Debian bug repository is reported independently in the Ubuntu repository as well, without Ubuntu users or developers being aware of it. Such cases of undetected cross-distribution bug duplicates can cause developers and users to lose precious time working on a fix that already exists or to work individually instead of collaborating to find a fix faster. We perform a case study on the Ubuntu and Debian bug repositories to measure the amount of cross-distribution bug duplicates and estimate the amount of time lost. By adapting an existing within-project duplicate detection approach (achieving a similar recall of 60%), we find 821 cross-duplicates. The early detection of such duplicates could reduce the time lost by users waiting for a fix by a median of 38 days. Furthermore, we estimate that developers from the different distributions lose a median of 47 days in which they could have collaborated together, had they been aware of duplicates. 
These results show the need to detect and monitor cross-distribution duplicates.","PeriodicalId":192232,"journal":{"name":"2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115687215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}