{"title":"The Concept of Stratified Sampling of Execution Traces","authors":"Heidar Pirzadeh, Sara Shanian, A. Hamou-Lhadj, A. Mehrabian","doi":"10.1109/ICPC.2011.17","DOIUrl":"https://doi.org/10.1109/ICPC.2011.17","url":null,"abstract":"Execution traces can be overwhelmingly large. To reduce their size, sampling techniques, especially the ones based on random sampling, have been extensively used. Random sampling, however, may result in samples that are not representative of the original trace. We propose a trace sampling framework based on stratified sampling that not only reduces the size of a trace but also results in a sample that is representative of the original trace by ensuring that the desired characteristics of an execution are distributed similarly in both the sampled and the original trace.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"109 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125849792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Framework API Evolution as a Multi-objective Optimization Problem","authors":"Wei Wu","doi":"10.1109/ICPC.2011.43","DOIUrl":"https://doi.org/10.1109/ICPC.2011.43","url":null,"abstract":"Today's software development depends greatly on frameworks and libraries. When their APIs evolve, developers must update their programs accordingly. Existing approaches facilitate the upgrading process by generating change -- rules based on various input data, such call dependency, text similarity, software metrics, etc. However, existing approaches do not provide 100% precision and recall because of the limited set of input data that they use to generate change -- rules. For example, an approach only considering text similarity usually discovers less change -- rules then that considering both text similarity and call dependency with similar precision. But adding more input data may increase the complexity of the change -- rule generating algorithms and make them unpractical. We propse MOFAE (Multi-Objective Framework API Evolution) by modeling framework API evolution as multi-objective optimization problem to take more input data into account while generating change -- rules and to control the algorithmic complexity.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130240022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conflict-Aware Optimal Scheduling of Code Clone Refactoring: A Constraint Programming Approach","authors":"M. Zibran, C. Roy","doi":"10.1109/ICPC.2011.45","DOIUrl":"https://doi.org/10.1109/ICPC.2011.45","url":null,"abstract":"Duplicated code, also known as code clones, are one of the malicious `code smells' that often need to be removed through refactoring for enhancing maintainability. Among all the potential refactoring opportunities, the choice and order of a set of refactoring activities may have distinguishable effect on the design/code quality. Moreover, there may be dependencies and conflicts among those refactorings. The organization may also impose priorities on certain refactoring activities. Addressing all these conflicts, priorities, and dependencies, manual formulation of an optimal refactoring schedule is very expensive, if not impossible. Therefore, an automated refactoring scheduler is necessary, which will maximize benefit and minimize refactoring effort. In this paper, we present a refactoring effort model, and propose a constraint programming approach for conflict-aware optimal scheduling of code clone refactoring.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131287550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can Better Identifier Splitting Techniques Help Feature Location?","authors":"Bogdan Dit, Latifa Guerrouj, D. Poshyvanyk, G. Antoniol","doi":"10.1109/ICPC.2011.47","DOIUrl":"https://doi.org/10.1109/ICPC.2011.47","url":null,"abstract":"The paper presents an exploratory study of two feature location techniques utilizing three strategies for splitting identifiers: Camel Case, Samurai and manual splitting of identifiers. The main research question that we ask in this study is if we had a perfect technique for splitting identifiers, would it still help improve accuracy of feature location techniques applied in different scenarios and settings? In order to answer this research question we investigate two feature location techniques, one based on Information Retrieval and the other one based on the combination of Information Retrieval and dynamic analysis, for locating bugs and features using various configurations of preprocessing strategies on two open-source systems, Rhino and jEdit. The results of an extensive empirical evaluation reveal that feature location techniques using Information Retrieval can benefit from better preprocessing algorithms in some cases, and that their improvement in effectiveness while using manual splitting over state-of-the-art approaches is statistically significant in those cases. However, the results for feature location technique using the combination of Information Retrieval and dynamic analysis do not show any improvement while using manual splitting, indicating that any preprocessing technique will suffice if execution data is available. 
Overall, our findings outline potential benefits of putting additional research efforts into defining more sophisticated source code preprocessing techniques as they can still be useful in situations where execution information cannot be easily collected.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130326956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MTF: A Scalable Exchange Format for Traces of High Performance Computing Systems","authors":"L. Alawneh, A. Hamou-Lhadj","doi":"10.1109/ICPC.2011.15","DOIUrl":"https://doi.org/10.1109/ICPC.2011.15","url":null,"abstract":"Execution traces generated from running high performance computing applications (HPC) may reach tens or hundreds of gigabytes. The trace data can be used for visualization, analysis of profiling information about the target system. However, in order to make the utilization of this data efficient, the trace needs to be represented in a structure that facilitates the access to its data. One important factor that should be considered when representing trace data is scalability; the trace met model should be able to represent the trace in a compact form that enables scalability of the analysis tools. Additionally, a trace file needs to be available in a format that is well-known in the software engineering area by making it open. In this paper, we propose a metamodel for representing dynamic information generated from HPC that use the MPI standard as the inter-process communication model. MPI Trace Format (MTF) is meant to meet the aforementioned requirements and is intended to facilitate the interoperability among different trace analysis tools.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130803609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Faceted Navigation for Software Exploration","authors":"Nan Niu, Anas Mahmoud, Xiaoyong Yang","doi":"10.1109/ICPC.2011.18","DOIUrl":"https://doi.org/10.1109/ICPC.2011.18","url":null,"abstract":"Much of developers' time is spent in exploring and understanding an unfamiliar software space. In this paper, we present a novel approach that characterizes the code fragments along several orthogonal dimensions in order for developers to navigate complex software spaces in a flexible manner. Central to our approach are hierarchical faceted categories (HFC), which have become especially successful in supporting exploratory web search activities. We apply the HFC approach for exploring a sizeable open-source software system. Our preliminary evaluation shows that HFC are promising in supporting software exploration tasks.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133729600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DebCheck: Efficient Checking for Open Source Code Clones in Software Systems","authors":"J. Cordy, C. Roy","doi":"10.1109/ICPC.2011.27","DOIUrl":"https://doi.org/10.1109/ICPC.2011.27","url":null,"abstract":"The problem of finding code cloned from open source code in software systems is of interest both to the open source community (e.g., for GPL and other open source license enforcement) and the industrial community (e.g., to prevent GPL \"contamination\" of proprietary commercial software systems). The largest collection of open source software in general distribution is the collection of eight DVDs in the Debian source distribution, and checking for cross-cloning with the Debian source distribution goes a long way towards finding any possible copying from the set of all open source code in the world. The NiCad clone detector is an open source language- sensitive robust clone detector that has been shown to yield both high precision and high recall in detecting syntactically meaningful near-miss clones such as functions and blocks. Given a directory of new source code to check, DebCheck uses NiCad in its incremental mode to efficiently check the system for near-miss clones of C functions in the entire Debian source base in a few minutes on a 2 Gb home computer. 
The same technique can be used to check systems for cross-clones with any large source collection.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116131604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feature Profiling for Evolving Systems","authors":"E. Juergens, Martin Feilkas, Markus Herrmannsdoerfer, F. Deissenboeck, Rudolf Vaas, Karl-Heinz Prommer","doi":"10.1109/ICPC.2011.12","DOIUrl":"https://doi.org/10.1109/ICPC.2011.12","url":null,"abstract":"Traditionally, most work in program comprehension focuses on understanding the inner workings of software systems. However, for many software maintenance tasks, not only a sound understanding of a system's implementation but also comprehensive and accurate information about the way users actually use a system's features is of crucial importance. Such information e.g. helps to determine the impact that a specific change has on the users of a system. In practice, however, this information is often not available. We propose an approach called feature profiling as a means to efficiently gather usage information to support maintenance tasks that affect the user interface of a software system. Furthermore, we present tool support for feature profiling and report on a case study in the insurance domain. In this study, we profiled the features of an application that is used by 150 users in 10 countries over a period of five months.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124651348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reasoning about Faults in Aspect-Oriented Programs: A Metrics-Based Evaluation","authors":"Rachel Burrows, François Taïani, Alessandro F. Garcia, F. Ferrari","doi":"10.1109/ICPC.2011.30","DOIUrl":"https://doi.org/10.1109/ICPC.2011.30","url":null,"abstract":"Aspect-oriented programming (AOP) aims at facilitating program comprehension and maintenance in the presence of crosscutting concerns. Aspect code is often introduced and extended as the software projects evolve. Unfortunately, we still lack a good understanding of how faults are introduced in evolving aspect-oriented programs. More importantly, there is little knowledge whether existing metrics are related to typical fault introduction processes in evolving aspect-oriented code. This paper presents an exploratory study focused on the analysis of how faults are introduced during maintenance tasks involving aspects. The results indicate a recurring set of fault patterns in this context, which can better inform the design of future metrics for AOP. We also pinpoint AOP-specific fault categories which are difficult to detect with popular metrics for fault-proneness, such as coupling and code churn.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127200296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Capturing Expert Knowledge for Automated Configuration Fault Diagnosis","authors":"Mengliao Wang, Xiaoyu Shi, Kenny Wong","doi":"10.1109/ICPC.2011.24","DOIUrl":"https://doi.org/10.1109/ICPC.2011.24","url":null,"abstract":"The process of manually diagnosing a software misconfiguration problem is time consuming. Manually writing and updating rules to detect future problems is still the state of the practice. Consequently, there is a need for increased automation. In this paper, we propose a three-phase framework using machine learning techniques for automated configuration faults diagnosis. This system can also help in capturing expert knowledge of configuration troubleshooting. Our experiments on Apache web server configurations are generally encouraging and non-experts can use this system to diagnose misconfigurations effectively.","PeriodicalId":345601,"journal":{"name":"2011 IEEE 19th International Conference on Program Comprehension","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129229542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}