{"title":"Socio-technical evolution of the Ruby ecosystem in GitHub","authors":"Eleni Constantinou, T. Mens","doi":"10.1109/SANER.2017.7884607","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884607","url":null,"abstract":"The evolution dynamics of a software ecosystem depend on the activity of the developer community contributing to projects within it. Both social and technical changes affect an ecosystem's evolution and the research community has been investigating the impact of these modifications over the last few years. Existing studies mainly focus on temporary modifications, often ignoring the effect of permanent changes on the software ecosystem. We present an empirical study of the magnitude and effect of permanent modifications in both the social and technical parts of a software ecosystem. More precisely, we measure permanent changes with regard to the ecosystem's projects, contributors and source code files and present our findings concerning the effect of these modifications. We study the Ruby ecosystem in GitHub over a nine-year period by carrying out a socio-technical analysis of the co-evolution of a large number of base projects and their forks. This analysis involves both the source code developed for these projects as well as the developers having contributed to them. We discuss our findings with respect to the ecosystem evolution according to three different viewpoints: (1) the base projects, (2) the forks and (3) the entire ecosystem containing both the base projects and forks. Our findings show an increased growth in both the technical and social aspects of the Ruby ecosystem until early 2014, followed by an increased contributor and project abandonment rate. We show the effect of permanent modifications in the ecosystem evolution and provide preliminary evidence of contributors migrating to other ecosystems when leaving the Ruby ecosystem.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"31 1","pages":"34-44"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80015324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"c-JRefRec: Change-based identification of Move Method refactoring opportunities","authors":"Naoya Ujihara, Ali Ouni, T. Ishio, Katsuro Inoue","doi":"10.1109/SANER.2017.7884658","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884658","url":null,"abstract":"We propose, in this paper, a lightweight refactoring recommendation tool, namely c-JRefRec, to identify Move Method refactoring opportunities based on four heuristics using static and semantic program analysis. Our tool aims at identifying refactoring opportunities before a code change is committed to the codebase based on current code changes whenever the developer saves/compiles his code. We evaluate the efficiency of our approach in detecting Feature Envy smells and recommending Move Method refactorings to fix them on three Java open-source systems and 30 code changes. Results show that our approach achieves an average precision of 0.48 and recall of 0.73 and outperforms a state-of-the-art approach namely JDeodorant.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"27 1","pages":"482-486"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79889260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What information about code snippets is available in different software-related documents? An exploratory study","authors":"Preetha Chatterjee, Manziba Akanda Nishi, Kostadin Damevski, Vinay Augustine, L. Pollock, Nicholas A. Kraft","doi":"10.1109/SANER.2017.7884638","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884638","url":null,"abstract":"A large corpus of software-related documents is available on the Web, and these documents offer the unique opportunity to learn from what developers are saying or asking about the code snippets that they are discussing. For example, the natural language in a bug report provides information about what is not functioning properly in a particular code snippet. Previous research has mined information about code snippets from bug reports, emails, and Q&A forums. This paper describes an exploratory study into the kinds of information that are embedded in different software-related documents. The goal of the study is to gain insight into the potential value and difficulty of mining the natural language text associated with the code snippets found in a variety of software-related documents, including blog posts, API documentation, code reviews, and public chats.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"87 1","pages":"382-386"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86052901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Log generation for coding behavior analysis: For focusing on how kids are coding not what they are coding","authors":"Ra-Jeong Moon, Kyu-Min Shim, Hae Young Lee, Hyung-Jong Kim","doi":"10.1109/SANER.2017.7884684","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884684","url":null,"abstract":"Block programming lowers the barrier for programming learners and is used in many software education programs. Based on our observations, we realized that learners differ in how they learn and in how long they take to reach goals, even under the same instructors. To understand the cause of these differences, we propose a logging function that captures the coding behavior of programmers. In this work, we have developed a library for generating logs of developers' behavior during block programming and defined the common items required when creating block logs. In addition, we present the coding characteristics that can be derived from the log, the information available for deriving them, and detailed criteria for deriving each characteristic. The contribution of this work is the development of a framework that generates logs of the block programming process. This work will contribute to understanding programming learners' behaviors and enable instructors to design learning courses properly.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"52 1","pages":"575-576"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84661948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-adaptive systems framework based on agent and search-based optimization","authors":"Liu He, Qingshan Li, Lu Wang, Jiewen Wan","doi":"10.1109/SANER.2017.7884675","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884675","url":null,"abstract":"Future-generation SASs need to have the adaptive abilities to efficiently handle changes from different sources and to mitigate conflicts caused by multiple simultaneous changes. However, existing methods cannot simultaneously make Future-generation SASs have the above abilities. This paper proposes an adaptive system framework based on agent technology and search-based software engineering technology (SBSE) for developing future-generation SASs with above-mentioned abilities. The framework integrates a hybrid adaptation logic based on agents to deal with various software changes from different layers, and an adaptation planning method with search-based optimization mechanism to mitigate conflicts caused by multiple simultaneous changes.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"64 1","pages":"557-558"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84015276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data access visualization for legacy application maintenance","authors":"Keisuke Yano, Akihiko Matsuo","doi":"10.1109/SANER.2017.7884671","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884671","url":null,"abstract":"Software clustering techniques have been studied and applied to analyze and visualize the actual structure of legacy applications, which have used program information, e.g., dependencies, as input. However, business data also play an important role in a business system. Revealing which programs actually use data in the current system can give us a key insight when analyzing a long-lived complicated system. In this paper, we calculate indexes indicating how a data entity is used, making use of software clustering, which can be used to detect problematic or characteristic parts of the system. The developed technique can reveal the characteristics of a data entity; i.e., it is used like master data. We applied this technique to two business systems used for many years and found that our technique can help us understand the systems in terms of business data usage. Through case studies, we evaluated the validity of the indexes and showed that software visualization with the indexes can be used to investigate a system in an exploratory way.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"63 1","pages":"546-550"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80320212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance tuning for automotive Software Fault Prediction","authors":"Harald Altinger, S. Herbold, F. Schneemann, J. Grabowski, F. Wotawa","doi":"10.1109/SANER.2017.7884667","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884667","url":null,"abstract":"Fault prediction on high quality industry grade software often suffers from strong imbalanced class distribution due to a low bug rate. Previous work reports on low predictive performance, thus tuning parameters is required. As the State of the Art recommends sampling methods for imbalanced learning, we analyse effects when under- and oversampling the training data evaluated on seven different classification algorithms. Our results demonstrate settings to achieve higher performance values but the various classifiers are influenced in different ways. Furthermore, not all performance reports can be tuned at the same time.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"25 1","pages":"526-530"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81986227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"StiCProb: A novel feature mining approach using conditional probability","authors":"Yutian Tang, H. Leung","doi":"10.1109/SANER.2017.7884608","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884608","url":null,"abstract":"Software Product Line Engineering is a key approach to construct applications with systematical reuse of architecture, documents and other relevant components. To migrate legacy software into a product line system, it is essential to identify the code segments that should be constructed as features from the source base. However, this could be an error-prone and complicated task, as it involves exploring a complex structure and extracting the relations between different components within a system. And normally, representing structural information of a program in a mathematical way should be a promising direction to investigate. We improve this situation by proposing a probability-based approach named StiCProb to capture source code fragments for feature concerned, which inherently provides a conditional probability to describe the closeness between two programming elements. In the case study, we conduct feature mining on several legacy systems, to compare our approach with other related approaches. As demonstrated in our experiment, our approach could support developers to locate features within legacy successfully with a better performance of 83% for precision and 41% for recall.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"1 1","pages":"45-55"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78795551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lexical categories for source code identifiers","authors":"Christian D. Newman, Reem S. Alsuhaibani, M. Collard, Jonathan I. Maletic","doi":"10.1109/SANER.2017.7884624","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884624","url":null,"abstract":"A set of lexical categories, analogous to part-of-speech categories for English prose, is defined for source-code identifiers. The lexical category for an identifier is determined from its declaration in the source code, syntactic meaning in the programming language, and static program analysis. Current techniques for assigning lexical categories to identifiers use natural-language part-of-speech taggers. However, these NLP approaches assign lexical tags based on how terms are used in English prose. The approach taken here differs in that it uses only source code to determine the lexical category. The approach assigns a lexical category to each identifier and stores this information along with each declaration. srcML is used as the infrastructure to implement the approach and so the lexical information is stored directly in the srcML markup as an additional XML element for each identifier. These lexical-category annotations can then be later used by tools that automatically generate such things as code summarization or documentation. The approach is applied to 50 open source projects and the soundness of the defined lexical categories evaluated. The evaluation shows that at every level of minimum support tested, categorization is consistent at least 79% of the time with an overall consistency (across all supports) of at least 88%. The categories reveal a correlation between how an identifier is named and how it is declared. This provides a syntax-oriented view (as opposed to English part-of-speech view) of developer intent of identifiers.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"109 1","pages":"228-239"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79236047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Historical and impact analysis of API breaking changes: A large-scale study","authors":"Laerte Xavier, Aline Brito, André C. Hora, M. T. Valente","doi":"10.1109/SANER.2017.7884616","DOIUrl":"https://doi.org/10.1109/SANER.2017.7884616","url":null,"abstract":"Change is a routine in software development. Like any system, libraries also evolve over time. As a consequence, clients are compelled to update and, thus, benefit from the available API improvements. However, some of these API changes may break contracts previously established, resulting in compilation errors and behavioral changes. In this paper, we study a set of questions regarding API breaking changes. Our goal is to measure the amount of breaking changes on real-world libraries and its impact on clients at a large-scale level. We assess (i) the frequency of breaking changes, (ii) the behavior of these changes over time, (iii) the impact on clients, and (iv) the characteristics of libraries with high frequency of breaking changes. Our large-scale analysis on 317 real-world Java libraries, 9K releases, and 260K client applications shows that (i) 14.78% of the API changes break compatibility with previous versions, (ii) the frequency of breaking changes increases over time, (iii) 2.54% of their clients are impacted, and (iv) systems with higher frequency of breaking changes are larger, more popular, and more active. Based on these results, we provide a set of lessons to better support library and client developers in their maintenance tasks.","PeriodicalId":6541,"journal":{"name":"2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"6 1","pages":"138-147"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83482254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}