{"title":"Towards more accurate content categorization of API discussions","authors":"Bo Zhou, Xin Xia, D. Lo, Cong Tian, Xinyu Wang","doi":"10.1145/2597008.2597142","DOIUrl":"https://doi.org/10.1145/2597008.2597142","url":null,"abstract":"Nowadays, software developers often discuss the usage of various APIs in online forums. Automatically assigning pre-defined semantic categories to API discussions in these forums could help manage the data in online forums and help developers search for useful information. We refer to this process as content categorization of API discussions. To solve this problem, Hou and Mo proposed the usage of naive Bayes multinomial, which is an effective classification algorithm. In this paper, we propose a Cache-bAsed compoSitE algorithm, abbreviated as CASE, to automatically categorize API discussions. Considering that the content of an API discussion contains both textual description and source code, CASE has 3 components that analyze an API discussion in 3 different ways: text, code, and original. In the text component, CASE only considers the textual description; in the code component, CASE only considers the source code; in the original component, CASE considers the original content of an API discussion which might include textual description and source code. Next, for each component, since different terms (i.e., words) have different affinities to different categories, CASE caches a subset of terms which have the highest affinity scores to each category, and builds a classifier based on the cached terms. Finally, CASE combines all the 3 classifiers to achieve a better accuracy score. We evaluate the performance of CASE on 3 datasets which contain a total of 1,035 API discussions. The experimental results show that CASE achieves accuracy scores of 0.69, 0.77, and 0.96 for the 3 datasets respectively, which outperforms the state-of-the-art method proposed by Hou and Mo by 11%, 10%, and 2%, respectively.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"92 4 1","pages":"95-105"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89451295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What is the foundation of evidence of human factors decisions in language design? an empirical study on programming language workshops","authors":"A. Stefik, Stefan Hanenberg, Mark McKenney, A. Andrews, Srinivas Kalyan Yellanki, Susanna Siebert","doi":"10.1145/2597008.2597154","DOIUrl":"https://doi.org/10.1145/2597008.2597154","url":null,"abstract":"In recent years, the programming language design community has engaged in rigorous debate on the role of empirical evidence in the design of general purpose programming languages. Some scholars contend that the language community has failed to embrace a form of evidence that is non-controversial in other disciplines (e.g., medicine, biology, psychology, sociology, physics, chemistry), while others argue that a science of language design is unrealistic. While the discussion will likely persist for some time, we begin here a systematic evaluation of the use of empirical evidence with human users, documenting, paper-by-paper, the evidence provided for human factors decisions, beginning with 359 papers from the workshops PPIG, Plateau, and ESP. This preliminary work provides the following contributions: an analysis of the 1) overall quantity and quality of empirical evidence used in the workshops, and of the 2) overall significant challenges to reliably coding academic papers. We hope that, once complete, this long-term research project will serve as a practical catalog designers can use when evaluating the impact of a language feature on human users.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"3 1","pages":"223-231"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76262294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prioritizing maintainability defects based on refactoring recommendations","authors":"Daniela Steidl, S. Eder","doi":"10.1145/2597008.2597805","DOIUrl":"https://doi.org/10.1145/2597008.2597805","url":null,"abstract":"As a measure of software quality, current static code analyses reveal thousands of quality defects on systems in brown-field development in practice. Currently, there exists no way to prioritize among a large number of quality defects and developers lack a structured approach to address the load of refactoring. Consequently, although static analyses are often used, they do not lead to actual quality improvement. Our approach recommends removing quality defects that are easy to refactor, for example code clones and long methods, and thus gives developers a first starting point for quality improvement. With an empirical industrial Java case study, we evaluate the usefulness of the recommendation based on developers’ feedback. We further quantify which external factors influence the process of quality defect removal in industry software development.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"75 1","pages":"168-176"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79616953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining unit tests for code recommendation","authors":"Mohammad Ghafari, C. Ghezzi, Andrea Mocci, Giordano Tamburrelli","doi":"10.1145/2597008.2597789","DOIUrl":"https://doi.org/10.1145/2597008.2597789","url":null,"abstract":"Developers spend a significant portion of their time understanding and learning the correct usage of the APIs of libraries they want to integrate in their projects. However, learning how to effectively use APIs is complex and time consuming. Code recommendation systems play a crucial role in supporting developers in this task by providing them with relevant examples while they code. This paper proposes a novel approach to code recommendation in which code examples are automatically obtained by mining and manipulating unit tests. In this paper we discuss the theoretical and practical implications that underpin this idea. The discussion leads to a series of fascinating research challenges that we organized in a research agenda.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"25 1","pages":"142-145"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89265504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comprehension support during knowledge transitions: learning from field","authors":"Vikrant S. Kaulgud, K. Annervaz, Janardan Misra, Gary Titus","doi":"10.1145/2597008.2597804","DOIUrl":"https://doi.org/10.1145/2597008.2597804","url":null,"abstract":"Knowledge Transition (KT) of legacy applications is a critical activity, often determining the quality of maintenance in the early stages of a maintenance life-cycle. We developed an integrated reverse engineering tool-suite that bootstraps the KT process by giving knowledge recipients insights into application structure, quality and functionality. The tool-suite is based on an in-depth study with KT practitioners and a comparative study of existing tools. We evaluated the benefits of the tool-suite during KT in real-life projects. In this talk, we report what we learned from the study and evaluation phases.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"2 1","pages":"205-206"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89743573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CODES: mining source code descriptions from developers discussions","authors":"Carmine Vassallo, Sebastiano Panichella, M. D. Penta, G. Canfora","doi":"10.1145/2597008.2597799","DOIUrl":"https://doi.org/10.1145/2597008.2597799","url":null,"abstract":"Program comprehension is a crucial activity, preliminary to any software maintenance task. Such an activity can be difficult when the source code is not adequately documented, or the documentation is outdated. Unlike the many existing software re-documentation approaches, which are based on different kinds of code analysis, this paper describes CODES (mining sourCe cOde Descriptions from developErs diScussions), a tool which applies a \"social\" approach to software re-documentation. Specifically, CODES extracts candidate method documentation from StackOverflow discussions, and creates Javadoc descriptions from it. We evaluated CODES to mine Lucene and Hibernate method descriptions. The results indicate that CODES is able to extract descriptions for 20% and 28% of the Lucene and Hibernate methods with a precision of 84% and 91% respectively.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"38 1","pages":"106-109"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84227852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying and locating interference issues in PHP applications: the case of WordPress","authors":"L. Eshkevari, G. Antoniol, J. Cordy, M. D. Penta","doi":"10.1145/2597008.2597153","DOIUrl":"https://doi.org/10.1145/2597008.2597153","url":null,"abstract":"The large success of Content Management Systems (CMS) such as WordPress is largely due to the rich ecosystem of themes and plugins developed around the CMS that allows users to easily build and customize complex Web applications featuring photo galleries, contact forms, and blog pages. However, the design of the CMS, the plugin-based architecture, and the implicit characteristics of the programming language used to develop them (often PHP), can cause interference or unwanted side effects between the resources declared and used by different plugins. This paper describes the problem of interference between plugins in CMS, specifically those developed using PHP, and outlines an approach combining static and dynamic analysis to detect and locate such interference. Results of a case study conducted over 10 WordPress plugins show that the analysis can help to identify and locate plugin interference, and thus be used to enhance CMS quality assurance.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"141 1","pages":"157-167"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86387277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An information visualization feature model for supporting the selection of software visualizations","authors":"Renan Vasconcelos, Marcelo Schots, C. Werner","doi":"10.1145/2597008.2597796","DOIUrl":"https://doi.org/10.1145/2597008.2597796","url":null,"abstract":"Software development comprises the execution of a variety of tasks, such as bug discovery, finding reusable assets, dependency analysis etc. A better understanding of the task at hand and its surroundings can improve the development performance in general. Software visualizations can support such understanding by addressing different issues according to the necessity of stakeholders. However, knowing which visualizations better fit a given task in progress is not a trivial skill. In this sense, a feature model, intended for organizing the knowledge of a given domain and allowing the reuse of components, can support the identification, categorization and selection of information visualization elements. This work presents an ongoing domain analysis performed for building an information visualization feature model, whose goal is to support the process of choosing and building proper, suitable software visualizations.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"30 1","pages":"122-125"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86767408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding the database manipulation behavior of programs","authors":"Nesrine Noughi, Marco Mori, L. Meurice, Anthony Cleve","doi":"10.1145/2597008.2597790","DOIUrl":"https://doi.org/10.1145/2597008.2597790","url":null,"abstract":"Due to the lack of (up-to-date) documentation, software maintenance and evolution processes often necessitate the recovery of a sufficient understanding of the software system, before the latter can be adapted to new or changing requirements. To address this problem, several program comprehension techniques have been proposed to support this preliminary phase of software maintenance and evolution. Nevertheless, those techniques generally fail in gaining a complete and accurate understanding in the case of modern data-intensive systems, which are characterized by complex, dynamic and continuous interactions between the application programs and their database. In particular, understanding the database manipulation behavior of a given program involves different levels of comprehension ranging from identifying to relating and interpreting the successive database access operations. In this paper, we present our early research achievements in the development of a tool-supported framework aiming to extract and understand the database manipulation behavior of data-intensive programs.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"32 1","pages":"64-67"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90241851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The MoJo family: a story about clustering evaluation (invited talk)","authors":"Zhihua Wen, Vassilios Tzerpos","doi":"10.1145/2597008.2602159","DOIUrl":"https://doi.org/10.1145/2597008.2602159","url":null,"abstract":"The need to decompose large, complex software systems into smaller, more manageable subsystems has been recognized for more than two decades. Many cluster analysis algorithms have been applied to the software domain, and several algorithms specializing in software clustering have been developed. This in turn has created the need to evaluate and compare clustering results. This talk will present some background on the software clustering problem and its challenges, as well as the software clustering evaluation and its challenges. It will then discuss the MoJo family of measures with an emphasis on MoJoFM (originally presented at IWPC 2004).","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"29 1","pages":"2"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78898783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}