Diego Marcilio, R. Bonifácio, Eduardo Monteiro, E. Canedo, W. Luz, G. Pinto
{"title":"Are Static Analysis Violations Really Fixed? A Closer Look at Realistic Usage of SonarQube","authors":"Diego Marcilio, R. Bonifácio, Eduardo Monteiro, E. Canedo, W. Luz, G. Pinto","doi":"10.1109/ICPC.2019.00040","DOIUrl":"https://doi.org/10.1109/ICPC.2019.00040","url":null,"abstract":"The use of automatic static analysis tools (ASATs) has gained increasing attention in the last few years. Even though available research has already explored ASAT issues and how they are fixed, these studies rely on revisions of the software, instead of mining real usage of these tools and real issue reports. In this paper we contribute a comprehensive, multi-method study of the usage of SonarQube (a popular static analysis tool), mining 421,976 issues from 246 projects in four different instances of SonarQube: two hosted in open-source communities (Eclipse and Apache) and two hosted in Brazilian government institutions (Brazilian Court of Account (TCU) and Brazilian Federal Police (PF)). We first surveyed team leaders of the analyzed projects and found that they mostly consider ASAT warning messages relevant for overall software improvement. Second, we found that both Eclipse and TCU employ highly customized instances of SonarQube, with more than one thousand distinct checkers, though just a subset of these checkers actually led to issue reports. Surprisingly, we found a low resolution rate per project in all organizations: on average, 13% of the issues have been solved in the systems. We conjecture that just a subset of the checkers reveals real design and coding flaws, and this might artificially increase the technical debt of the systems. 
Nevertheless, considering all systems, there is a central tendency (median) of fixing issues 18.99 days after they had been reported, faster than the period for fixing bugs as reported in previous studies.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"28 1","pages":"209-219"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89273638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hao Yu, Wing Lam, Long Chen, Ge Li, Tao Xie, Qianxiang Wang
{"title":"Neural Detection of Semantic Code Clones Via Tree-Based Convolution","authors":"Hao Yu, Wing Lam, Long Chen, Ge Li, Tao Xie, Qianxiang Wang","doi":"10.1109/ICPC.2019.00021","DOIUrl":"https://doi.org/10.1109/ICPC.2019.00021","url":null,"abstract":"Code clones are similar code fragments that share the same semantics but may differ syntactically to various degrees. Detecting code clones helps reduce the cost of software maintenance and prevent faults. Various approaches of detecting code clones have been proposed over the last two decades, but few of them can detect semantic clones, i.e., code clones with dissimilar syntax. Recent research has attempted to adopt deep learning for detecting code clones, such as using tree-based LSTM over Abstract Syntax Tree (AST). However, it does not fully leverage the structural information of code fragments, thereby limiting its clone-detection capability. To fully unleash the power of deep learning for detecting code clones, we propose a new approach that uses tree-based convolution to detect semantic clones, by capturing both the structural information of a code fragment from its AST and lexical information from code tokens. Additionally, our approach addresses the limitation that source code has an unlimited vocabulary of tokens and models, and thus exploiting lexical information from code tokens is often ineffective when dealing with unseen tokens. Particularly, we propose a new embedding technique called position-aware character embedding (PACE), which essentially treats any token as a position-weighted combination of character one-hot embeddings. Our experimental results show that our approach substantially outperforms an existing state-of-the-art approach with an increase of 0.42 and 0.15 in F1-score on two popular code-clone benchmarks (OJClone and BigCloneBench), respectively, while being more computationally efficient. 
Our experimental results also show that PACE enables our approach to be substantially more effective when code clones contain unseen tokens.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"10 1","pages":"70-80"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78870866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sarah Fakhoury, Devjeet Roy, Sk Adnan Hassan, V. Arnaoudova
{"title":"Improving Source Code Readability: Theory and Practice","authors":"Sarah Fakhoury, Devjeet Roy, Sk Adnan Hassan, V. Arnaoudova","doi":"10.1109/ICPC.2019.00014","DOIUrl":"https://doi.org/10.1109/ICPC.2019.00014","url":null,"abstract":"There are several widely accepted metrics to measure code quality that are currently being used in both research and practice to detect code smells and to find opportunities for code improvement. Although these metrics have been proposed as a proxy of code quality, recent research suggests that more often than not, state-of-the-art code quality metrics do not successfully capture quality improvements in the source code as perceived by developers. More specifically, results show that there may be inconsistencies between, on the one hand, the results from metrics for cohesion, coupling, complexity, and readability, and, on the other hand, the interpretation of these metrics in practice. As code improvement tools rely on these metrics, there is a clear need to identify and resolve the aforementioned inconsistencies. This will allow for the creation of tools that are more aligned with developers' perception of quality, and can more effectively help source code improvement efforts. In this study, we investigate 548 instances of source code readability improvements, as explicitly stated by internal developers in practice, from 63 engineered software projects. We show that current readability models fail to capture readability improvements. 
We also show that tools to calculate additional metrics, to detect refactorings, and to detect style problems are able to capture characteristics that are specific to readability changes and thus should be considered by future readability models.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"66 1","pages":"2-12"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83852212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Large-Scale Empirical Study on Code-Comment Inconsistencies","authors":"Fengcai Wen, Csaba Nagy, G. Bavota, Michele Lanza","doi":"10.1109/ICPC.2019.00019","DOIUrl":"https://doi.org/10.1109/ICPC.2019.00019","url":null,"abstract":"Code comments are a primary means to document source code. Keeping comments up-to-date during code change activities requires substantial time and attention. For this reason, researchers have proposed methods to detect code-comment inconsistencies (i.e., comments that are not kept in sync with the code they document) and studies have been conducted to investigate this phenomenon. However, these studies were performed at a small scale, relying on quantitative analysis, thus limiting the empirical knowledge about code-comment inconsistencies. We present the largest study to date investigating how code and comments co-evolve. The study has been performed by mining 1.3 billion AST-level changes from the complete history of 1,500 systems. Moreover, we manually analyzed 500 commits to define a taxonomy of code-comment inconsistencies fixed by developers. Our analysis discloses the extent to which different types of code changes (e.g., change of selection statements) trigger updates to the related comments, identifying cases in which code-comment inconsistencies are more likely to be introduced. The defined taxonomy categorizes the types of inconsistencies fixed by developers. 
Our results can guide the development of tools aimed at detecting and fixing code-comment inconsistencies.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"54 1","pages":"53-64"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91171471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Steering Committee","authors":"Icit, Dipak Misra, S. Mohanty, M. Ganapathiraju","doi":"10.1109/icpc.2019.00012","DOIUrl":"https://doi.org/10.1109/icpc.2019.00012","url":null,"abstract":"Dipak Misra (Chair), Xavier University, India Saraju P. Mohanty (Vice-chair), University of North Texas, USA, Durgamadhab Misra, New Jersey Institute of Technology, USA Gautam Das, University of Texas at Arlington, USA Madhavi K. Ganapathiraju, University of Pittsburgh, USA Niranjan Ray, KIIT University, India Siba Prasad Misra, Odisha IT Society, India Sudarsan Padhy, SIT, India Vincent Oria, New Jersey Institute of Technology, USA","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83374728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Norman Peitek, S. Apel, A. Brechmann, Chris Parnin, J. Siegmund
{"title":"CodersMUSE: Multi-Modal Data Exploration of Program-Comprehension Experiments","authors":"Norman Peitek, S. Apel, A. Brechmann, Chris Parnin, J. Siegmund","doi":"10.1109/ICPC.2019.00027","DOIUrl":"https://doi.org/10.1109/ICPC.2019.00027","url":null,"abstract":"Program comprehension is a central cognitive process in programming. It has been the focus of researchers for decades, but is still not thoroughly unraveled. Multi-modal psycho-physiological and neurobiological measurement methods have proved successful in gaining a more holistic understanding of program comprehension. However, there is no proper tool support that lets researchers explore synchronized, conjoint multi-modal data, specifically designed for the needs of program-comprehension research. In this paper, we present CodersMUSE, a prototype implementation that aims to satisfy this crucial need.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"5 1","pages":"126-129"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88761650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Title Page iii","authors":"","doi":"10.1109/icpc.2019.00002","DOIUrl":"https://doi.org/10.1109/icpc.2019.00002","url":null,"abstract":"","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"33 6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76911230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hongliang Liang, Yini Zhang, Yue Yu, Zhuosi Xie, Lin Jiang
{"title":"Sequence Coverage Directed Greybox Fuzzing","authors":"Hongliang Liang, Yini Zhang, Yue Yu, Zhuosi Xie, Lin Jiang","doi":"10.1109/ICPC.2019.00044","DOIUrl":"https://doi.org/10.1109/ICPC.2019.00044","url":null,"abstract":"Existing directed fuzzers are not efficient enough. Directed symbolic-execution-based whitebox fuzzers, e.g. BugRedux, spend lots of time on heavyweight program analysis and constraint solving at runtime. Directed greybox fuzzers, such as AFLGo, perform well at runtime, but considerable calculation during the instrumentation phase hinders the overall performance. In this paper, we propose Sequence-coverage Directed Fuzzing (SCDF), a lightweight directed fuzzing technique which explores towards the user-specified program statements efficiently. Given a set of target statement sequences of a program, SCDF aims to generate inputs that can reach the statements in each sequence in order and trigger bugs in the program. Moreover, we present a novel energy schedule algorithm, which adjusts a seed's energy on demand according to its ability to cover the given statement sequences. We implement the technique in a tool LOLLY in order to achieve efficiency both at instrumentation time and at runtime. 
Experiments on several real-world software projects demonstrate that LOLLY outperforms two well-established tools on efficiency and effectiveness, i.e., AFLGo (a directed greybox fuzzer) and BugRedux (a directed symbolic-execution-based whitebox fuzzer).","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"13 1","pages":"249-259"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74503801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tools Track Program Committee","authors":"","doi":"10.1109/icpc.2019.00010","DOIUrl":"https://doi.org/10.1109/icpc.2019.00010","url":null,"abstract":"","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89481916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Replications Track Program Committee","authors":"","doi":"10.1109/icpc.2019.00008","DOIUrl":"https://doi.org/10.1109/icpc.2019.00008","url":null,"abstract":"","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"95 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75253044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}