{"title":"A formal evaluation of DepDegree based on Weyuker's properties","authors":"Dirk Beyer, Peter Häring","doi":"10.1145/2597008.2597794","DOIUrl":"https://doi.org/10.1145/2597008.2597794","url":null,"abstract":"Complexity of source code is an important characteristic that software engineers aim to quantify using static software measurement. Several measures used in practice as indicators for software complexity have theoretical flaws. In order to assess the quality of a software measure, Weyuker established a set of properties that an indicator for program-code complexity should satisfy. It is known that several well-established complexity indicators do not fulfill Weyuker's properties. As an 'early achievement' in a larger project on evaluating software measures, we show that DepDegree, a measure for data-flow dependencies, satisfies all of Weyuker's properties.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"354 1","pages":"258-261"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76485094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hey! are you committing tangled changes?","authors":"Hiroyuki Kirinuki, Yoshiki Higo, Keisuke Hotta, S. Kusumoto","doi":"10.1145/2597008.2597798","DOIUrl":"https://doi.org/10.1145/2597008.2597798","url":null,"abstract":"Although there is a principle that states a commit should only include changes for a single task, it is not always respected by developers. This means that code repositories often include commits that contain tangled changes. The presence of such tangled changes hinders analyzing code repositories because most mining software repository (MSR) approaches are designed with the assumption that every commit includes only changes for a single task. In this paper, we propose a technique to inform developers that they are in the process of committing tangled changes. The proposed technique utilizes the changes included in the past commits to judge whether a given commit includes tangled changes. If it determines that the proposed commit may include tangled changes, it offers suggestions on how the tangled changes can be split into a set of untangled changes.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"61 1","pages":"262-265"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89356216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A semiautomated method for classifying program analysis rules into a quality model","authors":"Shrinath Gupta, Himanshu K. Singh","doi":"10.1145/2597008.2597808","DOIUrl":"https://doi.org/10.1145/2597008.2597808","url":null,"abstract":"Most software code quality assessment and monitoring methods use a Quality Model (QM) as an aid to capture quality requirements of the software. An important aspect of using a QM is the classification of Program Analysis (PA) rules into the QM according to their relevance to quality attributes such as maintainability and reliability. Currently such classification is performed manually by experts, and most PA tools (such as FxCop for C#, FindBugs for Java, PC-Lint for C/C++) support hundreds of PA rules. Hence performing classification manually can be very effort intensive and time consuming and can lead to concerns like subjectivity and inconsistency. We therefore propose a lightweight semiautomated method to expedite classification and make the classification activity less effort intensive. The proposed classifier is based on natural language processing (NLP) techniques and uses a keyword matching algorithm. We have computed precision and recall for such a classifier. We have also shown results from applying the technique to classify rules from FxCop, PC-Lint, and FindBugs into the EMISQ QM. We believe that the proposed approach will significantly help in reducing the time required to perform classification and hence also to incorporate newer PA tools and rules into QM based methods.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"14 1","pages":"266-270"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87624210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Redacting sensitive information in software artifacts","authors":"M. Grechanik, Collin McMillan, Tathagata Dasgupta, D. Poshyvanyk, Malcom Gethers","doi":"10.1145/2597008.2597138","DOIUrl":"https://doi.org/10.1145/2597008.2597138","url":null,"abstract":"In the past decade, there have been many well-publicized cases of source code leaking from different well-known companies. These leaks pose a serious problem when the source code contains sensitive information encoded in its identifier names and comments. Unfortunately, redacting the sensitive information requires obfuscating the identifiers, which will quickly interfere with program comprehension. Program comprehension is key for programmers in understanding the source code, so sensitive information is often left unredacted. To address this problem, we offer a novel approach for REdacting Sensitive Information in Software arTifacts (RESIST). RESIST finds and replaces sensitive words in software artifacts in such a way to reduce the impact on program comprehension. We evaluated RESIST experimentally using 57 professional programmers from over a dozen different organizations. Our evaluation shows that RESIST effectively redacts software artifacts, thereby making it difficult for participants to infer sensitive information, while maintaining a desired level of comprehension.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"88 2 1","pages":"314-325"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86509618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving topic model source code summarization","authors":"P. McBurney, Cheng Liu, Collin McMillan, Tim Weninger","doi":"10.1145/2597008.2597793","DOIUrl":"https://doi.org/10.1145/2597008.2597793","url":null,"abstract":"In this paper, we present an emerging source code summarization technique that uses topic modeling to select keywords and topics as summaries for source code. Our approach organizes the topics in source code into a hierarchy, with more general topics near the top of the hierarchy. In this way, we present the software's highest-level functionality first, before lower-level details. This is an advantage over previous approaches based on topic models, which only present groups of related keywords without a hierarchy. We conducted a preliminary user study that found our approach selects keywords and topics that the participants found to be accurate in a majority of cases.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"26 29","pages":"291-294"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91534710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Version history, similar report, and structure: putting them together for improved bug localization","authors":"Shaowei Wang, D. Lo","doi":"10.1145/2597008.2597148","DOIUrl":"https://doi.org/10.1145/2597008.2597148","url":null,"abstract":"During the evolution of a software system, a large number of bug reports are submitted. Locating the source code files that need to be fixed to resolve the bugs is a challenging problem. Thus, there is a need for a technique that can automatically figure out these buggy files. A number of bug localization solutions that take in a bug report and output a ranked list of files sorted based on their likelihood to be buggy have been proposed in the literature. However, the accuracy of these tools still needs to be improved. In this paper, to address this need, we propose AmaLgam, a new method for locating relevant buggy files that puts together version history, similar reports, and structure. To do this, AmaLgam integrates a bug prediction technique used in Google which analyzes version history, with a bug localization technique named BugLocator which analyzes similar reports from a bug report system, and the state-of-the-art bug localization technique BLUiR which considers structure. We perform a large-scale experiment on four open source projects, namely AspectJ, Eclipse, SWT and ZXing to localize more than 3,000 bugs. Compared with a history-aware bug localization solution of Sisman and Kak, our approach achieves a 46.1% improvement in terms of mean average precision (MAP). Compared with BugLocator, our approach achieves a 24.4% improvement in terms of MAP. Compared with BLUiR, our approach achieves a 16.4% improvement in terms of MAP.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"19 1","pages":"53-63"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76038208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Plagiarism detection for multithreaded software based on thread-aware software birthmarks","authors":"Zhenzhou Tian, Q. Zheng, Ting Liu, Ming Fan, Xiaodong Zhang, Z. Yang","doi":"10.1145/2597008.2597143","DOIUrl":"https://doi.org/10.1145/2597008.2597143","url":null,"abstract":"The availability of inexpensive multicore hardware presents a turning point in software development. In order to benefit from the continued exponential throughput advances in new processors, software applications must be multithreaded programs. As multithreaded programs become increasingly popular, plagiarism of multithreaded programs starts to plague the software industry. Although there has been tremendous progress on software plagiarism detection technology, existing dynamic approaches remain optimized for sequential programs and cannot be applied to multithreaded programs without significant redesign. This paper fills the gap by presenting two dynamic birthmark-based approaches. The first approach extracts key instructions while the second approach extracts system calls. Both approaches consider the effect of thread scheduling on computing software birthmarks. We have implemented a prototype based on the Pin instrumentation framework. Our empirical study shows that the proposed approaches can effectively detect plagiarism of multithreaded programs and exhibit strong resilience to various semantic-preserving code obfuscations.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"124 1","pages":"304-313"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76682244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A code obfuscation framework using code clones","authors":"Aniket Kulkarni, Ravindra Metta","doi":"10.1145/2597008.2597807","DOIUrl":"https://doi.org/10.1145/2597008.2597807","url":null,"abstract":"The IT industry loses tens of billions of dollars annually from security attacks such as malicious reverse engineering. To protect sensitive parts of software from such attacks, we designed a code obfuscation scheme based on nontrivial code clones. While implementing this scheme, we realized that currently there is no framework to assist implementation of such advanced obfuscation techniques. Therefore, we have developed a framework to support code obfuscation using code clones. We successfully implemented our obfuscation technique using this framework in Java. In this paper, we present our framework and illustrate it with an example.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"426 1","pages":"295-299"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77344443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Amalgamating source code authors, maintainers, and change proneness to triage change requests","authors":"Kamal Hossen, H. Kagdi, D. Poshyvanyk","doi":"10.1145/2597008.2597147","DOIUrl":"https://doi.org/10.1145/2597008.2597147","url":null,"abstract":"The paper presents an approach, namely iMacPro, to recommend developers who are most likely to implement incoming change requests. iMacPro amalgamates the textual similarity between the given change request and source code, change proneness information, authors, and maintainers of a software system. Latent Semantic Indexing (LSI) and a lightweight analysis of source code, and its commits from the software repository, are used. The basic premise of iMacPro is that the authors and maintainers of the relevant source code, which is change prone, to a given change request are most likely to best assist with its resolution. iMacPro unifies these sources in a unique way to perform its task, which was not investigated and reported in the literature previously. An empirical study on three open source systems, ArgoUML, JabRef, and jEdit, was conducted to assess the effectiveness of iMacPro. A number of change requests from these systems were used in the evaluated benchmark. Recall values for top one, five, and ten recommended developers are reported. Furthermore, a comparative study with a previous approach that uses the source-code authorship information for developer recommendation was performed. Results show that iMacPro could provide recall gains from 30% to 180% over its subjected competitor with statistical significance.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"14 1","pages":"130-141"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85188871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Innovating in India: designing for constraint, computing for inclusion (keynote)","authors":"Edward Cutrell","doi":"10.1145/2597008.2602160","DOIUrl":"https://doi.org/10.1145/2597008.2602160","url":null,"abstract":"A fundamental tenet of user-centered design is that the needs, wants, limitations, and contexts of end users are central to the process of creating products and services that can be used and understood by the people who will use them. Most of the time these end users aren’t all that different from the people designing the technology. But as the differences increase between designers and the people they’re designing for, understanding and empathizing with users becomes harder and even more important. As we build software for people and communities with vastly diverse backgrounds, cultures, languages, and education, we need to stretch our ideas of what users want and need and how best to serve them. The Technology for Emerging Markets (TEM) group at Microsoft Research India seeks to address the needs and aspirations of people in the developing world who are just beginning to use computing technologies and services as well as those for whom access to computing still remains largely out of reach. Much of this work can be described as designing for constraint: constraints in education, in infrastructure, in financial resources, in languages and in many other areas. In this talk, I will describe some work from our group that explores how we have tried to manage these constraints to create software and systems for people and communities often overlooked by technologists.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"28 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78109475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}