{"title":"Exploring API Embedding for API Usages and Applications","authors":"Trong Duc Nguyen, A. Nguyen, H. Phan, T. Nguyen","doi":"10.1109/ICSE.2017.47","DOIUrl":"https://doi.org/10.1109/ICSE.2017.47","url":null,"abstract":"Word2Vec is a class of neural network models that as being trainedfrom a large corpus of texts, they can produce for each unique word acorresponding vector in a continuous space in which linguisticcontexts of words can be observed. In this work, we study thecharacteristics of Word2Vec vectors, called API2VEC or API embeddings, for the API elements within the API sequences in source code. Ourempirical study shows that the close proximity of the API2VEC vectorsfor API elements reflects the similar usage contexts containing thesurrounding APIs of those API elements. Moreover, API2VEC can captureseveral similar semantic relations between API elements in API usagesvia vector offsets. We demonstrate the usefulness of API2VEC vectorsfor API elements in three applications. First, we build a tool thatmines the pairs of API elements that share the same usage relationsamong them. The other applications are in the code migrationdomain. We develop API2API, a tool to automatically learn the APImappings between Java and C# using a characteristic of the API2VECvectors for API elements in the two languages: semantic relationsamong API elements in their usages are observed in the two vectorspaces for the two languages as similar geometric arrangements amongtheir API2VEC vectors. Our empirical evaluation shows that API2APIrelatively improves 22.6% and 40.1% top-1 and top-5 accuracy over astate-of-the-art mining approach for API mappings. Finally, as anotherapplication in code migration, we are able to migrate equivalent APIusages from Java to C# with up to 90.6% recall and 87.2% precision.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"36 1","pages":"438-449"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79224728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Symbolic Model Extraction for Web Application Verification","authors":"Ivan Bocic, T. Bultan","doi":"10.1109/ICSE.2017.72","DOIUrl":"https://doi.org/10.1109/ICSE.2017.72","url":null,"abstract":"Modern web applications use complex data models and access control rules which lead to data integrity and access control errors. One approach to find such errors is to use formal verification techniques. However, as a first step, most formal verification techniques require extraction of a formal model which is a difficult problem in itself due to dynamic features of modern languages, and it is typically done either manually, or using ad hoc techniques. In this paper, we present a technique called symbolic model extraction for extracting formal data models from web applications. The key ideas of symbolic model extraction are 1) to use the source language interpreter for model extraction, which enables us to handle dynamic features of the language, 2) to use code instrumentation so that execution of each instrumented piece of code returns the formal model that corresponds to that piece of code, 3) to instrument the code dynamically so that the models of methods that are created at runtime can also be extracted, and 4) to execute both sides of branches during instrumented execution so that all program behaviors can be covered in a single instrumented execution. We implemented the symbolic model extraction technique for the Rails framework and used it to extract data and access control models from web applications. Our experiments demonstrate that symbolic model extraction is scalable and extracts formal models that are precise enough to find bugs in real-world applications without reporting too many false positives.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"2 1","pages":"724-734"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73718881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Syntactic and Semantic Differencing for Combinatorial Models of Test Designs","authors":"Rachel Tzoref, S. Maoz","doi":"10.1109/ICSE.2017.63","DOIUrl":"https://doi.org/10.1109/ICSE.2017.63","url":null,"abstract":"Combinatorial test design (CTD) is an effective test design technique, considered to be a testing best practice. CTD provides automatic test plan generation, but it requires a manual definition of the test space in the form of a combinatorial model. As the system under test evolves, e.g., due to iterative development processes and bug fixing, so does the test space, and thus, in the context of CTD, evolution translates into frequent manual model definition updates. Manually reasoning about the differences between versions of real-world models following such updates is infeasible due to their complexity and size. Moreover, representing the differences is challenging. In this work, we propose a first syntactic and semantic differencing technique for combinatorial models of test designs. We define a concise and canonical representation for differences between two models, and suggest a scalable algorithm for automatically computing and presenting it. We use our differencing technique to analyze the evolution of 42 real-world industrial models, demonstrating its applicability and scalability. Further, a user study with 16 CTD practitioners shows that comprehension of differences between real-world combinatorial model versions is challenging and that our differencing tool significantly improves the performance of less experienced practitioners. The analysis and user study provide evidence for the potential usefulness of our differencing approach. Our work advances the state-of-the-art in CTD with better capabilities for change comprehension and management.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"31 1","pages":"621-631"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74053260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Making Malory Behave Maliciously: Targeted Fuzzing of Android Execution Environments","authors":"Siegfried Rasthofer, Steven Arzt, S. Triller, Michael Pradel","doi":"10.1109/ICSE.2017.35","DOIUrl":"https://doi.org/10.1109/ICSE.2017.35","url":null,"abstract":"Android applications, or apps, provide useful features to end-users, but many apps also contain malicious behavior. Modern malware makes understanding such behavior challenging by behaving maliciously only under particular conditions. For example, a malware app may check whether it runs on a real device and not an emulator, in a particular country, and alongside a specific target app, such as a vulnerable banking app. To observe the malicious behavior, a security analyst must find out and emulate all these app-specific constraints. This paper presents FuzzDroid, a framework for automatically generating an Android execution environment where an app exposes its malicious behavior. The key idea is to combine an extensible set of static and dynamic analyses through a search-based algorithm that steers the app toward a configurable target location. On recent malware, the approach reaches the target location in 75% of the apps. In total, we reach 240 code locations within an average time of only one minute. To reach these code locations, FuzzDroid generates 106 different environments, too many for a human analyst to create manually.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"17 1","pages":"300-311"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78108435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"To Type or Not to Type: Quantifying Detectable Bugs in JavaScript","authors":"Zheng Gao, C. Bird, Earl T. Barr","doi":"10.1109/ICSE.2017.75","DOIUrl":"https://doi.org/10.1109/ICSE.2017.75","url":null,"abstract":"JavaScript is growing explosively and is now used in large mature projects even outside the web domain. JavaScript is also a dynamically typed language for which static type systems, notably Facebook's Flow and Microsoft's TypeScript, have been written. What benefits do these static type systems provide? Leveraging JavaScript project histories, we select a fixed bug and check out the code just prior to the fix. We manually add type annotations to the buggy code and test whether Flow and TypeScript report an error on the buggy code, thereby possibly prompting a developer to fix the bug before its public release. We then report the proportion of bugs on which these type systems reported an error. Evaluating static type systems against public bugs, which have survived testing and review, is conservative: it understates their effectiveness at detecting bugs during private development, not to mention their other benefits such as facilitating code search/completion and serving as documentation. Despite this uneven playing field, our central finding is that both static type systems find an important percentage of public bugs: both Flow 0.30 and TypeScript 2.0 successfully detect 15%!.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"23 1","pages":"758-769"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78698597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LibD: Scalable and Precise Third-Party Library Detection in Android Markets","authors":"Menghao Li, Wei Wang, Pei Wang, Shuai Wang, Dinghao Wu, Jian Liu, Rui Xue, Wei Huo","doi":"10.1109/ICSE.2017.38","DOIUrl":"https://doi.org/10.1109/ICSE.2017.38","url":null,"abstract":"With the thriving of the mobile app markets, third-party libraries are pervasively integrated in the Android applications. Third-party libraries provide functionality such as advertisements, location services, and social networking services, making multi-functional app development much more productive. However, the spread of vulnerable or harmful third-party libraries may also hurt the entire mobile ecosystem, leading to various security problems. The Android platform suffers severely from such problems due to the way its ecosystem is constructed and maintained. Therefore, third-party Android library identification has emerged as an important problem which is the basis of many security applications such as repackaging detection and malware analysis. According to our investigation, existing work on Android library detection still requires improvement in many aspects, including accuracy and obfuscation resilience. In response to these limitations, we propose a novel approach to identifying third-party Android libraries. Our method utilizes the internal code dependencies of an Android app to detect and classify library candidates. Different from most previous methods which classify detected library candidates based on similarity comparison, our method is based on feature hashing and can better handle code whose package and method names are obfuscated. Based on this approach, we have developed a prototypical tool called LibD and evaluated it with an update-to-date and large-scale dataset. Our experimental results on 1,427,395 apps show that compared to existing tools, LibD can better handle multi-package third-party libraries in the presence of name-based obfuscation, leading to significantly improved precision without the loss of scalability.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"60 1","pages":"335-346"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85993925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Challenges for Static Analysis of Java Reflection - Literature Review and Empirical Study","authors":"D. Landman, Alexander Serebrenik, J. Vinju","doi":"10.1109/ICSE.2017.53","DOIUrl":"https://doi.org/10.1109/ICSE.2017.53","url":null,"abstract":"The behavior of software that uses the Java Reflection API is fundamentally hard to predict by analyzing code. Only recent static analysis approaches can resolve reflection under unsound yet pragmatic assumptions. We survey what approaches exist and what their limitations are. We then analyze how real-world Java code uses the Reflection API, and how many Java projects contain code challenging state-of-the-art static analysis. Using a systematic literature review we collected and categorized all known methods of statically approximating reflective Java code. Next to this we constructed a representative corpus of Java systems and collected descriptive statistics of the usage of the Reflection API. We then applied an analysis on the abstract syntax trees of all source code to count code idioms which go beyond the limitation boundaries of static analysis approaches. The resulting data answers the research questions. The corpus, the tool and the results are openly available. We conclude that the need for unsound assumptions to resolve reflection is widely supported. In our corpus, reflection can not be ignored for 78% of the projects. Common challenges for analysis tools such as non-exceptional exceptions, programmatic filtering meta objects, semantics of collections, and dynamic proxies, widely occur in the corpus. For Java software engineers prioritizing on robustness, we list tactics to obtain more easy to analyze reflection code, and for static analysis tool builders we provide a list of opportunities to have significant impact on real Java code.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"12 1","pages":"507-518"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91527610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Good Is a Security Policy against Real Breaches? A HIPAA Case Study","authors":"Özgür Kafali, Jasmine Jones, Megan Petruso, L. Williams, Munindar P. Singh","doi":"10.1109/ICSE.2017.55","DOIUrl":"https://doi.org/10.1109/ICSE.2017.55","url":null,"abstract":"Policy design is an important part of software development. As security breaches increase in variety, designing a security policy that addresses all potential breaches becomes a nontrivial task. A complete security policy would specify rules to prevent breaches. Systematically determining which, if any, policy clause has been violated by a reported breach is a means for identifying gaps in a policy. Our research goal is to help analysts measure the gaps between security policies and reported breaches by developing a systematic process based on semantic reasoning. We propose SEMAVER, a framework for determining coverage of breaches by policies via comparison of individual policy clauses and breach descriptions. We represent a security policy as a set of norms. Norms (commitments, authorizations, and prohibitions) describe expected behaviors of users, and formalize who is accountable to whom and for what. A breach corresponds to a norm violation. We develop a semantic similarity metric for pairwise comparison between the norm that represents a policy clause and the norm that has been violated by a reported breach. We use the US Health Insurance Portability and Accountability Act (HIPAA) as a case study. Our investigation of a subset of the breaches reported by the US Department of Health and Human Services (HHS) reveals the gaps between HIPAA and reported breaches, leading to a coverage of 65%. Additionally, our classification of the 1,577 HHS breaches shows that 44% of the breaches are accidental misuses and 56% are malicious misuses. We find that HIPAA's gaps regarding accidental misuses are significantly larger than its gaps regarding malicious misuses.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"3 1","pages":"530-540"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87276253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recommending and Localizing Change Requests for Mobile Apps Based on User Reviews","authors":"Fabio Palomba, P. Salza, Adelina Ciurumelea, Sebastiano Panichella, H. Gall, F. Ferrucci, A. D. Lucia","doi":"10.1109/ICSE.2017.18","DOIUrl":"https://doi.org/10.1109/ICSE.2017.18","url":null,"abstract":"Researchers have proposed several approaches to extract information from user reviews useful for maintaining and evolving mobile apps. However, most of them just perform automatic classification of user reviews according to specific keywords (e.g., bugs, features). Moreover, they do not provide any support for linking user feedback to the source code components to be changed, thus requiring a manual, time-consuming, and error-prone task. In this paper, we introduce ChangeAdvisor, a novel approach that analyzes the structure, semantics, and sentiments of sentences contained in user reviews to extract useful (user) feedback from maintenance perspectives and recommend to developers changes to software artifacts. It relies on natural language processing and clustering algorithms to group user reviews around similar user needs and suggestions for change. Then, it involves textual based heuristics to determine the code artifacts that need to be maintained according to the recommended software changes. The quantitative and qualitative studies carried out on 44,683 user reviews of 10 open source mobile apps and their original developers showed a high accuracy of ChangeAdvisor in (i) clustering similar user change requests and (ii) identifying the code components impacted by the suggested changes. Moreover, the obtained results show that ChangeAdvisor is more accurate than a baseline approach for linking user feedback clusters to the source code in terms of both precision (+47%) and recall (+38%).","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"76 1","pages":"106-117"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86501667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning to Prioritize Test Programs for Compiler Testing","authors":"Junjie Chen, Y. Bai, Dan Hao, Yingfei Xiong, Hongyu Zhang, Bing Xie","doi":"10.1109/ICSE.2017.70","DOIUrl":"https://doi.org/10.1109/ICSE.2017.70","url":null,"abstract":"Compiler testing is a crucial way of guaranteeing the reliability of compilers (and software systems in general). Many techniques have been proposed to facilitate automated compiler testing. These techniques rely on a large number of test programs (which are test inputs of compilers) generated by some test-generation tools (e.g., CSmith). However, these compiler testing techniques have serious efficiency problems as they usually take a long period of time to find compiler bugs. To accelerate compiler testing, it is desirable to prioritize the generated test programs so that the test programs that are more likely to trigger compiler bugs are executed earlier. In this paper, we propose the idea of learning to test, which learns the characteristics of bug-revealing test programs from previous test programs that triggered bugs. Based on the idea of learning to test, we propose LET, an approach to prioritizing test programs for compiler testing acceleration. LET consists of a learning process and a scheduling process. In the learning process, LET identifies a set of features of test programs, trains a capability model to predict the probability of a new test program for triggering compiler bugs and a time model to predict the execution time of a test program. In the scheduling process, LET prioritizes new test programs according to their bug-revealing probabilities in unit time, which is calculated based on the two trained models. Our extensive experiments show that LET significantly accelerates compiler testing. In particular, LET reduces more than 50% of the testing time in 24.64% of the cases, and reduces between 25% and 50% of the testing time in 36.23% of the cases.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"1 1","pages":"700-711"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83182909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}