{"title":"Search-based test and improvement of machine-learning-based anomaly detection systems","authors":"Maxime Cordy, S. Muller, Mike Papadakis, Yves Le Traon","doi":"10.1145/3293882.3330580","DOIUrl":"https://doi.org/10.1145/3293882.3330580","url":null,"abstract":"Machine-learning-based anomaly detection systems can be vulnerable to new kinds of deceptions, known as training attacks, which exploit the live learning mechanism of these systems by progressively injecting small portions of abnormal data. The injected data seamlessly swift the learned states to a point where harmful data can pass unnoticed. We focus on the systematic testing of these attacks in the context of intrusion detection systems (IDS). We propose a search-based approach to test IDS by making training attacks. Going a step further, we also propose searching for countermeasures, learning from the successful attacks and thereby increasing the resilience of the tested IDS. We evaluate our approach on a denial-of-service attack detection scenario and a dataset recording the network traffic of a real-world system. Our experiments show that our search-based attack scheme generates successful attacks bypassing the current state-of-the-art defences. We also show that our approach is capable of generating attack patterns for all configuration states of the studied IDS and that it is capable of providing appropriate countermeasures. By co-evolving our attack and defence mechanisms we succeeded at improving the defence of the IDS under test by making it resilient to 49 out of 50 independently generated attacks.","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"38 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89792955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the correctness of GPU programs","authors":"Chao Peng","doi":"10.1145/3293882.3338989","DOIUrl":"https://doi.org/10.1145/3293882.3338989","url":null,"abstract":"Testing is an important and challenging part of software development and its effectiveness depends on the quality of test cases. However, there exists no means of measuring quality of tests developed for GPU programs and as a result, no test case generation techniques for GPU programs aiming at high test effectiveness. Existing criteria for sequential and multithreaded CPU programs cannot be directly applied to GPU programs as GPU follows a completely different memory and execution model. We surveyed existing work on GPU program verification and bug fixes of open source GPU programs. Based on our findings, we define barrier, branch and loop coverage criteria and propose a set of mutation operators to measure fault finding capabilities of test cases. CLTestCheck, a framework for measuring quality of tests developed for GPU programs by code coverage analysis, fault seeding and work-group schedule amplification has been developed and evaluated using industry standard benchmarks. Experiments show that the framework is able to automatically measure test effectiveness and reveal unusual behaviours. Our planned work includes data flow coverage adopted for GPU programs to probe the underlying cause of unusual kernel behaviours and a more comprehensive work-group scheduler. We also plan to design and develop an automatic test case generator aiming at generating high quality test suites for GPU programs.","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"71 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76776811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CTRAS: a tool for aggregating and summarizing crowdsourced test reports","authors":"Yuying Li, Rui Hao, Yang Feng, James A. Jones, Xiaofang Zhang, Zhenyu Chen","doi":"10.1145/3293882.3339004","DOIUrl":"https://doi.org/10.1145/3293882.3339004","url":null,"abstract":"In this paper, we present CTRAS, a tool for automatically aggregating and summarizing duplicate crowdsourced test reports on the fly. CTRAS can automatically detect duplicates based on both textual information and the screenshots, and further aggregates and summarizes the duplicate test reports. CTRAS provides end users with a comprehensive and comprehensible understanding of all duplicates by identifying the main topics across the group of aggregated test reports and highlighting supplementary topics that are mentioned in subgroups of test reports. Also, it provides the classic tool of issue tracking systems, such as the project-report dashboard and keyword searching, and automates their classic functionalities, such as bug triaging and best fixer recommendation, to assist end users in managing and diagnosing test reports. Video: https://youtu.be/PNP10gKIPFs","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"43 8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77721027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining Android crash fixes in the absence of issue- and change-tracking systems","authors":"Pingfan Kong, Li Li, Jun Gao, Tegawendé F. Bissyandé, Jacques Klein","doi":"10.1145/3293882.3330572","DOIUrl":"https://doi.org/10.1145/3293882.3330572","url":null,"abstract":"Android apps are prone to crash. This often arises from the misuse of Android framework APIs, making it harder to debug since official Android documentation does not discuss thoroughly potential exceptions.Recently, the program repair community has also started to investigate the possibility to fix crashes automatically. Current results, however, apply to limited example cases. In both scenarios of repair, the main issue is the need for more example data to drive the fix processes due to the high cost in time and effort needed to collect and identify fix examples. We propose in this work a scalable approach, CraftDroid, to mine crash fixes by leveraging a set of 28 thousand carefully reconstructed app lineages from app markets, without the need for the app source code or issue reports. We developed a replicative testing approach that locates fixes among app versions which output different runtime logs with the exact same test inputs. Overall, we have mined 104 relevant crash fixes, further abstracted 17 fine-grained fix templates that are demonstrated to be effective for patching crashed apks. Finally, we release ReCBench, a benchmark consisting of 200 crashed apks and the crash replication scripts, which the community can explore for evaluating generated crash-inducing bug patches.","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82323767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving random GUI testing with image-based widget detection","authors":"Thomas D. White, G. Fraser, Guy J. Brown","doi":"10.1145/3293882.3330551","DOIUrl":"https://doi.org/10.1145/3293882.3330551","url":null,"abstract":"Graphical User Interfaces (GUIs) are amongst the most common user interfaces, enabling interactions with applications through mouse movements and key presses. Tools for automated testing of programs through their GUI exist, however they usually rely on operating system or framework specific knowledge to interact with an application. Due to frequent operating system updates, which can remove required information, and a large variety of different GUI frameworks using unique underlying data structures, such tools rapidly become obsolete, Consequently, for an automated GUI test generation tool, supporting many frameworks and operating systems is impractical. We propose a technique for improving GUI testing by automatically identifying GUI widgets in screen shots using machine learning techniques. As training data, we generate randomized GUIs to automatically extract widget information. The resulting model provides guidance to GUI testing tools in environments not currently supported by deriving GUI widget information from screen shots only. In our experiments, we found that identifying GUI widgets in screen shots and using this information to guide random testing achieved a significantly higher branch coverage in 18 of 20 applications, with an average increase of 42.5% when compared to conventional random testing.","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"66 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85862991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Go-clone: graph-embedding based clone detector for Golang","authors":"Cong Wang, Jian Gao, Yu Jiang, Zhenchang Xing, Huafeng Zhang, Weiliang Yin, M. Gu, Jiaguang Sun","doi":"10.1145/3293882.3338996","DOIUrl":"https://doi.org/10.1145/3293882.3338996","url":null,"abstract":"Golang (short for Go programming language) is a fast and compiled language, which has been increasingly used in industry due to its excellent performance on concurrent programming. Golang redefines concurrent programming grammar, making it a challenge for traditional clone detection tools and techniques. However, there exist few tools for detecting duplicates or copy-paste related bugs in Golang. Therefore, an effective and efficient code clone detector on Golang is especially needed. In this paper, we present Go-Clone, a learning-based clone detector for Golang. Go-Clone contains two modules -- the training module and the user interaction module. In the training module, firstly we parse Golang source code into llvm IR (Intermediate Representation). Secondly, we calculate LSFG (labeled semantic flow graph) for each program function automatically. Go-Clone trains a deep neural network model to encode LSFGs for similarity classification. In the user interaction module, users can choose one or more Golang projects. Go-Clone identifies and presents a list of function pairs, which are most likely clone code for user inspection. To evaluate Go-Clone's performance, we collect 6,110 commit versions from 48 Github projects to construct a Golang clone detection data set. Go-Clone can reach the value of AUC (Area Under Curve) and ACC (Accuracy) for 89.61% and 83.80% in clone detection. By testing several groups of unfamiliar data, we also demonstrates the generility of Go-Clone. The address of the abstract demo video: https://youtu.be/o5DogtYGbeo","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88308318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new dimension of test quality: assessing and generating higher quality unit test cases","authors":"Giovanni Grano","doi":"10.1145/3293882.3338984","DOIUrl":"https://doi.org/10.1145/3293882.3338984","url":null,"abstract":"Unit tests form the first defensive line against the introduction of bugs in software systems. Therefore, their quality is of a paramount importance to produce robust and reliable software. To assess test quality, many organizations relies on metrics like code and mutation coverage. However, they are not always optimal to fulfill such a purpose. In my research, I want to make mutation testing scalable by devising a lightweight approach to estimate test effectiveness. Moreover, I plan to introduce a new metric measuring test focus—as a proxy for the effort needed by developers to understand and maintain a test— that both complements code coverage to assess test quality and can be used to drive automated test case generation of higher quality tests.","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86913511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Root causing flaky tests in a large-scale industrial setting","authors":"Wing Lam, Patrice Godefroid, Suman Nath, Anirudh Santhiar, Suresh Thummalapenta","doi":"10.1145/3293882.3330570","DOIUrl":"https://doi.org/10.1145/3293882.3330570","url":null,"abstract":"In today’s agile world, developers often rely on continuous integration pipelines to help build and validate their changes by executing tests in an efficient manner. One of the significant factors that hinder developers’ productivity is flaky tests—tests that may pass and fail with the same version of code. Since flaky test failures are not deterministically reproducible, developers often have to spend hours only to discover that the occasional failures have nothing to do with their changes. However, ignoring failures of flaky tests can be dangerous, since those failures may represent real faults in the production code. Furthermore, identifying the root cause of flakiness is tedious and cumbersome, since they are often a consequence of unexpected and non-deterministic behavior due to various factors, such as concurrency and external dependencies. As developers in a large-scale industrial setting, we first describe our experience with flaky tests by conducting a study on them. Our results show that although the number of distinct flaky tests may be low, the percentage of failing builds due to flaky tests can be substantial. To reduce the burden of flaky tests on developers, we describe our end-to-end framework that helps identify flaky tests and understand their root causes. Our framework instruments flaky tests and all relevant code to log various runtime properties, and then uses a preliminary tool, called RootFinder, to find differences in the logs of passing and failing runs. Using our framework, we collect and publicize a dataset of real-world, anonymized execution logs of flaky tests. By sharing the findings from our study, our framework and tool, and a dataset of logs, we hope to encourage more research on this important problem.","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"80 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90366187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From typestate verification to interpretable deep models (invited talk abstract)","authors":"Eran Yahav, Stephen J. Fink, Nurit Dor, G. Ramalingam, E. Geay","doi":"10.1145/3293882.3338992","DOIUrl":"https://doi.org/10.1145/3293882.3338992","url":null,"abstract":"The paper ``Effective Typestate Verification in the Presence of Aliasing'' was published in the International Symposium on Software Testing and Analysis (ISSTA) 2006 Proceedings, and has now been selected to receive the ISSTA 2019 Retrospective Impact Paper Award. The paper described a scalable framework for verification of typestate properties in real-world Java programs. The paper introduced several techniques that have been used widely in the static analysis of real-world programs. Specifically, it introduced an abstract domain combining access-paths, aliasing information, and typestate that turned out to be simple, powerful, and useful. We review the original paper and show the evolution of the ideas over the years. We show how some of these ideas have evolved into work on machine learning for code completion, and discuss recent general results in machine learning for programming.","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"10 11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88454307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal context-sensitive dynamic partial order reduction with observers","authors":"E. Albert, M. G. D. L. Banda, M. Gómez-Zamalloa, Miguel Isabel, Peter James Stuckey","doi":"10.1145/3293882.3330565","DOIUrl":"https://doi.org/10.1145/3293882.3330565","url":null,"abstract":"Dynamic Partial Order Reduction (DPOR) algorithms are used in stateless model checking to avoid the exploration of equivalent execution sequences. DPOR relies on the notion of independence between execution steps to detect equivalence. Recent progress in the area has introduced more accurate ways to detect independence: Context-Sensitive DPOR considers two steps p and t independent in the current state if the states obtained by executing p · t and t · p are the same; Optimal DPOR with Observers makes their dependency conditional to the existence of future events that observe their operations. We introduce a new algorithm, Optimal Context-Sensitive DPOR with Observers, that combines these two notions of conditional independence, and goes beyond them by exploiting their synergies. Experimental evaluation shows that our gains increase exponentially with the size of the considered inputs.","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"39 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86961037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}