{"title":"PrIntFuzz: fuzzing Linux drivers via automated virtual device simulation","authors":"Zheyu Ma, Bodong Zhao, Letu Ren, Zheming Li, Siqi Ma, Xiapu Luo, Chao Zhang","doi":"10.1145/3533767.3534226","DOIUrl":"https://doi.org/10.1145/3533767.3534226","url":null,"abstract":"Linux drivers share the same address space and privilege with the core of the kernel but have a much larger code base and attack surface. The Linux drivers are not well tested and have weaker security guarantees than the kernel. Missing support from hardware devices, existing fuzzing solutions fail to cover a large portion of the driver code, e.g., the initialization code and interrupt handlers. In this paper, we present PrIntFuzz, an efficient and universal fuzzing framework that can test the overlooked driver code, including the PRobing code and INTerrupt handlers. PrIntFuzz first extracts knowledge from the driver through inter-procedural field-sensitive, path-sensitive, and flow-sensitive static analysis. Then it utilizes the information to build a flexible and efficient simulator, which supports device probing, hardware interrupts emulation and device I/O interception. Lastly, PrIntFuzz applies a multi-dimension fuzzing strategy to explore the overlooked code. We have developed a prototype of PrIntFuzz and successfully simulated 311 virtual PCI (Peripheral Component Interconnect) devices, 472 virtual I2C (Inter-Integrated Circuit) devices, 169 virtual USB (Universal Serial Bus) devices, and found 150 bugs in the corresponding device drivers. We have submitted patches for these bugs to the Linux kernel community, and 59 patches have been merged so far. In a control experiment of Linux 5.10-rc6, PrIntFuzz found 99 bugs, while the state-of-the-art fuzzer only found 50. PrIntFuzz covers 11,968 basic blocks on the latest Linux kernel, while the state-of-the-art fuzzer Syzkaller only covers 2,353 basic blocks.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134310267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Geunwoo Kim, Sanghyun Hong, M. Franz, Dokyung Song
{"title":"Improving cross-platform binary analysis using representation learning via graph alignment","authors":"Geunwoo Kim, Sanghyun Hong, M. Franz, Dokyung Song","doi":"10.1145/3533767.3534383","DOIUrl":"https://doi.org/10.1145/3533767.3534383","url":null,"abstract":"Cross-platform binary analysis requires a common representation of binaries across platforms, on which a specific analysis can be performed. Recent work proposed to learn low-dimensional, numeric vector representations (i.e., embeddings) of disassembled binary code, and perform binary analysis in the embedding space. Unfortunately, however, existing techniques fall short in that they are either (i) specific to a single platform producing embeddings not aligned across platforms, or (ii) not designed to capture the rich contextual information available in a disassembled binary. We present a novel deep learning-based method, XBA, which addresses the aforementioned problems. To this end, we first abstract binaries as typed graphs, dubbed binary disassembly graphs (BDGs), which encode control-flow and other rich contextual information of different entities found in a disassembled binary, including basic blocks, external functions called, and string literals referenced. We then formulate binary code representation learning as a graph alignment problem, i.e., finding the node correspondences between BDGs extracted from two binaries compiled for different platforms. XBA uses graph convolutional networks to learn the semantics of each node, (i) using its rich contextual information encoded in the BDG, and (ii) aligning its embeddings across platforms. Our formulation allows XBA to learn semantic alignments between two BDGs in a semi-supervised manner, requiring only a limited number of node pairs be aligned across platforms for training. Our evaluation shows that XBA can learn semantically-rich embeddings of binaries aligned across platforms without apriori platform-specific knowledge. By training our model only with 50% of the oracle alignments, XBA was able to predict, on average, 75% of the rest. Our case studies further show that the learned embeddings encode knowledge useful for cross-platform binary analysis.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"349 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116356014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PermDroid: automatically testing permission-related behaviour of Android applications","authors":"Shuaihao Yang, Zigang Zeng, Wei Song","doi":"10.1145/3533767.3534221","DOIUrl":"https://doi.org/10.1145/3533767.3534221","url":null,"abstract":"The Android runtime permission model allows users to grant and revoke permissions at runtime. To verify the robustness of apps, developers have to test the apps repeatedly under a wide range of permission combinations, which is time-consuming and unsuited for regression testing. Existing app testing techniques are of limited help in this context, as they seldom consider different permission combinations explicitly. To address this issue, we present PermDroid to automatically test the permission-related behaviour of apps with permissions granted/revoked dynamically. PermDroid first statically constructs a state transition graph (STG) for the app; it then utilizes the STG for the permission-directed exploration to test permission-related behaviour only under the combinations of the relevant permissions. The experimental results on 50 real-world Android apps demonstrate the effectiveness and efficiency of PermDroid: the average permission-related API invocation coverage achieves 72.38% in 10 minutes, and seven permission-related bugs are uncovered, six of which are not detected by the competitors.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121792714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frank Busse, Pritam M. Gharat, Cristian Cadar, A. Donaldson
{"title":"Combining static analysis error traces with dynamic symbolic execution (experience paper)","authors":"Frank Busse, Pritam M. Gharat, Cristian Cadar, A. Donaldson","doi":"10.1145/3533767.3534384","DOIUrl":"https://doi.org/10.1145/3533767.3534384","url":null,"abstract":"This paper reports on our experience implementing a technique for sifting through static analysis reports using dynamic symbolic execution. Our insight is that if a static analysis tool produces a partial trace through the program under analysis, annotated with conditions that the analyser believes are important for the bug to trigger, then a dynamic symbolic execution tool may be able to exploit the trace by (a) guiding the search heuristically so that paths that follow the trace most closely are prioritised for exploration, and (b) pruning the search using the conditions associated with each step of the trace. This may allow the bug to be quickly confirmed using dynamic symbolic execution, if it turns out to be a true positive, yielding an input that triggers the bug. To experiment with this approach, we have implemented the idea in a tool chain that allows the popular open-source static analysis tools Clang Static Analyzer (CSA) and Infer to be combined with the popular open-source dynamic symbolic execution engine KLEE. Our findings highlight two interesting negative results. First, while fault injection experiments show the promise of our technique, they also reveal that the traces provided by static analysis tools are not that useful in guiding search. Second, we have systematically applied CSA and Infer to a large corpus of real-world applications that are suitable for analysis with KLEE, and find that the static analysers are rarely able to find non-trivial true positive bugs for this set of applications. We believe our case study can inform static analysis and dynamic symbolic execution tool developers as to where improvements may be necessary, and serve as a call to arms for researchers interested in combining symbolic execution and static analysis to identify more suitable benchmark suites for evaluation of research ideas.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130378250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jialu Zhang, Todd Mytkowicz, Mike Kaufman, R. Piskac, Shuvendu K. Lahiri
{"title":"Using pre-trained language models to resolve textual and semantic merge conflicts (experience paper)","authors":"Jialu Zhang, Todd Mytkowicz, Mike Kaufman, R. Piskac, Shuvendu K. Lahiri","doi":"10.1145/3533767.3534396","DOIUrl":"https://doi.org/10.1145/3533767.3534396","url":null,"abstract":"Program merging is standard practice when developers integrate their individual changes to a common code base. When the merge algorithm fails, this is called a merge conflict. The conflict either manifests as a textual merge conflict where the merge fails to produce code, or as a semantic merge conflict where the merged code results in compiler errors or broken tests. Resolving these conflicts for large code projects is expensive because it requires developers to manually identify the sources of conflicts and correct them. In this paper, we explore the feasibility of automatically repairing merge conflicts (both textual and semantic) using k-shot learning with pre-trained large neural language models (LM) such as GPT-3. One of the challenges in leveraging such language models is fitting the examples and the queries within a small prompt (2048 tokens). We evaluate LMs and k-shot learning for both textual and semantic merge conflicts for Microsoft Edge. Our results are mixed: on one-hand, LMs provide the state-of-the-art (SOTA) performance on semantic merge conflict resolution for Edge compared to earlier symbolic approaches; on the other hand, LMs do not yet obviate the benefits of special purpose domain-specific languages (DSL) for restricted patterns for program synthesis.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129969123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An empirical study on the effectiveness of static C code analyzers for vulnerability detection","authors":"Stephan Lipp, Sebastian Banescu, A. Pretschner","doi":"10.1145/3533767.3534380","DOIUrl":"https://doi.org/10.1145/3533767.3534380","url":null,"abstract":"Static code analysis is often used to scan source code for security vulnerabilities. Given the wide range of existing solutions implementing different analysis techniques, it is very challenging to perform an objective comparison between static analysis tools to determine which ones are most effective at detecting vulnerabilities. Existing studies are thereby limited in that (1) they use synthetic datasets, whose vulnerabilities do not reflect the complexity of security bugs that can be found in practice and/or (2) they do not provide differentiated analyses w.r.t. the types of vulnerabilities output by the static analyzers. Hence, their conclusions about an analyzer's capability to detect vulnerabilities may not generalize to real-world programs. In this paper, we propose a methodology for automatically evaluating the effectiveness of static code analyzers based on CVE reports. We evaluate five free and open-source and one commercial static C code analyzer(s) against 27 software projects containing a total of 1.15 million lines of code and 192 vulnerabilities (ground truth). While static C analyzers have been shown to perform well in benchmarks with synthetic bugs, our results indicate that state-of-the-art tools miss in-between 47% and 80% of the vulnerabilities in a benchmark set of real-world programs. Moreover, our study finds that this false negative rate can be reduced to 30% to 69% when combining the results of static analyzers, at the cost of 15 percentage points more functions flagged. Many vulnerabilities hence remain undetected, especially those beyond the classical memory-related security bugs.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122494996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chao Li, R. Chen, Boxiang Wang, Tingting Yu, Dongdong Gao, Mengfei Yang
{"title":"Precise and efficient atomicity violation detection for interrupt-driven programs via staged path pruning","authors":"Chao Li, R. Chen, Boxiang Wang, Tingting Yu, Dongdong Gao, Mengfei Yang","doi":"10.1145/3533767.3534412","DOIUrl":"https://doi.org/10.1145/3533767.3534412","url":null,"abstract":"Interrupt-driven programs are widely used in aerospace and other safety-critical areas. However, uncertain interleaving execution of interrupts may cause concurrency bugs, which could result in serious safety problems. Most of the previous researches tackling the detection of interrupt concurrency bugs focus on data races, that are usually benign as shown in empirical studies. Some studies focus on pattern-based atomicity violations that are most likely harmful. However, they cannot achieve simultaneous high precision and scalability. This paper presents intAtom, a precise and efficient static detection technique for interrupt atomicity violations, described by access interleaving pattern. The key point is that it eliminates false violations by staged path pruning with constraint solving. It first identifies all the violation candidates using data flow analysis and access interleaving pattern matching. intAtom then analyzes the path feasibility between two consecutive accesses in preempted task/interrupt, in order to recognize the atomicity intention of developers, with the help of which it filters out some candidates. Finally, it performs a modular path pruning by constructing symbolic summary and representative preemption points selection to eliminate the infeasible path in concurrent context efficiently. All the path feasibility checking processes are based on sparse value-flow analysis, which makes intAtom scalable. intAtom is evaluated on a benchmark and 6 real-world aerospace embedded programs. The experimental results show that intAtom reduces the false positive by 72% and improves the detection speed by 3 times, compared to the state-of-the-art methods. Furthermore, it can finish analyzing the real-world aerospace embedded software very fast with an average FP rate of 19.6%, while finding 19 bugs that were confirmed by developers.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124010276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QMutPy: a mutation testing tool for Quantum algorithms and applications in Qiskit","authors":"D. Fortunato, José Campos, Rui Abreu","doi":"10.1145/3533767.3543296","DOIUrl":"https://doi.org/10.1145/3533767.3543296","url":null,"abstract":"There is an inherent lack of knowledge and technology to test a quantum program properly. In this paper, building on the definition of syntactically equivalent quantum gates, we describe our efforts in developing a tool, coined QMutPy, leveraging the well-known open-source mutation tool MutPy. We further discuss the design and implementation of QMutPy, and the usage of a novel set of mutation operators that generate mutants for qubit measurements and gates. To evaluate QMutPy’s performance, we conducted a preliminary study on 11 real quantum programs written in the IBM’s Qiskit library. QMutPy has proven to be an effective quantum mutation tool, providing insight into the current state of quantum tests. QMutPy is publicly available at https://github.com/danielfobooss/mutpy. Tool demo: https://youtu.be/fC4tOY5trqc.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124141724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A large-scale study of usability criteria addressed by static analysis tools","authors":"Marcus Nachtigall, Michael Schlichtig, E. Bodden","doi":"10.1145/3533767.3534374","DOIUrl":"https://doi.org/10.1145/3533767.3534374","url":null,"abstract":"Static analysis tools support developers in detecting potential coding issues, such as bugs or vulnerabilities. Research on static analysis emphasizes its technical challenges but also mentions severe usability shortcomings. These shortcomings hinder the adoption of static analysis tools, and in some cases, user dissatisfaction even leads to tool abandonment. To comprehensively assess the current state of the art, this paper presents the first systematic usability evaluation in a wide range of static analysis tools. We derived a set of 36 relevant criteria from the scientific literature and gathered a collection of 46 static analysis tools complying with our inclusion and exclusion criteria - a representative set of mainly non-proprietary tools. Then, we evaluated how well these tools fulfill the aforementioned criteria. The evaluation shows that more than half of the considered tools offer poor warning messages, while about three-quarters of the tools provide hardly any fix support. Furthermore, the integration of user knowledge is strongly neglected, which could be used for improved handling of false positives and tuning the results for the corresponding developer. Finally, issues regarding workflow integration and specialized user interfaces are proved further. These findings should prove useful in guiding and focusing further research and development in the area of user experience for static code analyses.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115293214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic generation of smoke test suites for kubernetes","authors":"C. Cannavacciuolo, L. Mariani","doi":"10.1145/3533767.3543298","DOIUrl":"https://doi.org/10.1145/3533767.3543298","url":null,"abstract":"Setting up a reliable and automated testing process can be challenging in a cloud environment, due to the many ways automatic and repeated system deployment may unexpectedly fail. Imperfect deployments may cause spurious test failures, resulting in a waste of test resources and effort. To address this issue, developers can implement smoke test suites, which are shallow test suites that are executed before any other test suite to verify that the system under test is fully operational, and can be thus reliably tested. This paper presents KubeSmokeTester, a tool that alleviates the effort necessary to implement smoke test suites by providing automated generation capabilities for Kubernetes applications. The tool has been experienced with 60 versions of two industrial systems demonstrating its suitability in anticipating spurious test failures.","PeriodicalId":412271,"journal":{"name":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126874813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}