{"title":"Identification and Mitigation of Toxic Communications Among Open Source Software Developers","authors":"Jaydeb Sarker","doi":"10.1145/3551349.3559570","DOIUrl":"https://doi.org/10.1145/3551349.3559570","url":null,"abstract":"Toxic and unhealthy conversations during the developer’s communication may reduce the professional harmony and productivity of Free and Open Source Software (FOSS) projects. For example, toxic code review comments may raise pushback from an author to complete suggested changes. A toxic communication with another person may hamper future communication and collaboration. Research also suggests that toxicity disproportionately impacts newcomers, women, and other participants from marginalized groups. Therefore, toxicity is a barrier to promote diversity, equity, and inclusion. Since the occurrence of toxic communications is not uncommon among FOSS communities and such communications may have serious repercussions, the primary objective of my proposed dissertation is to automatically identify and mitigate toxicity during developers’ textual interactions. On this goal, I aim to: i) build an automated toxicity detector for Software Engineering (SE) domain, ii) identify the notion of toxicity across demographics, and iii) analyze the impacts of toxicity on the outcomes of Open Source Software (OSS) projects.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129667388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andrea Stocco, Paulo J. Nunes, Marcelo d’Amorim, P. Tonella
{"title":"ThirdEye: Attention Maps for Safe Autonomous Driving Systems","authors":"Andrea Stocco, Paulo J. Nunes, Marcelo d’Amorim, P. Tonella","doi":"10.1145/3551349.3556968","DOIUrl":"https://doi.org/10.1145/3551349.3556968","url":null,"abstract":"Automated online recognition of unexpected conditions is an indispensable component of autonomous vehicles to ensure safety even in unknown and uncertain situations. In this paper we propose a runtime monitoring technique rooted in the attention maps computed by explainable artificial intelligence techniques. Our approach, implemented in a tool called ThirdEye, turns attention maps into confidence scores that are used to discriminate safe from unsafe driving behaviours. The intuition is that uncommon attention maps are associated with unexpected runtime conditions. In our empirical study, we evaluated the effectiveness of different configurations of ThirdEye at predicting simulation-based injected failures induced by both unknown conditions (adverse weather and lighting) and unsafe/uncertain conditions created with mutation testing. Results show that, overall, ThirdEye can predict 98% misbehaviours, up to three seconds in advance, outperforming a state-of-the-art failure predictor for autonomous vehicles.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129670895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Coverage-guided Greybox Fuzzing based on Power Schedule Optimization with Time Complexity","authors":"Jinfu Chen, Shengran Wang, Saihua Cai, Chi Zhang, Haibo Chen, Jingyi Chen, Jianming Zhang","doi":"10.1145/3551349.3559550","DOIUrl":"https://doi.org/10.1145/3551349.3559550","url":null,"abstract":"Coverage-guided Greybox fuzzing is regarded as a practical approach to detect software vulnerabilities, which targets to expand code coverage as much as possible. A common implementation is to assign more energy to such seeds which find new edges with less execution time. However, solely considering new edges may be less effective because some hard-to-find branches often exist in the complex code of program. Code complexity is one of the key indicators to measure the code security. Compared to the code with simple structure, the program with higher code complexity is more likely to find more branches and cause security problems. In this paper, we propose a novel fuzzing method which further uses code complexity to optimize power schedule process in AFL (American Fuzzy Lop) and AFLFAST (American Fuzzy Lop Fast). The goal of our method is to generate inputs which are more biased toward the code with higher complexity of the program under test. In addition, we conduct a preliminary empirical study under three widely used real-world programs, and the experimental results show that the proposed approach can trigger more crashes as well as improve the coverage discovery.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129166687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vikas K. Malviya, Chee Wei Leow, Ashok Kasthuri, Yan Naing Tun, Lwin Khin Shar, Lingxiao Jiang
{"title":"Right to Know, Right to Refuse: Towards UI Perception-Based Automated Fine-Grained Permission Controls for Android Apps","authors":"Vikas K. Malviya, Chee Wei Leow, Ashok Kasthuri, Yan Naing Tun, Lwin Khin Shar, Lingxiao Jiang","doi":"10.1145/3551349.3559556","DOIUrl":"https://doi.org/10.1145/3551349.3559556","url":null,"abstract":"It is the basic right of a user to know how the permissions are used within the Android app’s scope and to refuse the app if granted permissions are used for the activities other than specified use which can amount to malicious behavior. This paper proposes an approach and a vision to automatically model the permissions necessary for Android apps from users’ perspective and enable fine-grained permission controls by users, thus facilitating users in making more well-informed and flexible permission decisions for different app functionalities, which in turn improve the security and data privacy of the App and enforce apps to reduce permission misuses. Our proposed approach works in mainly two stages. First, it looks for discrepancies between the permission uses perceivable by users and the permissions actually used by apps via program analysis techniques. Second, it runs prediction algorithms using machine learning techniques to catch the discrepancies in permission usage and thereby alert the user for action about data violation. We have evaluated preliminary implementations of our approach and achieved promising fine-grained permission control accuracy. In addition to the benefits of users’ privacy protection, we envision that wider adoption of the approach may also enforce better privacy-aware design by responsible bodies such as app developers, governments, and enterprises.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125528030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Robust Models of Code via Energy-Based Learning on Auxiliary Datasets","authors":"Nghi D. Q. Bui, Yijun Yu","doi":"10.1145/3551349.3561171","DOIUrl":"https://doi.org/10.1145/3551349.3561171","url":null,"abstract":"Existing approaches to improving the robustness of source code models concentrate on recognizing adversarial samples rather than valid samples that fall outside of a given distribution, which we refer to as out-of-distribution (OOD) samples. To this end, we propose to use an auxiliary dataset (out-of-distribution) such that, when trained together with the main dataset, they will enhance the model’s robustness. We adapt energy-bounded learning objective function to assign a higher score to in-distribution samples and a lower score to out-of-distribution samples in order to incorporate such out-of-distribution samples into the training process of source code models. In terms of OOD detection and adversarial samples detection, our evaluation results demonstrate a greater robustness for existing source code models to become more accurate at recognizing OOD data while being more resistant to adversarial attacks at the same time.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120935999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gang Wang, Zan Wang, Junjie Chen, Xiang Chen, Ming Yan
{"title":"An Empirical Study on Numerical Bugs in Deep Learning Programs","authors":"Gang Wang, Zan Wang, Junjie Chen, Xiang Chen, Ming Yan","doi":"10.1145/3551349.3559561","DOIUrl":"https://doi.org/10.1145/3551349.3559561","url":null,"abstract":"The task of a deep learning (DL) program is to train a model with high precision and apply it to different scenarios. A DL program often involves massive numerical calculations. Therefore, the robustness and stability of the numerical calculations are dominant in the quality of DL programs. Indeed, numerical bugs are common in DL programs, producing NaN (Not-a-Number) and INF (Infinite). A numerical bug may render the DL models inaccurate, causing the DL applications unusable. In this work, we conduct the first empirical study on numerical bugs in DL programs by analyzing the programs implemented on the top of two popular DL libraries (i.e., TensorFlow and PyTorch). Specifically, We collect a dataset of 400 numerical bugs in DL programs. Then, we classify these numerical bugs into nine categories based on their root causes and summarize two findings. Finally, we provide the implications of our study on detecting numerical bugs in DL programs.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126890126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design-Space Exploration for Decision-Support Software","authors":"A. Penders, A. Varbanescu, G. Pavlin, H. Sips","doi":"10.1145/3551349.3559502","DOIUrl":"https://doi.org/10.1145/3551349.3559502","url":null,"abstract":"Presence monitoring or intrusion detection in a location/area are examples of decision-support applications. Decision-support applications are applications where monitoring is used to collect (heterogeneous) data and create situational awareness, which further requires decisions and/or actions. As such, decision-support software consists of different interconnected components with very diverse roles, whose communication and synchronization are essential for the application functionality and performance. Despite this complexity, software design for decision-support is often driven by short-term functional requirements and only supported by designers’ previous experience. In the current non-systematic approach, mistakes can be easily made, and can be very difficult to repair. In this work, we describe our systematic method for efficient and effective decision-support software design, based on application design-space exploration (DSE). To this end, we describe how to build a design space and present structured methods to traverse the design space towards software solutions that meet user requirements for both functionality and performance.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"91 12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126028652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yutao Hu, Deqing Zou, Junru Peng, Yueming Wu, Junjie Shan, Hai Jin
{"title":"TreeCen: Building Tree Graph for Scalable Semantic Code Clone Detection","authors":"Yutao Hu, Deqing Zou, Junru Peng, Yueming Wu, Junjie Shan, Hai Jin","doi":"10.1145/3551349.3556927","DOIUrl":"https://doi.org/10.1145/3551349.3556927","url":null,"abstract":"Code clone detection is an important research problem that has attracted wide attention in software engineering. Many methods have been proposed for detecting code clone, among which text-based and token-based approaches are scalable but lack consideration of code semantics, thus resulting in the inability to detect semantic code clones. Methods based on intermediate representations of codes can solve the problem of semantic code clone detection. However, graph-based methods are not practicable due to code compilation, and existing tree-based approaches are limited by the scale of trees for scalable code clone detection. In this paper, we propose TreeCen, a scalable tree-based code clone detector, which satisfies scalability while detecting semantic clones effectively. Given the source code of a method, we first extract its abstract syntax tree (AST) based on static analysis and transform it into a simple graph representation (i.e., tree graph) according to the node type, rather than using traditional heavyweight tree matching. We then treat the tree graph as a social network and adopt centrality analysis on each node to maintain the tree details. By this, the original complex tree can be converted into a 72-dimensional vector while containing comprehensive structural information of the AST. Finally, these vectors are fed into a machine learning model to train a detector and use it to find code clones. We conduct comparative evaluations on effectiveness and scalability. The experimental results show that TreeCen maintains the best performance of the other six state-of-the-art methods (i.e., SourcererCC, RtvNN, DeepSim, SCDetector, Deckard, and ASTNN) with F1 scores of 0.99 and 0.95 on BigCloneBench and Google Code Jam datasets, respectively. In terms of scalability, TreeCen is about 79 times faster than the other state-of-the-art tree-based semantic code clone detector (ASTNN), about 13 times faster than the fastest graph-based approach (SCDetector), and even about 22 times faster than the one-time trained token-based detector (RtvNN).","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127303005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yedi Zhang, Zhe Zhao, Fu Song, M. Zhang, Tao Chen, Jun Sun
{"title":"QVIP: An ILP-based Formal Verification Approach for Quantized Neural Networks","authors":"Yedi Zhang, Zhe Zhao, Fu Song, M. Zhang, Tao Chen, Jun Sun","doi":"10.1145/3551349.3556916","DOIUrl":"https://doi.org/10.1145/3551349.3556916","url":null,"abstract":"Deep learning has become a promising programming paradigm in software development, owing to its surprising performance in solving many challenging tasks. Deep neural networks (DNNs) are increasingly being deployed in practice, but are limited on resource-constrained devices owing to their demand for computational power. Quantization has emerged as a promising technique to reduce the size of DNNs with comparable accuracy as their floating-point numbered counterparts. The resulting quantized neural networks (QNNs) can be implemented energy-efficiently. Similar to their floating-point numbered counterparts, quality assurance techniques for QNNs, such as testing and formal verification, are essential but are currently less explored. In this work, we propose a novel and efficient formal verification approach for QNNs. In particular, we are the first to propose an encoding that reduces the verification problem of QNNs into the solving of integer linear constraints, which can be solved using off-the-shelf solvers. Our encoding is both sound and complete. We demonstrate the application of our approach on local robustness verification and maximum robustness radius computation. We implement our approach in a prototype tool QVIP and conduct a thorough evaluation. Experimental results on QNNs with different quantization bits confirm the effectiveness and efficiency of our approach, e.g., two orders of magnitude faster and able to solve more verification tasks in the same time limit than the state-of-the-art methods.","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133566292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhuo Zhang, Yan Lei, Meng Yan, Yue Yu, Jiachi Chen, Shangwen Wang, Xiaoguang Mao
{"title":"Reentrancy Vulnerability Detection and Localization: A Deep Learning Based Two-phase Approach","authors":"Zhuo Zhang, Yan Lei, Meng Yan, Yue Yu, Jiachi Chen, Shangwen Wang, Xiaoguang Mao","doi":"10.1145/3551349.3560428","DOIUrl":"https://doi.org/10.1145/3551349.3560428","url":null,"abstract":"Smart contracts have been widely and rapidly used to automate financial and business transactions together with blockchains, helping people make agreements while minimizing trusts. With millions of smart contracts deployed on blockchain, various bugs and vulnerabilities in smart contracts have emerged. Following the rapid development of deep learning, many recent studies have used deep learning for vulnerability detection to conduct security checks before deploying smart contracts. These approaches show effective results on detecting whether a smart contract is vulnerable or not whereas their results on locating suspicious statements responsible for the detected vulnerability are still unsatisfactory. To address this problem, we propose a deep learning based two-phase smart contract debugger for reentrancy vulnerability, one of the most severe vulnerabilities, named as ReVulDL: Reentrancy Vulnerability Detection and Localization. ReVulDL integrates the vulnerability detection and localization into a unified debugging pipeline. For the detection phase, given a smart contract, ReVulDL uses a graph-based pre-training model to learn the complex relationships in propagation chains for detecting whether the smart contract contains a reentrancy vulnerability. For the localization phase, if a reentrancy vulnerability is detected, ReVulDL utilizes interpretable machine learning to locate the suspicious statements in smart contract to provide interpretations of the detected vulnerability. Our large-scale empirical study on 47,398 smart contracts shows that ReVulDL achieves promising results in detecting reentrancy vulnerabilities (e.g., outperforming 16 state-of-the-art vulnerability detection approaches) and locating vulnerable statements (e.g., 70.38% of the vulnerable statements are ranked within Top-10).","PeriodicalId":197939,"journal":{"name":"Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133641172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}