{"title":"FAST: Boosting Uncertainty-based Test Prioritization Methods for Neural Networks via Feature Selection","authors":"Jialuo Chen, Jingyi Wang, Xiyue Zhang, Youcheng Sun, Marta Kwiatkowska, Jiming Chen, Peng Cheng","doi":"arxiv-2409.09130","DOIUrl":"https://doi.org/arxiv-2409.09130","url":null,"abstract":"Due to the vast testing space, the increasing demand for effective and efficient testing of deep neural networks (DNNs) has led to the development of various DNN test case prioritization techniques. However, the fact that DNNs can deliver high-confidence predictions for incorrectly predicted examples, known as the over-confidence problem, causes these methods to fail to reveal high-confidence errors. To address this limitation, in this work, we propose FAST, a method that boosts existing prioritization methods through guided FeAture SelecTion. FAST is based on the insight that certain features may introduce noise that affects the model's output confidence, thereby contributing to high-confidence errors. It quantifies the importance of each feature for the model's correct predictions, and then dynamically prunes the information from the noisy features during inference to derive a new probability vector for the uncertainty estimation. With the help of FAST, the high-confidence errors and correctly classified examples become more distinguishable, resulting in higher APFD (Average Percentage of Fault Detection) values for test prioritization, and higher generalization ability for model enhancement.
We conduct extensive experiments to evaluate FAST across a diverse set of model structures on multiple benchmark datasets to validate the effectiveness, efficiency, and scalability of FAST compared to the state-of-the-art prioritization techniques.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trimming the Risk: Towards Reliable Continuous Training for Deep Learning Inspection Systems","authors":"Altaf Allah Abbassi, Houssem Ben Braiek, Foutse Khomh, Thomas Reid","doi":"arxiv-2409.09108","DOIUrl":"https://doi.org/arxiv-2409.09108","url":null,"abstract":"The industry increasingly relies on deep learning (DL) technology for manufacturing inspections, which are challenging to automate with rule-based machine vision algorithms. DL-powered inspection systems derive defect patterns from labeled images, combining human-like agility with the consistency of a computerized system. However, finite labeled datasets often fail to encompass all natural variations, necessitating Continuous Training (CT) to regularly adjust their models with recent data. Effective CT requires fresh labeled samples from the original distribution; otherwise, self-generated labels can lead to silent performance degradation. To mitigate this risk, we develop a robust CT-based maintenance approach that updates DL models using reliable data selections through a two-stage filtering process. The initial stage filters out low-confidence predictions, as the model inherently discredits them. The second stage uses variational auto-encoders and histograms to generate image embeddings that capture latent and pixel characteristics, then rejects the inputs of substantially shifted embeddings as drifted data with erroneous overconfidence. Then, a fine-tuning of the original DL model is executed on the filtered inputs while validating on a mixture of recent production and original datasets. This strategy mitigates catastrophic forgetting and ensures the model adapts effectively to new operational conditions.
Evaluations on industrial inspection systems for popsicle stick prints and glass bottles using critical real-world datasets showed that less than 9% of erroneous self-labeled data are retained after filtering and used for fine-tuning, improving model performance on production data by up to 14% without compromising its results on original validation data.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ROCAS: Root Cause Analysis of Autonomous Driving Accidents via Cyber-Physical Co-mutation","authors":"Shiwei Feng, Yapeng Ye, Qingkai Shi, Zhiyuan Cheng, Xiangzhe Xu, Siyuan Cheng, Hongjun Choi, Xiangyu Zhang","doi":"arxiv-2409.07774","DOIUrl":"https://doi.org/arxiv-2409.07774","url":null,"abstract":"As autonomous driving systems (ADS) have transformed our daily life, the safety of ADS is of growing significance. While various testing approaches have emerged to enhance ADS reliability, a crucial gap remains in understanding the causes of accidents. Such post-accident analysis is paramount and beneficial for enhancing ADS safety and reliability. Existing cyber-physical system (CPS) root cause analysis techniques are mainly designed for drones and cannot handle the unique challenges introduced by more complex physical environments and deep learning models deployed in ADS. In this paper, we address the gap by offering a formal definition of the ADS root cause analysis problem and introducing ROCAS, a novel ADS root cause analysis framework featuring cyber-physical co-mutation. Our technique uniquely leverages both physical and cyber mutation that can precisely identify the accident-trigger entity and pinpoint the misconfiguration of the target ADS responsible for an accident. We further design a differential analysis to identify the responsible module to reduce search space for the misconfiguration. We study 12 categories of ADS accidents and demonstrate the effectiveness and efficiency of ROCAS in narrowing down search space and pinpointing the misconfiguration.
We also show detailed case studies on how the identified misconfiguration helps understand the rationale behind accidents.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"73 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Mixed-Methods Study of Open-Source Software Maintainers On Vulnerability Management and Platform Security Features","authors":"Jessy Ayala, Yu-Jye Tung, Joshua Garcia","doi":"arxiv-2409.07669","DOIUrl":"https://doi.org/arxiv-2409.07669","url":null,"abstract":"In open-source software (OSS), software vulnerabilities have significantly increased. Although researchers have investigated the perspectives of vulnerability reporters and OSS contributor security practices, understanding the perspectives of OSS maintainers on vulnerability management and platform security features is currently understudied. In this paper, we investigate the perspectives of OSS maintainers who maintain projects listed in the GitHub Advisory Database. We explore this area by conducting two studies: identifying aspects through a listing survey ($n_1=80$) and gathering insights from semi-structured interviews ($n_2=22$). Of the 37 identified aspects, we find that supply chain mistrust and lack of automation for vulnerability management are the most challenging, and barriers to adopting platform security features include a lack of awareness and the perception that they are not necessary. Surprisingly, we find that despite being previously vulnerable, some maintainers still allow public vulnerability reporting, or ignore reports altogether.
Based on our findings, we discuss implications for OSS platforms and how the research community can better support OSS vulnerability management efforts.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"96 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SoVAR: Building Generalizable Scenarios from Accident Reports for Autonomous Driving Testing","authors":"An Guo, Yuan Zhou, Haoxiang Tian, Chunrong Fang, Yunjian Sun, Weisong Sun, Xinyu Gao, Anh Tuan Luu, Yang Liu, Zhenyu Chen","doi":"arxiv-2409.08081","DOIUrl":"https://doi.org/arxiv-2409.08081","url":null,"abstract":"Autonomous driving systems (ADSs) have undergone remarkable development and are increasingly employed in safety-critical applications. However, recently reported data on fatal accidents involving ADSs suggests that the desired level of safety has not yet been fully achieved. Consequently, there is a growing need for more comprehensive and targeted testing approaches to ensure safe driving. Scenarios from real-world accident reports provide valuable resources for ADS testing, including critical scenarios and high-quality seeds. However, existing scenario reconstruction methods from accident reports often exhibit limited accuracy in information extraction. Moreover, due to the diversity and complexity of road environments, matching current accident information with the simulation map data for reconstruction poses significant challenges. In this paper, we design and implement SoVAR, a tool for automatically generating road-generalizable scenarios from accident reports. SoVAR utilizes well-designed prompts with linguistic patterns to guide the large language model in extracting accident information from textual data. Subsequently, it formulates and solves accident-related constraints in conjunction with the extracted accident information to generate accident trajectories. Finally, SoVAR reconstructs accident scenarios on various map structures and converts them into test scenarios to evaluate its capability to detect defects in industrial ADSs.
We experiment with SoVAR, using accident reports from the National Highway Traffic Safety Administration's database to generate test scenarios for the industrial-grade ADS Apollo. The experimental findings demonstrate that SoVAR can effectively generate generalized accident scenarios across different road structures. Furthermore, the results confirm that SoVAR identified 5 distinct safety violation types that contributed to the crash of Baidu Apollo.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"40 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards regulatory compliant lifecycle for AI-based medical devices in EU: Industry perspectives","authors":"Tuomas Granlund, Vlad Stirbu, Tommi Mikkonen","doi":"arxiv-2409.08006","DOIUrl":"https://doi.org/arxiv-2409.08006","url":null,"abstract":"Despite the immense potential of AI-powered medical devices to revolutionize healthcare, concerns regarding their safety in life-critical applications remain. While the European regulatory framework provides a comprehensive approach to medical device software development, it falls short in addressing AI-specific considerations. This article proposes a model to bridge this gap by extending the general idea of AI lifecycle with regulatory activities relevant to AI-enabled medical systems.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"59 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Handling expression evaluation under interference","authors":"Ian J. Hayes, Cliff B. Jones, Larissa A. Meinicke","doi":"arxiv-2409.07741","DOIUrl":"https://doi.org/arxiv-2409.07741","url":null,"abstract":"Hoare-style inference rules for program constructs permit the copying of expressions and tests from program text into logical contexts. It is known that this requires care even for sequential programs but further issues arise for concurrent programs because of potential interference to the values of variables. The \"rely-guarantee\" approach does tackle the issue of recording acceptable interference and offers a way to provide safe inference rules. This paper shows how the algebraic presentation of rely-guarantee ideas can clarify and formalise the conditions for safely re-using expressions and tests from program text in logical contexts for reasoning about programs.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Deep Dive Into How Open-Source Project Maintainers Review and Resolve Bug Bounty Reports","authors":"Jessy Ayala, Steven Ngo, Joshua Garcia","doi":"arxiv-2409.07670","DOIUrl":"https://doi.org/arxiv-2409.07670","url":null,"abstract":"Researchers have investigated the bug bounty ecosystem from the lens of platforms, programs, and bug hunters. Understanding the perspectives of bug bounty report reviewers, especially those who historically lack a security background and little to no funding for bug hunters, is currently understudied. In this paper, we primarily investigate the perspective of open-source software (OSS) maintainers who have used huntr, a bug bounty platform that pays bounties to bug hunters who find security bugs in GitHub projects and have had valid vulnerabilities patched as a result. We address this area by conducting three studies: identifying characteristics through a listing survey ($n_1=51$), their ranked importance with Likert-scale survey data ($n_2=90$), and conducting semi-structured interviews to dive deeper into real-world experiences ($n_3=17$). As a result, we categorize 40 identified characteristics into benefits, challenges, helpful features, and wanted features. We find that private disclosure and project visibility are the most important benefits, while hunters focused on money or CVEs and pressure to review are the most challenging to overcome. Surprisingly, lack of communication with bug hunters is the least challenging, and CVE creation support is the second-least helpful feature for OSS maintainers when reviewing bug bounty reports.
We present recommendations to make the bug bounty review process more accommodating to open-source maintainers and identify areas for future work.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"62 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mobile App Security Trends and Topics: An Examination of Questions From Stack Overflow","authors":"Timothy Huo, Ana Catarina Araújo, Jake Imanaka, Anthony Peruma, Rick Kazman","doi":"arxiv-2409.07926","DOIUrl":"https://doi.org/arxiv-2409.07926","url":null,"abstract":"The widespread use of smartphones and tablets has made society heavily reliant on mobile applications (apps) for accessing various resources and services. These apps often handle sensitive personal, financial, and health data, making app security a critical concern for developers. While there is extensive research on software security topics like malware and vulnerabilities, less is known about the practical security challenges mobile app developers face and the guidance they seek. In this study, we mine Stack Overflow for questions on mobile app security, which we analyze using quantitative and qualitative techniques. The findings reveal that Stack Overflow is a major resource for developers seeking help with mobile app security, especially for Android apps, and identify seven main categories of security questions: Secured Communications, Database, App Distribution Service, Encryption, Permissions, File-Specific, and General Security.
Insights from this research can inform the development of tools, techniques, and resources by the research and vendor community to better support developers in securing their mobile apps.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"40 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building a Cybersecurity Risk Metamodel for Improved Method and Tool Integration","authors":"Christophe Ponsard","doi":"arxiv-2409.07906","DOIUrl":"https://doi.org/arxiv-2409.07906","url":null,"abstract":"Nowadays, companies are highly exposed to cyber security threats. In many industrial domains, protective measures are being deployed and actively supported by standards. However, the global process remains largely dependent on document-driven approaches or partial modelling, which impacts both the efficiency and effectiveness of the cybersecurity process from the risk analysis step. In this paper, we report on our experience in applying a model-driven approach on the initial risk analysis step in connection with a later security testing. Our work relies on a common metamodel which is used to map, synchronise and ensure information traceability across different tools. We validate our approach using different scenarios relying on domain modelling, system modelling, risk assessment and security testing tools.","PeriodicalId":501278,"journal":{"name":"arXiv - CS - Software Engineering","volume":"63 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}