Information and Software Technology: Latest Articles

SCOPE: Hybrid optimization strategy for higher-order mutation-based fault localization
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-08-20, DOI: 10.1016/j.infsof.2025.107873
Hengyuan Liu, Zheng Li, Xiaolan Kang, Shumei Wu, Doyle Paul, Xiang Chen, Yong Liu
Volume 188, Article 107873
Abstract
Context: Mutation-Based Fault Localization (MBFL) using Higher-Order Mutants (HOMs) has achieved promising performance on multiple-fault programs by simulating more realistic faults. Despite its effectiveness, it can be extremely costly because numerous HOMs must be executed. However, existing cost-optimization strategies mainly focus on first-order mutants (FOMs) and do not consider the dependency relationships between HOMs and multiple program entities.
Objective: In this article, we propose a novel strategy called Smart Cost-Optimization through dynamic Prediction and sampling Execution (SCOPE). It aims to reduce costs while providing rich mutation analysis information.
Methods: SCOPE contains two key components: a Smart HOM Sampler and a Mutant-Testing Predictor. The former pre-selects the most promising HOMs for each program entity to execute, based on their association with suspicious program entities. The latter employs machine learning to infer the impact of the remaining HOMs on tests from the test execution data of the selected HOMs, without actually executing them.
Results: (1) SCOPE outperforms state-of-the-art optimization strategies, including SELECTIVE, SAMPLING, and PMT, regardless of the sampling rate or MBFL formula adopted. (2) SCOPE can reduce the number of involved HOMs by up to 90% without any loss in MBFL performance. (3) SCOPE outperforms baseline methods including SBFL, three optimized MBFL techniques (WSOME, SGS, HMBFL), and two deep learning-based fault localization techniques (CNNFL and RNNFL). (4) An ablation experiment confirms that the Smart HOM Sampler and the Mutant-Testing Predictor both contribute positively to the effectiveness of SCOPE, with average improvements of 23.60% and 15.14% in TOP-1 and A-EXAM. Additionally, a comparison of machine learning models for the Mutant-Testing Predictor shows that Random Forest predicts better than Logistic Regression and Naive Bayes.
Conclusions: Evaluation on 135 real-world multiple-fault programs from the widely used benchmark Defects4J has shown the effectiveness of our proposed hybrid optimization strategy SCOPE for higher-order mutation-based fault localization.
Citations: 0
Community Tapestry: An actionable tool to track turnover and diversity in OSS
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-08-19, DOI: 10.1016/j.infsof.2025.107871
Mariam Guizani, Zixuan Feng, Emily Judith Arteaga Garcia, Katie Kimura, Diane Mueller, Luis Cañas Díaz, Alexander Serebrenik, Anita Sarma
Volume 188, Article 107871
Abstract
Context: A healthy open-source software (OSS) community is one that has a diverse contributor base and is sustainable by retaining its contributors. Project leaders, therefore, must understand their community’s turnover and diversity makeup.
Objectives: This study aims to investigate how to support project leaders in monitoring OSS community health. Specifically, we examine the role of an interactive dashboard in enhancing awareness of contributor turnover and diversity.
Methods: We designed and developed Community Tapestry, a dynamic, daily-updated dashboard, using Participatory Design (PD) sessions with stakeholders from the Apache Software Foundation (ASF), Community Health Analytics in Open Source Software (CHAOSS), and Bitergia Analytics. We initially evaluated Community Tapestry by engaging contributors from our PD partners’ OSS projects. To further validate our findings, we conducted a confirmatory study with a prominent OSS project under the Cloud Native Computing Foundation (CNCF). Contributors from both projects explored a personalized version of the dashboard that uses their own up-to-date project data.
Results: Our results demonstrate that Community Tapestry enhanced participants’ awareness of their community’s turnover and diversity state. It enabled them to identify areas for improvement and provided actionable insights to foster a more inclusive and stable community.
Conclusion: Community Tapestry offers OSS project leaders an actionable approach to monitor turnover and diversity state, enabling data-driven governance and fostering more inclusive and sustainable communities. Our PD approach provides practical insights into how community-driven interventions can be developed and adopted.
Citations: 0
Gamified or Glorified? A systematic review of serious games for security & privacy in the SDLC
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-08-19, DOI: 10.1016/j.infsof.2025.107850
Jonah Bellemans, Dimitri Van Landuyt, Laurens Sion, Lieven Desmet
Volume 188, Article 107850
Abstract
Context: While security and privacy are playing increasingly important roles in the software development process, the skill shortage for security and privacy keeps growing. To address this, academia and industry alike have proposed game-based approaches to foster the involvement of non-expert stakeholders and to improve collaboration among the various parties involved. However, research has shown that injudicious implementation of gamification can do more harm than good. Basing the game design on an existing, established methodology is crucial to accomplish the intended goals of the game.
Objective: This paper identifies and compares the different serious games in the space of security and privacy engineering. It highlights the differences between games in goals, intent, form, and approach, and pays particular attention to (1) the specific motivations behind the selected gameful design elements, and (2) the scientific evidence of the benefits of these game-based approaches.
Method: We perform a widely scoped discovery search to establish a dataset of relevant serious games. For each of the twelve games found, we collect and study the game artifacts, covered CyBOK knowledge domains, materials, research articles, and practitioner testimonials.
Results: Most games target a multi-stakeholder audience of industry practitioners, typically with the goal of providing a first introduction to activities such as Requirements Engineering and Threat Modeling. The majority of games have been designed in an ad hoc manner, rather than being based on design frameworks or methodologies. Scientific evaluations of these games mostly focus on obtaining participant feedback, experiences, and opinions, rather than evaluating the actual outcomes of applying the game.
Conclusions: While game-based approaches for security and privacy in the SDLC are showing promise, many of them have not been designed with proven serious game design frameworks or methodologies. Further empirical evidence is required to confirm the effectiveness of these games.
Citations: 0
SemiSMAC: A semi-supervised framework for log anomaly detection with automated hyperparameter tuning
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-08-16, DOI: 10.1016/j.infsof.2025.107869
Yicheng Sun, Jacky Wai Keung, Zhen Yang, Shuo Liu, Yihan Liao
Volume 187, Article 107869
Abstract
Context: Logs generated during software operations are critical for system reliability and anomaly detection. However, their diversity, the scarcity of labeled data, and hyperparameter tuning challenges hinder traditional detection methods.
Objective: This paper presents SemiSMAC, a novel semi-supervised framework that leverages a large language model for log parsing and grouping, combined with Sequential Model-based Algorithm Configuration (SMAC) for hyperparameter optimization, to enhance anomaly detection.
Method: In this work, we leverage ChatGPT for log parsing and introduce a novel log grouping approach. This grouping process requires only a small number of labeled samples, which ChatGPT uses to generate pseudo-labels for the remaining data, thereby expanding the training set. Furthermore, SemiSMAC uses SMAC to automatically optimize the hyperparameters of the embedded models. This integration leads to consistent performance improvements, particularly in resource-constrained environments.
Results: SemiSMAC-LSTM, which uses LSTM as the backbone of the SemiSMAC framework, demonstrates superior performance in experiments on four widely used datasets. It outperforms six benchmark models, including three supervised learning models. In low-resource scenarios, SemiSMAC-LSTM exhibits exceptional robustness, showcasing its effectiveness in handling challenging detection tasks.
Conclusion: SemiSMAC demonstrates its potential to revolutionize anomaly detection in both large-scale and low-resource datasets. Its ability to deliver outstanding performance makes it a valuable tool for scalable and automated anomaly detection in real-world applications, paving the way for more reliable and scalable software engineering practices.
Citations: 0
POSVIA: Inconsistency analyzer for open-source Proof-of-Concept reports
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-08-16, DOI: 10.1016/j.infsof.2025.107868
Lingyan Ding, Xingya Wang, Zhenyu Chen, Song Huang
Volume 188, Article 107868
Abstract
Context: Proof-of-Concept (PoC) reports are indispensable for evaluating the exploitability of vulnerabilities. Various PoC data sources are responsible for collecting and sharing these reports. We have identified inconsistencies in the information pertaining to affected software versions across these data sources. These inconsistencies serve as red flags, alerting security experts to exercise caution during exploitability assessments and ensuring the effective allocation of resources.
Objective: This paper analyzes software version inconsistencies in PoC reports and proposes “POSVIA” (PoC Oriented Software Version Inconsistency Analyzer), a deep learning tool designed to automatically detect and evaluate these inconsistencies across multiple PoC data sources, overcoming the impracticality of manual detection.
Methods: A Named Entity Recognition (NER) model was developed to extract CVE IDs, affected software names, and version data from PoC reports; it achieves 93.76% precision and 93.48% recall. Additionally, a Relation Extraction (RE) model was designed to identify relationships between software and versions; it achieves 95.04% precision and 96.40% recall. These models analyzed 173,239 PoC reports from four data sources, and “POSVIA” assessed their version inconsistencies.
Results: Analysis revealed that Openwall had the lowest strict match rate (32.75%) for affected software versions, compared to other sources. The strict match rate for verified software versions ranged from 60.00% to 78.16%, indicating substantial inconsistencies. Over time, the match rate fluctuated, improving when using ExploitDB, Packet Storm Security, and CXSecurity as benchmarks. Openwall’s rate remained low, suggesting it should be considered alongside other sources for vulnerability exploitability assessments.
Conclusion: This study introduces an automated tool named “POSVIA”, which is designed to address the challenge of detecting inconsistencies in software versions within PoC reports. By automating inconsistency detection across multiple data sources, POSVIA overcomes the limitations of manual methods and enhances the accuracy of exploitability assessments. This approach provides critical support for improving software security and resource allocation.
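To make the idea of a strict match rate between PoC sources concrete, the following is a minimal sketch. It assumes that a "strict match" means two sources list exactly the same set of affected versions for a CVE; the abstract does not define the term, so that definition, the function name `strict_match_rate`, and the sample records are all assumptions for illustration, not POSVIA's actual procedure or data.

```python
from typing import Dict, Set

# Hypothetical record shape: CVE ID -> set of affected version strings.
Source = Dict[str, Set[str]]


def strict_match_rate(candidate: Source, benchmark: Source) -> float:
    """Fraction of shared CVE IDs whose version sets are identical.

    Assumption: "strict match" means exact set equality of versions.
    """
    shared = candidate.keys() & benchmark.keys()
    if not shared:
        return 0.0
    matches = sum(1 for cve in shared if candidate[cve] == benchmark[cve])
    return matches / len(shared)


# Made-up example data, not taken from any real PoC source.
source_a = {"CVE-0000-0001": {"1.0", "1.1"}, "CVE-0000-0002": {"2.0"}}
source_b = {"CVE-0000-0001": {"1.0", "1.1"}, "CVE-0000-0002": {"2.0", "2.1"}}
rate = strict_match_rate(source_a, source_b)
```

Under this assumed definition only the first CVE matches, so half of the shared entries agree. A looser definition, such as version-set overlap, would score the second CVE as a partial match instead.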
Citations: 0
Naming the Pain in machine learning-enabled systems engineering
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-08-10, DOI: 10.1016/j.infsof.2025.107866
Marcos Kalinowski, Daniel Mendez, Görkem Giray, Antonio Pedro Santos Alves, Kelly Azevedo, Tatiana Escovedo, Hugo Villamizar, Helio Lopes, Teresa Baldassarre, Stefan Wagner, Stefan Biffl, Jürgen Musil, Michael Felderer, Niklas Lavesson, Tony Gorschek
Volume 187, Article 107866
Abstract
Context: Machine learning (ML)-enabled systems are being increasingly adopted by companies aiming to enhance their products and operational processes.
Objective: This paper aims to deliver a comprehensive overview of the status quo of engineering ML-enabled systems and lay the foundation to steer practically relevant and problem-driven academic research.
Method: We conducted an international survey to collect insights from practitioners on the current practices and problems in engineering ML-enabled systems. We received 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrapping with confidence intervals and qualitative analyses on the reported problems using open and axial coding procedures.
Results: Our survey results reinforce and extend existing empirical evidence on engineering ML-enabled systems, providing additional insights into typical ML-enabled systems project contexts, the perceived relevance and complexity of ML life cycle phases, and current practices related to problem understanding, model deployment, and model monitoring. Furthermore, the qualitative analysis provides a detailed map of the problems practitioners face within each ML life cycle phase and the problems causing overall project failure.
Conclusions: The results contribute to a better understanding of the status quo and problems in practical environments. We advocate for the further adaptation and dissemination of software engineering practices to enhance the engineering of ML-enabled systems.
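The bootstrapping with confidence intervals mentioned in the method can be illustrated with a percentile bootstrap for a survey proportion. This is a generic sketch, not the authors' analysis script: the function name `bootstrap_ci`, the resample count, the seed, and the 120/68 split of answers are assumptions (only the total of 188 responses comes from the abstract).

```python
import random
from typing import List, Tuple


def bootstrap_ci(sample: List[int], n_resamples: int = 2000,
                 confidence: float = 0.95, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the sample mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(sample, k=n)) / n for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence) / 2.0
    lo_idx = int(alpha * n_resamples)
    hi_idx = int((1.0 - alpha) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Made-up answers: 1 = respondent reports using a practice, 0 = does not.
responses = [1] * 120 + [0] * 68
low, high = bootstrap_ci(responses)
```

The interval `(low, high)` brackets the observed proportion 120/188. The percentile method is the simplest bootstrap variant; other variants such as BCa adjust for bias and skew.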
Citations: 0
Software Defect Prediction evaluation: New metrics based on the ROC curve
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-08-09, DOI: 10.1016/j.infsof.2025.107865
Luigi Lavazza, Sandro Morasca, Gabriele Rotoloni
Volume 187, Article 107865
Abstract
Context: ROC (Receiver Operating Characteristic) curves are widely used to represent how well fault-proneness models (e.g., probability models) classify software modules as faulty or non-faulty. AUC, the Area Under the ROC Curve, is usually used to quantify the overall discriminating power of a fault-proneness model. Alternative indicators proposed, e.g., RRA (Ratio of Relevant Areas), consider the area under a portion of a ROC curve. Each point of a ROC curve represents a binary classifier, obtained by setting a specified threshold on the fault-proneness model. Several performance metrics (Precision, Recall, the F-score, etc.) are used to assess a binary classifier.
Objectives: We investigate the relationships linking “under the ROC curve area” indicators such as AUC and RRA to performance metrics.
Methods: We study these relationships analytically. We introduce iso-PM ROC curves, whose points have the same value $\overline{PM}$ for a given performance metric PM. When evaluating a ROC curve, we identify the iso-PM curve with the same value of AUC or RRA. Its $\overline{PM}$ can be seen as a property of the ROC curve and fault-proneness model under evaluation.
Results: There is an S-shaped relationship between $\overline{PM}$ and AUC for performance metrics that do not depend on the proportion $\rho$ of faulty modules, i.e., dataset balancedness. $\phi$ (Matthews Correlation Coefficient) depends on $\rho$: with very imbalanced datasets, AUC appears over-optimistic and $\phi$ over-pessimistic. RRA defines the region of interest in terms of $\rho$, so all performance metrics depend on $\rho$. RRA is related to performance metrics via S-shaped curves.
Conclusion: Our proposal helps gain a better quantitative understanding of the goodness of a ROC curve, especially in practically relevant regions of interest. Also, showing a ROC curve and iso-PM curves provides an intuitive perception of the goodness of a fault-proneness model.
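The dependence of some performance metrics on the proportion of faulty modules can be checked numerically. The sketch below uses the standard definitions of Precision, Recall, F1, and the Matthews Correlation Coefficient, evaluated at a single ROC point (false positive rate, true positive rate) for a given proportion `rho` of faulty modules. The function name `metrics_at_roc_point` and the example values are assumptions for illustration; the paper's iso-PM construction is not reproduced here.

```python
import math
from typing import Dict


def metrics_at_roc_point(fpr: float, tpr: float, rho: float) -> Dict[str, float]:
    """Binary-classifier metrics at one ROC point, for faulty proportion rho."""
    # Confusion-matrix cells expressed as fractions of the whole dataset.
    tp = rho * tpr
    fn = rho * (1.0 - tpr)
    fp = (1.0 - rho) * fpr
    tn = (1.0 - rho) * (1.0 - fpr)

    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tpr  # recall equals the true positive rate, independent of rho
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    phi = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "phi": phi}


# Same ROC point, two different dataset balances (illustrative values).
balanced = metrics_at_roc_point(fpr=0.1, tpr=0.8, rho=0.5)
imbalanced = metrics_at_roc_point(fpr=0.1, tpr=0.8, rho=0.01)
```

At this ROC point Recall is 0.8 in both cases, while the Matthews coefficient falls from roughly 0.70 on the balanced dataset to roughly 0.23 on the imbalanced one. This matches the abstract's observation that the coefficient depends on the proportion of faulty modules.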
Citations: 0
FVulPri: Fine-grained vulnerability prioritization based on BERT-BGRU and multiple indicators
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-08-07, DOI: 10.1016/j.infsof.2025.107853
Sixuan Wang, Dongjin Yu, Xiongjie Liang, Chen Huang
Volume 187, Article 107853
Abstract
Introduction: Extensive efforts have been made to mitigate the impact of software vulnerabilities on information security. Researchers aim to prioritize vulnerabilities after they are disclosed and then take remediation actions. However, existing methods suffer from a low degree of automation, coarse granularity, and insufficient scoring indicators.
Objectives: This paper aims to provide a new approach to vulnerability prioritization that addresses these shortcomings with a more comprehensive evaluation system, improves the automation of the process, and provides fine-grained scoring.
Methods: In this paper, we propose FVulPri, a fine-grained vulnerability prioritization method that, for the first time, ranks software vulnerabilities at the function level. FVulPri employs the BERT-BGRU model to evaluate vulnerability severity, introduces a novel code learning approach to analyze vulnerability-related functions, and integrates multiple indicators to provide a comprehensive assessment.
Results: The experimental results show that FVulPri has a more reasonable score distribution than CVSS (Common Vulnerability Scoring System), achieves an average of 69.06% effectiveness on newly added function-level metrics, and produces rankings that align more strongly with expert assessments than those of CVSS, effectively enhancing the quality of vulnerability prioritization.
Conclusion: This paper presents a fine-grained vulnerability prioritization method that leverages BERT-BGRU and multiple indicators to assess 14 metrics across three dimensions, namely necessity, function level, and scope of impact, thereby improving the efficiency and quality of vulnerability prioritization.
Citations: 0
Metamorphic testing for textual and visual entailment: A unified framework for model evaluation and explanation
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-08-07, DOI: 10.1016/j.infsof.2025.107855
Mingyue Jiang, Bintao Hu, Xiao-Yi Zhang
Volume 187, Article 107855
Abstract
Context: Textual entailment (TE) and visual entailment (VE) serve as the basis for a broad spectrum of tasks in natural language processing and vision–language modeling. However, although extensively studied, both TE and VE models exhibit several quality issues. Additionally, their black-box nature hampers the understanding of their behaviors, making it unclear why a model fails to correctly predict entailment relationships. Consequently, there is a pressing need for methods that can effectively evaluate and explain TE and VE models.
Objective: This study aims to develop a unified approach for detecting and interpreting failures in both TE and VE models.
Methods: We propose a metamorphic testing-based approach for evaluating and explaining both TE and VE models. The central aspect of our approach lies in three proposed metamorphic relations, which are generic to both TE and VE and also preserve specific associations among relevant inputs. The proposed approach conducts metamorphic testing to detect failures in TE and VE models. When a failure is revealed, it further performs a post-hoc analysis within the relevant group of inputs to identify information that is critical for the detected failure.
Results: Experimental results demonstrate the effectiveness of the proposed approach in failure detection and also confirm its potential to provide useful information to pinpoint the root causes of detected failures.
Conclusion: This study presents a general metamorphic testing approach for both TE and VE. It also demonstrates that, with specifically designed metamorphic relations, metamorphic testing can serve as an effective basis for model explanation.
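How a metamorphic relation can expose a failure without a labeled oracle is sketched below for a toy textual entailment model. The relation used here (adding a conjunct to the premise must not turn an "entails" verdict into "does not entail") is a generic illustration and is not claimed to be one of the paper's three relations. The models `flawed_entails` and `sound_entails` and the helper `mr_holds` are hypothetical.

```python
from typing import Callable

# A model maps (premise, hypothesis) to True when it predicts entailment.
Model = Callable[[str, str], bool]


def flawed_entails(premise: str, hypothesis: str) -> bool:
    """Toy model with a deliberate flaw: its score is diluted by premise length."""
    p = premise.lower().split()
    h = hypothesis.lower().split()
    overlap = sum(1 for word in h if word in p)
    return overlap / len(p) >= 0.5


def sound_entails(premise: str, hypothesis: str) -> bool:
    """Toy model that predicts entailment when every hypothesis word is in the premise."""
    p = premise.lower().split()
    return all(word in p for word in hypothesis.lower().split())


def mr_holds(model: Model, premise: str, hypothesis: str, extra: str) -> bool:
    """Illustrative relation: extending the premise must preserve entailment."""
    source_output = model(premise, hypothesis)
    follow_up_output = model(premise + " and " + extra, hypothesis)
    # The relation only constrains cases where the source output is entailment.
    return (not source_output) or follow_up_output


flawed_ok = mr_holds(flawed_entails, "a dog runs fast", "a dog runs", "the sky is blue")
sound_ok = mr_holds(sound_entails, "a dog runs fast", "a dog runs", "the sky is blue")
```

Here `flawed_ok` is `False`: the flawed model flips its verdict once the premise grows, so the relation is violated and a failure is detected without knowing the correct label. Comparing the source and follow-up inputs then points at premise length as the factor behind the failure, which is the kind of post-hoc analysis the abstract describes.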
Citations: 0
CtrlFuzz: A controllable diffusion-based fuzz testing for deep neural networks via coverage-aware manifold guidance
IF 4.3, Zone 2, Computer Science
Information and Software Technology, Pub Date: 2025-07-30, DOI: 10.1016/j.infsof.2025.107856
Aoshuang Ye, Shilin Zhang, Runze Yan, Jianpeng Ke, Fei Zhu, Benxiao Tang
Volume 187, Article 107856
Abstract
Context: Deep neural networks (DNNs) have been extensively deployed in safety-critical applications. Nevertheless, their inherent vulnerability to subtle input perturbations poses serious risks to the reliability of DNN-based systems. While mutation-based coverage-guided fuzzing (CGF) preserves the test oracle through deliberately limited perturbations, it struggles to obtain diverse and sparse test cases. Conversely, generation-based CGF is able to create more diverse test cases aligned with the data distribution but lacks precise controllability.
Objective: To refine the controllability and effectiveness of CGF in DNN testing, we aim to design a framework that is capable of generating realistic test cases with fine-grained control, while systematically exploring model vulnerabilities through a manifold-aware coverage criterion.
Method: In this paper, we propose CtrlFuzz, a manifold coverage-guided controllable diffusion framework for testing DNNs. CtrlFuzz leverages manifold learning to embed high-dimensional inputs into a lower-dimensional Euclidean space, preserving geometric structure. Based on this, we define a manifold coverage by quantifying the ratio between the distances from the seed and its non-adversarial counterparts to the class center. We further enhance testing controllability by performing semantic decomposition on seed inputs. A customized diffusion model based on the U-Net structure integrates manifold coverage and semantic constraints into the denoising process, which allows generated test cases to remain semantically natural while covering vulnerable regions.
Results: Experimental results on four popular datasets and ten benchmark DNN architectures demonstrate that CtrlFuzz (1) effectively maintains the semantic coherence of generated test cases, (2) achieves improved exploration of vulnerable manifold regions compared to existing CGF techniques, and (3) discovers significantly more error-inducing inputs on multiple model types.
Conclusion: CtrlFuzz introduces a novel manifold-guided, diffusion-based fuzzing approach for controllable test case synthesis. By enhancing both manifold coverage and controllability in CGF, CtrlFuzz improves the thoroughness and effectiveness of DNN testing, which offers a promising direction for future robustness evaluation frameworks.
Citations: 0