Journal of Systems and Software: Latest Articles

Managing security issues in software containers: From practitioners’ perspective
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-09-09, DOI: 10.1016/j.jss.2025.112616
Maha Sroor, Rahul Mohanani, Ricardo Colomo-Palacios, Sandun Dasanayake, Tommi Mikkonen
Abstract: Software development industries are increasingly adopting containers to enhance the scalability and flexibility of applications. Security in containerized projects is a critical challenge that can lead to data breaches and performance degradation, thereby directly affecting the reliability and operations of the container services. Despite the ongoing effort in SE research to manage security issues in containerized projects, more investigations are needed to explore the human perspective of security management in such projects. This research examines how SE practitioners manage security issues in containerized projects. A clear understanding of security management in containerized projects will enable industries to develop robust security strategies that enhance software reliability and trust. To achieve this, we conducted two semi-structured interview studies to examine how practitioners approach security management. The first study focused on practitioners’ perceptions of security challenges in containerized environments, where we interviewed 15 participants between December 2022 and October 2023. The second study explored how to address security issues, with 20 participants interviewed between October 2024 and December 2024. Data analysis reveals how SE practitioners address the various security challenges in containerized projects. Our analysis also identified the technical and non-technical enablers that can be utilized to enhance security in containerized projects. Overall, we propose a conceptual model that visualizes how practitioners manage security issues in containerized projects. We argue that our proposed model will guide practitioners in making informed decisions to plan, develop, and deploy secure container systems.
Volume 231, Article 112616
Citations: 0
Enhancing reliability in LLM-integrated robotic systems: A unified approach to security and safety
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-09-08, DOI: 10.1016/j.jss.2025.112614
Wenxiao Zhang, Xiangrui Kong, Conan Dewitt, Thomas Bräunl, Jin B. Hong
Abstract: Integrating Large Language Models (LLMs) into robotic systems has revolutionised embodied artificial intelligence, enabling advanced decision-making and adaptability. However, ensuring reliability, which encompasses both security against adversarial attacks and safety in complex environments, remains a critical challenge. To address this, we propose a unified framework that mitigates prompt injection attacks while enforcing operational safety through robust validation mechanisms. Our approach combines prompt assembling, state management, and safety validation, evaluated using both performance and security metrics. Experiments show a 30.8% improvement under injection attacks and up to a 325% improvement in complex environment settings under adversarial conditions compared to baseline scenarios. This work bridges the gap between safety and security in LLM-based robotic systems, offering actionable insights for deploying reliable LLM-integrated mobile robots in real-world settings. The framework is open-sourced with simulation and physical deployment demos at https://llmeyesim.vercel.app/.
Volume 231, Article 112614
Citations: 0
HarmonyCAS: A model-driven framework for facilitating interoperability in context-aware systems
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-09-05, DOI: 10.1016/j.jss.2025.112611
Hamed Barangi, Shekoufeh Kolahdouz Rahimi, Bahman Zamani, Hossein Moradi
Abstract: The growing adoption of Context-Aware Systems (CAS), driven by advancements in ubiquitous computing and the Internet of Things (IoT), has heightened the need for seamless context sharing across heterogeneous domains. However, achieving interoperability remains a significant challenge, particularly in addressing syntactic (e.g., data format conflicts) and semantic (e.g., differing conceptual definitions) issues. This paper introduces HarmonyCAS, a model-driven framework designed to simplify the development of publish/subscribe middleware and enhance CAS interoperability. HarmonyCAS incorporates a Domain-Specific Language (DSL) for modeling publish/subscribe middleware components, CAS, and syntactic/semantic mappings, along with a code generation tool that automatically generates middleware code. Evaluation results demonstrate that HarmonyCAS delivers robust performance, achieving a 0% error rate at 2000 messages and maintaining scalable responsiveness with an average response time of 49 ms at 5000 messages. Additionally, usability surveys indicate high user satisfaction, with mean scores exceeding 4 out of 5. These findings confirm the framework’s effectiveness in facilitating seamless context exchange while meeting key interoperability requirements.
Volume 231, Article 112611
Citations: 0
Model-based dependability and performance analysis for satellite systems with collaborative maintenance maneuvers via stochastic games
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-09-05, DOI: 10.1016/j.jss.2025.112610
Abdelhakim Baouya, Brahim Hamid, Otmane Ait Mohamed, Saddek Bensalem
Abstract: GPS Standard Positioning Service (SPS) relies on orbiting satellites to provide accurate time, location, and altitude information under all weather conditions, day or night, anywhere in the world. The lifespan of these satellites can vary depending on the specific version and the dependability reference parameters. Engineers rely on dependability references to assess Reliability, Availability, and Maintainability (RAM) during the design phase, to maximize a satellite’s Mean Time Between Failures (MTBF). Furthermore, integrity and continuity are performance metrics as they directly impact the RAM properties and trustworthiness of the positioning, navigation, and services provided to users. This paper proposes a formal and parametrizable model based on concurrent stochastic games (CSG) to represent satellite systems with collaborative maintenance maneuvers. The model incorporates formal specifications of dependability and performance, expressed in rPATL. Model parameters are derived from SPS standard characteristics established by the Space Operations Squadron to ensure the health and status of the operational constellation. Through the PRISM-games model checker, we conduct a quantitative analysis of collaborative behaviors between players in orbit and on the ground. We demonstrate the advantages of the CSG model through previous experiences and attempts to achieve efficient maintenance. Our findings shed light on the trade-offs in maintenance operations distributed between ground-based and orbital-based maintainers.
Volume 231, Article 112610
Citations: 0
Syntactic multilingual probing of pre-trained language models of code
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-09-03, DOI: 10.1016/j.jss.2025.112604
José Antonio Hernández López, Martin Weyssow, Jesús Sánchez Cuadrado, Houari Sahraoui
Abstract: Pre-trained language models (PLMs) have demonstrated remarkable abilities in coding tasks, establishing themselves as a state-of-the-art technique in machine learning for code. However, due to their deep neural network-based structure, PLMs function as black-box systems, making it crucial to understand the types of information they actually learn. Recent studies indicate that PLMs possess cross-lingual capabilities, allowing them to generalize to unseen programming languages and outperform monolingual models when trained in a multilingual setting. Nonetheless, the reasons behind these cross-lingual abilities remain largely unexplored.
In this paper, we explore this phenomenon through a syntactic perspective. Specifically, we build on our prior work, the AST-Probe, a probing methodology that evaluates whether a PLM encodes the complete grammatical structure of a programming language. This probe identifies a syntactic subspace within the PLM’s vector representations, which is then used to reconstruct ASTs. We extend this approach in two ways. First, we conducted experiments on eight programming languages and eight PLMs and found that: (1) this syntactic structure can be extracted in all cases, (2) CodeBERT and GraphCodeBERT excel at encoding ASTs, and (3) syntactic knowledge resides in the middle layers of all PLMs, with a distribution that is independent of the programming language. Second, we mathematically adapt the AST-Probe to a multilingual setting and apply it to CodeBERT. Our findings provide evidence that CodeBERT learns cross-lingual representations of programming language syntax.
Volume 231, Article 112604
Citations: 0
Defect prediction guided greybox fuzz testing
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-09-02, DOI: 10.1016/j.jss.2025.112609
Haochen Jin, Zhanqi Cui, Ruichen Zhang, Xiang Chen, Rongcun Wang, Xiulei Liu
Abstract: Fuzz testing is an established automated technique for detecting defects in software by generating test cases randomly or semi-randomly. The escalating complexity of software functionalities makes comprehensive testing more arduous. Research indicates that the distribution of defects in software often exhibits a clustering effect. By predicting the distribution of defects in software based on static properties of code, historical data, or other information about the software, defect prediction optimizes the allocation of testing resources, particularly improving the effectiveness of fuzz testing tools by focusing on testing modules with higher defect proneness. To this end, this paper introduces a greybox fuzz testing approach named DPFuzz, which integrates the advantages of defect prediction and fuzz testing to generate test cases with increased specificity for detecting defects. As an extension plugin for fuzz testing tools, DPFuzz can seamlessly integrate with various fuzzing tools. Experiments are conducted on 12 open-source programs under test, and the results demonstrate that DPFuzz effectively improves the performance of fuzz testing, as evidenced by an increase of up to 18.6% in the number of unique crashes triggered. The study also validates the correlation between defects detected by fuzz testing and defect prediction results, affirming the practical application value of defect prediction.
Volume 231, Article 112609
Citations: 0
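The idea in the DPFuzz entry above, spending more fuzzing effort on modules predicted to be defect-prone, can be illustrated with a small Python sketch. This is not the paper's algorithm: the function names `seed_energy` and `schedule`, the base budget of 10, and the "riskiest covered module" rule are all assumptions made for illustration only.

```python
from typing import Dict, List, Set


def seed_energy(covered: Set[str], defect_score: Dict[str, float], base: int = 10) -> int:
    """Return a mutation budget for one seed (hypothetical heuristic).

    covered: modules that the seed's execution reaches.
    defect_score: predicted defect-proneness per module, assumed to lie in [0, 1].
    """
    if not covered:
        return base
    # Use the riskiest covered module; modules without a prediction count as 0.0.
    top = max(defect_score.get(module, 0.0) for module in covered)
    return base + round(base * top)


def schedule(seeds: Dict[str, Set[str]], defect_score: Dict[str, float]) -> List[str]:
    """Order seed names so that seeds reaching defect-prone modules come first."""
    return sorted(seeds, key=lambda name: seed_energy(seeds[name], defect_score), reverse=True)
```

For instance, a seed that only reaches a module scored 0.9 would be scheduled ahead of one that only reaches a module scored 0.1. A real greybox fuzzer would combine such a score with coverage feedback rather than use it alone.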
A checklist of quality concerns for architecting ML-intensive systems
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-09-02, DOI: 10.1016/j.jss.2025.112612
Alessio Bucaioni, Rick Kazman, Patrizio Pelliccione
Abstract:
Background: Machine learning components are being deployed across nearly every business sector and their importance is continually growing. However, the engineering practices for building these systems remain poorly understood compared to those for conventional software systems.
Objective: This work provides practical guidance to support architects in designing and implementing machine learning-intensive systems, and identifies areas where there are gaps in understanding and achievement.
Method: Building on our prior research, we developed a checklist of quality concerns for architects of machine learning-intensive systems. This checklist was iteratively refined through expert interviews and subsequently validated in a workshop with experienced architects.
Results: The main result of this work is a comprehensive list of 40 checks, organized into two main categories and 16 subcategories. We also present the results of a workshop in which the importance and degree of achievement of each check were assessed by 25 practicing architects of ML-intensive systems.
Conclusion: The findings of this study contribute to a better understanding of the unique challenges of ML-intensive systems and offer initial guidance to practitioners and researchers on areas where future work should be directed. The findings of this study offer valuable support to architects in addressing the unique challenges of ML-intensive systems and provide guidance to practitioners and researchers in terms of where future work should be focused.
Volume 231, Article 112612
Citations: 0
An empirical study on the impact of change granularity in refactoring detection
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-09-01, DOI: 10.1016/j.jss.2025.112608
Lei Chen, Shinpei Hayashi
Abstract: Detecting refactorings in commit history is essential to improve the comprehension of code changes in code reviews, and to provide valuable information for empirical studies on software evolution. Techniques have been proposed to accurately detect refactorings at the granularity of a single commit. However, refactorings can be made over multiple commits because of their complexity or other practical development problems, so detection at the granularity of a single commit alone is not enough. We observe that some refactorings can only be detected at a coarser granularity, i.e., in changes conducted over multiple commits, while others can be detected at the granularity of a single commit but not at a coarser one. We call these two types of refactorings coarse-grained refactorings (CGRs) and ephemeral refactorings (EPRs), respectively. We investigated the features and causes of CGRs and EPRs through an empirical study of 32 open-source Java projects and found that both commonly occur during development. In addition, we found that refactoring types related to splitting or merging classes and packages, as well as those involving modifications to the inheritance structure, tend to be CGRs, and types targeting small objects such as variables and attributes, and refactorings with context-sensitive detection criteria tend to be EPRs. The causes of CGRs and EPRs are analyzed and categorized, and the relationships between CGRs and their commit messages are also assessed. We found that about 20% of commit messages explicitly suggest the existence of CGRs. We suggest that CGRs and EPRs be valued in refactoring research and that detectors be extended to identify CGRs.
Editor’s note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
Volume 231, Article 112608
Citations: 0
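The CGR and EPR definitions in the entry above reduce to two set differences once detection results are available at both granularities. The Python sketch below assumes that some refactoring detector has already produced one set of refactoring labels per commit and one set for the same commits squashed into a single change. The function name `classify` and the string labels are hypothetical, and the sketch does not reproduce the paper's tooling.

```python
from typing import Iterable, Set, Tuple


def classify(per_commit: Iterable[Set[str]], squashed: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Split detected refactorings into coarse-grained (CGR) and ephemeral (EPR).

    per_commit: refactorings detected in each individual commit.
    squashed: refactorings detected when the same commits are treated as one change.
    """
    # Everything visible at single-commit granularity.
    fine = set().union(*per_commit)
    cgrs = squashed - fine  # only visible at the coarser granularity
    eprs = fine - squashed  # visible per commit, gone after squashing
    return cgrs, eprs
```

With per-commit results `[{"Rename Variable x"}, {"Extract Method f"}]` and squashed results `{"Extract Method f", "Move Class A"}`, the sketch reports "Move Class A" as a CGR and "Rename Variable x" as an EPR.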
Preparing an R package for open-source contributions: An experience report on the World Wildlife Fund’s Forest Foresight
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-08-29, DOI: 10.1016/j.jss.2025.112597
Amin Bakhshi, Hasrul Maruf, Maas van Apeldoorn, Zillah Calle, Jonas van Duijvenbode, Ismay Wolff, Yanja Dajsuren, Jacob Krüger
Abstract: Deforestation (i.e., the removal or destruction of forests by humans), particularly illegal, is a major cause of ecological and environmental problems. To combat illegal deforestation, the World Wildlife Fund (WWF) has developed an open-source R package to predict deforestation around the world using machine learning. The package has been used by and customized to various countries, providing immense value for environmental protection. However, the package was implemented by domain experts without software engineering background, resulting in an unstructured development process, a monolithic codebase, and a lack of documentation on processes and code. Aiming to build an open-source community to improve and maintain the package, the WWF team decided to focus on enhancing the accessibility and attractiveness of the codebase for newcomers. Supporting this goal, we conducted an action-research-like project using Scrum to improve the code quality, tooling, testing, processes, and documentation while also establishing practices to sustain and build upon these improvements. In this article, we describe this project and share our insights into opening an R package to make it more accessible for external open-source contributors. Our insights include guidance on communicating design decisions to domain experts without a software engineering background and on how to train them in software engineering practices. Further insights highlight the specific challenges of working with R packages. Lastly, our work showcases the contributions that software engineering can make to support environmental protection and can guide future projects in this direction.
Volume 231, Article 112597
Citations: 0
FIN: Boosting binary code embedding by normalizing function inlinings
IF 4.1, Zone 2, Computer Science
Journal of Systems and Software, Pub Date: 2025-08-29, DOI: 10.1016/j.jss.2025.112603
Mohammadhossein Amouei, Benjamin C.M. Fung, Philippe Charland
Abstract: Binary code similarity detection (BCSD) is essential for identifying similar code sections across different programs, regardless of their source languages, compilation options, or underlying architectures. It plays a crucial role in areas such as code plagiarism detection, malware analysis, and vulnerability discovery. However, BCSD faces significant challenges due to compiler optimizations, such as function inlining, which alter the binary structure. Existing rule-based function control flow graph (CFG) expansion strategies have limited success, due to low precision and recall in identifying inlined call sites. In this study, we present a detailed investigation of function inlining and propose an AI-driven solution to expand CFGs, offering improvements for BCSD approaches. We designed a set of features for a machine learning algorithm to identify functions at O0 and O1 optimizations that may be inlined at the higher optimizations O2 and O3, without prior knowledge of the optimization level. By utilizing this information to expand function CFGs, we observed significant enhancements in the performance of state-of-the-art binary code representation learning techniques. Experimental results show that our proposed method increases the effectiveness of representation learning approaches by up to 21.54%. Additionally, our experiments show that our proposed method can improve the true positive rate in identifying known vulnerabilities.
Volume 231, Article 112603
Citations: 0