Journal of Systems and Software: Latest Articles

Variational Prefix Tuning for diverse and accurate code summarization using pre-trained language models
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-21 DOI: 10.1016/j.jss.2025.112493
Junda Zhao, Yuliang Song, Eldan Cohen
Abstract: Recent advancements in source code summarization have leveraged transformer-based pre-trained models, including Large Language Models of Code (LLMCs), to automate and improve the generation of code summaries. However, existing methods often focus on generating a single high-quality summary for a given source code, neglecting scenarios where the generated summary might be inadequate and alternative options are needed. In this paper, we introduce Variational Prefix Tuning (VPT), a novel approach that enhances pre-trained models' ability to generate diverse yet accurate sets of summaries, allowing the user to choose the most suitable one for the given source code. Our method integrates a Conditional Variational Autoencoder (CVAE) framework as a modular component into pre-trained models, enabling us to model the distribution of observed target summaries and sample continuous embeddings to be used as prefixes to steer the generation of diverse outputs during decoding. Importantly, we construct our method in a parameter-efficient manner, eliminating the need for expensive model retraining, especially when using LLMCs. Furthermore, we employ a bi-criteria reranking method to select a subset of generated summaries, optimizing both the diversity and the accuracy of the options presented to users. We present extensive experimental evaluations using widely used datasets and current state-of-the-art pre-trained code summarization models to demonstrate the effectiveness of our approach and its adaptability across models.
Volume 229, Article 112493
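The abstract above mentions a bi-criteria reranking step but does not say how it is computed. As a rough illustration only, the sketch below uses a greedy selection that trades a per-candidate accuracy score against novelty relative to the summaries already chosen. The function names, the Jaccard token distance, the `trade_off` weight, and the example scores are all assumptions made for this sketch, not the authors' method.

```python
def jaccard_distance(a, b):
    """Token-level Jaccard distance between two summaries (assumed diversity proxy)."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return 1.0 - len(ta & tb) / len(union) if union else 0.0


def rerank(candidates, scores, k, trade_off=0.5):
    """Greedily pick k summaries, balancing accuracy score against novelty.

    `scores` holds one hypothetical accuracy estimate per candidate.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def gain(i):
            if not selected:
                # First pick: accuracy only, nothing to be diverse from yet.
                return scores[i]
            novelty = min(
                jaccard_distance(candidates[i], candidates[j]) for j in selected
            )
            return trade_off * scores[i] + (1 - trade_off) * novelty

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]


# Hypothetical candidate summaries and accuracy scores.
candidates = [
    "returns the sum of two numbers",
    "returns the sum of two values",
    "adds a and b",
]
scores = [0.9, 0.85, 0.6]
picked = rerank(candidates, scores, k=2)
```

With these inputs the second pick is the lower-scored but lexically different summary, which is the kind of behavior a diversity-aware reranker is meant to produce.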
Citations: 0
reAnalyst: Scalable annotation of reverse engineering activities
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-15 DOI: 10.1016/j.jss.2025.112492
Tab (Tianyi) Zhang, Claire Taylor, Bart Coppens, Waleed Mebane, Christian Collberg, Bjorn De Sutter
Abstract: This paper introduces reAnalyst, a framework designed to facilitate the study of reverse engineering (RE) practices through the semi-automated annotation of RE activities across various RE tools. By integrating tool-agnostic data collection of screenshots, keystrokes, active processes, and other types of data during RE experiments with semi-automated data analysis and generation of annotations, reAnalyst aims to overcome the limitations of traditional RE studies that rely heavily on manual data collection and subjective analysis. The framework enables more efficient data analysis, which will in turn allow researchers to explore the effectiveness of protection techniques and strategies used by reverse engineers more comprehensively and efficiently. Experimental evaluations validate the framework's capability to identify RE activities from a diverse range of screenshots with varied complexities. Observations on past experiments with our framework as well as a survey among reverse engineers provide further evidence of the acceptability and practicality of our approach.
Volume 229, Article 112492
Citations: 0
Dear researchers: Turning industry into a laboratory — The UnICo experience
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-13 DOI: 10.1016/j.jss.2025.112495
Andrea Capiluppi
Abstract: Academic researchers are also educators, and rightly so. Who better to teach than those at the forefront of their fields? Yet, dear researchers, there is a significant issue with this model: a disconnect from industrial realities. Industry often looks on with disbelief as you implement cutting-edge research using tools like the Eclipse IDE, an environment they abandoned years ago. To address this gap, we propose fostering academia-industry collaboration for capstone projects. By adopting an "Industry-as-a-Lab" approach, where real-world challenges guide research and education, you can equip students with industry-relevant skills and perhaps even gain new insights yourself. Using ASML as a case study, I illustrate how this model fosters meaningful learning, impactful research, and innovative solutions, effectively bridging the divide between academia and industry.
Volume 228, Article 112495
Citations: 0
Fault localization of AI-enabled cyber–physical systems by exploiting temporal neuron activation
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-12 DOI: 10.1016/j.jss.2025.112475
Deyun Lyu, Yi Li, Zhenya Zhang, Paolo Arcaini, Xiao-Yi Zhang, Fuyuki Ishikawa, Jianjun Zhao
Abstract: Modern cyber–physical systems (CPS) are evolving to integrate deep neural networks (DNNs) as controllers, leading to the emergence of AI-enabled CPSs. An inadequately trained DNN controller may produce incorrect control actions, exposing the system to safety risks. Therefore, it is crucial to localize the faulty neurons of the DNN controller responsible for the wrong decisions. However, since an unsafe system behavior typically arises from a sequence of control actions, establishing a connection between unsafe behaviors and faulty neurons is challenging. To address this problem, we propose Tactical, which localizes faults in an AI-enabled CPS by exploiting temporal neuron activation criteria that capture temporal aspects of the DNN controller inferences. Specifically, based on testing results, for each neuron, Tactical constructs a spectrum, which considers the specification satisfaction and the evolution of the activation status of the neuron during the system execution. Then, starting from the spectra of all the neurons, Tactical applies suspiciousness metrics to compute a suspiciousness score for each neuron, from which the most suspicious ones are selected. We assess Tactical configured with eight temporal neuron activation criteria, on 3504 faulty AI-enabled CPS benchmarks spanning over different domains. The results show the effectiveness of Tactical with respect to a baseline approach.
Volume 229, Article 112475
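The abstract above does not name the suspiciousness metrics Tactical applies. As a hedged illustration of the general spectrum-based idea, the sketch below scores neurons with Ochiai, a metric commonly used in spectrum-based fault localization. It simplifies each neuron's spectrum to two counts (failing and passing executions in which some activation criterion held for the neuron). The data layout, neuron names, and counts are assumptions for this sketch and do not reproduce the paper's temporal criteria.

```python
import math


def ochiai(act_fail, act_pass, total_fail):
    """Ochiai score: act_fail / sqrt(total_fail * (act_fail + act_pass))."""
    denom = math.sqrt(total_fail * (act_fail + act_pass))
    return act_fail / denom if denom else 0.0


def rank_neurons(spectra, total_fail):
    """Order neuron ids from most to least suspicious.

    `spectra` maps a neuron id to (act_fail, act_pass): how many failing and
    passing executions satisfied the (assumed) activation criterion.
    """
    scores = {
        neuron: ochiai(act_fail, act_pass, total_fail)
        for neuron, (act_fail, act_pass) in spectra.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical spectra from 4 failing and 4 passing executions.
spectra = {"n0": (4, 0), "n1": (2, 2), "n2": (0, 4)}
ranking = rank_neurons(spectra, total_fail=4)
```

A neuron that satisfies the criterion only in failing runs (`n0` here) ranks first, while one that does so only in passing runs ranks last.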
Citations: 0
Unboxing Default Argument Breaking Changes in 1 + 2 data science libraries
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-10 DOI: 10.1016/j.jss.2025.112460
João Eduardo Montandon, Luciana Lourdes Silva, Cristiano Politowski, Daniel Prates, Arthur de Brito Bonifácio, Ghizlane El Boussaidi
Abstract: Data Science (DS) has become a cornerstone for modern software, enabling data-driven decisions to improve companies' services. Following modern software development practices, data scientists use third-party libraries to support their tasks. As the APIs provided by these tools often require an extensive list of arguments to be set up, data scientists rely on default values to simplify their usage. It turns out that these default values can change over time, leading to a specific type of breaking change, defined as Default Argument Breaking Change (DABC). This work reveals 93 DABCs in three Python libraries frequently used in Data Science tasks (Scikit Learn, NumPy, and Pandas), studying their potential impact on more than 500K client applications. We find that the occurrence of DABCs varies significantly depending on the library; 35% of Scikit Learn clients are affected, while only 0.13% of NumPy clients are impacted. The main reason for introducing DABCs is to enhance API maintainability, but they often change the function's behavior. We discuss the importance of managing DABCs in third-party DS libraries and provide insights for developers to mitigate the potential impact of these changes in their applications.
Volume 229, Article 112460
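To make the notion of a Default Argument Breaking Change concrete, here is a minimal sketch using two versions of a made-up library function. `spread_v1` and `spread_v2` are hypothetical and not taken from any of the studied libraries; only the default value of `sample` differs between them. Passing the argument explicitly is shown as one possible mitigation, under the assumption that the parameter keeps its meaning across versions.

```python
import statistics


def spread_v1(values, sample=False):
    """Hypothetical library v1: population standard deviation by default."""
    return statistics.stdev(values) if sample else statistics.pstdev(values)


def spread_v2(values, sample=True):
    """Hypothetical library v2: same signature, but the default has flipped."""
    return statistics.stdev(values) if sample else statistics.pstdev(values)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Client code that relies on the default silently changes its result
# after upgrading from v1 to v2.
implicit_before = spread_v1(data)
implicit_after = spread_v2(data)

# Passing the argument explicitly keeps the behavior stable across versions.
explicit_before = spread_v1(data, sample=False)
explicit_after = spread_v2(data, sample=False)
```

The call sites are textually identical before and after the upgrade, which is why this kind of change is easy to miss in client code.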
Citations: 0
BinCoFer: Three-stage purification for effective C/C++ binary third-party library detection
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-10 DOI: 10.1016/j.jss.2025.112480
Yayi Zou, Yixiang Zhang, Guanghao Zhao, Yueming Wu, Shuhao Shen, Cai Fu
Abstract: Third-party libraries (TPLs) are becoming increasingly popular to achieve efficient and concise software development. However, unregulated use of TPLs will introduce legal and security issues in software development. Consequently, some studies have attempted to detect the reuse of TPLs in target programs by constructing a feature repository. Most of the works require access to the source code of TPLs, while the others suffer from redundancy in the repository, low detection efficiency, and difficulties in detecting partially referenced third-party libraries.
Therefore, we introduce BinCoFer, a tool designed for detecting TPLs reused in binary programs. We leverage the work of binary code similarity detection (BCSD) to extract binary-format TPL features, making it suitable for scenarios where the source code of TPLs is inaccessible. BinCoFer employs a novel three-stage purification strategy to mitigate feature repository redundancy by highlighting core functions and extracting function-level features, making it applicable to scenarios of partial reuse of TPLs. We have observed that directly using a similarity threshold to determine the reuse between two binary functions is inaccurate, a problem that previous work has not addressed. Thus we design a method that uses weights to aggregate the similarity between functions in the target binary and core functions to ultimately judge the reuse situation with high frequency. To examine the ability of BinCoFer, we compiled a dataset on ArchLinux and conducted comparative experiments on it with the four most closely related works (i.e., ModX, B2SFinder, LibAM and BinaryAI). Through the experimental results, we find that BinCoFer outperforms them by over 20.0% in precision and 7.0% in F1. As the data volume increases, we observe that the precision of BinCoFer tends to be stable and high. Moreover, BinCoFer greatly accelerates TPL detection, reducing the time cost of ModX by up to 99.7%.
Volume 229, Article 112480
Citations: 0
Text–image fusion template for large language model assisted crowdsourcing test aggregation
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-09 DOI: 10.1016/j.jss.2025.112478
Yunfeng Zhu, Shengcheng Yu, Zhaowei Zong, Yue Wang, Yuan Zhao, Zhenyu Chen
Abstract: Mobile crowdsourced testing leverages a varied group to enhance software quality through screenshots and text feedback. Examining the multitude of reports is tedious but crucial, often necessitating a combined analysis of both visual and textual information. However, professionals employ detailed judgment beyond mere similarity, which poses a challenge given the limited textual data and abundance of images in the reports.
We introduce a framework that guides large language models to handle missing data and inconsistencies in crowdsourced reports by using a triplet template ⟨Scene, Operation, Defect⟩ for bug identification. The framework leverages the element independence of the triplet for clustering ensemble and designs an algorithm to generate potential operation paths, aggregating reports within the cluster through constructed graphs. Our method, validated on 5115 reports, employs a clustering ensemble and graph aggregation, improving the clustering V-measure to 0.722. It also reduces the annotation time per report by 39.3%, thereby improving the quality of the tagging. Source code available at https://github.com/Boomnana/Text-Image-Fusion.
Volume 228, Article 112478
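For readers unfamiliar with the V-measure reported above, it is the harmonic mean of homogeneity (each cluster contains a single true class) and completeness (each true class falls in a single cluster). The sketch below computes it from scratch with the standard entropy-based definitions. The label lists are made-up examples, not data from the paper, and the helper names are chosen for this sketch.

```python
import math
from collections import Counter


def _entropy(labels):
    """Shannon entropy H(labels) using natural logarithms."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _cond_entropy(target, given):
    """Conditional entropy H(target | given)."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure(truth, pred):
    """Harmonic mean of homogeneity and completeness."""
    h_truth, h_pred = _entropy(truth), _entropy(pred)
    homogeneity = 1.0 if h_truth == 0 else 1.0 - _cond_entropy(truth, pred) / h_truth
    completeness = 1.0 if h_pred == 0 else 1.0 - _cond_entropy(pred, truth) / h_pred
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Hypothetical bug labels (truth) and cluster assignments (pred).
perfect = v_measure([0, 0, 1, 1], [1, 1, 0, 0])    # same grouping, renamed ids
collapsed = v_measure([0, 0, 1, 1], [0, 0, 0, 0])  # everything in one cluster
```

The score depends only on how items are grouped, not on the cluster ids, so a relabeled but otherwise identical clustering still scores as a perfect match.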
Citations: 0
Efficient adaptive test case selection for DNNs robustness enhancement
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-09 DOI: 10.1016/j.jss.2025.112451
Zhiyi Zhang, Huanze Meng, Yuchen Ding, Shuxian Chen, Yongming Yao
Abstract: Deep neural networks (DNNs) have been widely used in various fields, and testing for DNN-based software has become increasingly important. To discover potential faults in DNNs, a large number of test cases and their corresponding labels are required. However, labeling so many test cases is very costly. Although there have been many test case selection techniques for DNN models, these techniques still have problems such as high overhead, low efficiency, and poor diversity. To address these problems, this paper proposes EATS, an efficient adaptive test case selection method based on the principle of uniform distribution of test cases. Based on the idea of adaptive testing, EATS combines the uncertainty of the model and the diversity of faults to calculate the distance of test cases and sort them, then gives priority to test cases with a higher probability of causing faults. We conduct experiments on four popular datasets and four representative DNN models. Experimental results show that, compared with eight existing methods, EATS performs better in uniformity of test case distribution, diversity of errors found, model optimization, and optimization efficiency.
Volume 229, Article 112451
Citations: 0
Code beauty is in the eye of the beholder: Exploring the relation between code beauty and quality
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-08 DOI: 10.1016/j.jss.2025.112494
Theodoros Maikantis, Ilianna Natsiou, Christina Volioti, Elvira-Maria Arvanitou, Apostolos Ampatzoglou, Nikolaos Mittas, Alexander Chatzigeorgiou, Stelios Xinogalos
Abstract: Software artifacts and source code are often viewed as pure technical constructs aiming primarily at delivering specific functionality to the end users. However, almost each line of a computer program is the result of a software engineer's craftsmanship and thus reflects their skills and capabilities, but also their aesthetic view of how code should be written. Additionally, by nature, the code is not an artifact that is managed by a single person: the code is peer-reviewed, in some cases programmed in pairs, or maintained by different people. In this respect, the first impression of the quality of a piece of code is usually a matter of "reading" the "beauty" of the code and then diving into the details of the actual implementation. This "first-look" impression can psychologically bias software engineers, either positively or negatively, and affect their evaluation. In this article we propose a novel code beauty model (accompanied by metrics) and empirically explore: (a) if different software engineers perceive code beauty in the same way; (b) if the proposed code beauty metrics are correlated to the perceived code beauty by individual software engineers; and (c) if code beauty metrics are correlated to software maintainability. The results of the study suggest: (a) that code beauty is highly subjective and different software engineers perceive a code chunk as beautiful or not in an inconsistent way; (b) that some code beauty metrics can be considered as correlated to maintainability; and therefore, the "first-look" impression might to some extent be representative of the quality of the reviewed code chunk.
Volume 229, Article 112494
Citations: 0
Coding style matters: Scalable and efficient identification of memory management functions in monolithic firmware
IF 3.7, Zone 2, Computer Science
Journal of Systems and Software Pub Date : 2025-05-05 DOI: 10.1016/j.jss.2025.112472
Ruijie Cai, Zhaowei Zhang, Xiaoya Zhu, Yongguang Zhang, Xiaokang Yin, Shengli Liu
Abstract: The occurrence of memory corruption vulnerabilities is often closely associated with improper use or implementation of memory management functions. Monolithic firmware typically uses custom memory management functions and lacks information such as function names, which poses significant challenges for vulnerability detection. Therefore, it is crucial to identify memory management functions. Existing methods are rendered ineffective due to the absence of metadata, and the diversity in implementation across different firmware images further complicates the identification process. To address the above problem, we introduce MemIdent, a new method that leverages coding style to identify memory management functions. MemIdent is engineered to be scalable and efficient, capable of discerning consistent call features across various compiler optimizations and instruction architectures. It leverages three key observations derived from an in-depth analysis of monolithic firmware: the regularity in memory allocation calls, the co-occurrence of allocation and deallocation functions, and the statistical prominence of these features. MemIdent extracts call-site features such as function parameter types and return values using data flow analysis, which are then analyzed through statistical patterns to identify memory allocation and deallocation functions. We evaluate MemIdent's performance using 44 firmware images covering 6 vendors (i.e., Tenda, Cisco, SonicWall, D-Link, TP-Link, and Comtech) across 3 architectures (MIPS, ARM, and PPC). The experimental results demonstrate that MemIdent has higher accuracy, greater efficiency, and better generality than state-of-the-art (SOTA) approaches, including Heapster, IDA Lumina, and MLM, offering a significant advancement in memory management function identification methods for monolithic firmware.
Volume 228, Article 112472
Citations: 0