{"title":"Out of Control: Igniting SCADA investigations with an HMI forensics framework and the ignition forensics artifact carving tool (IFACT)","authors":"LaSean Salmon , Ibrahim Baggili","doi":"10.1016/j.fsidi.2025.301933","DOIUrl":"10.1016/j.fsidi.2025.301933","url":null,"abstract":"<div><div>In the modern industrial landscape, Programmable Logic Controllers (PLCs) and Supervisory Control and Data Acquisition (SCADA) systems serve as critical components in the automation and control of various industrial processes. While their widespread availability and overall efficiency are crucial, the increasing integration of these systems with networked environments has exposed them to a growing array of cyber threats. Meanwhile, the rapid growth and deployment of SCADA systems worldwide pose increasing challenges to managing their security effectively. We explore the value of HMI-focused digital forensics within SCADA environments, emphasizing the unique challenges in their evaluation and the information contained in digital artifacts. We present a comprehensive forensic analysis of Ignition: a popular SCADA software platform developed by Inductive Automation. We also develop a generic forensic analysis framework that can be used when conducting a forensic investigation on an HMI environment. Our investigative process is supported with the creation of IFACT: an HMI Forensic Analysis Tool created to streamline the process of parsing system information presented in Ignition HMI-sourced forensic data. The data recovered from memory, network, and disk forensic investigations provides insight into the state of the SCADA system, including tag and PLC utilization and configurations. 
Using IFACT, we investigate how long this data persists in volatile memory and how its lifetime is variable.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"53 ","pages":"Article 301933"},"PeriodicalIF":2.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144748996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bridging knowledge gaps in digital forensics using unsupervised explainable AI","authors":"Zainab Khalid , Farkhund Iqbal , Mohd Saqib","doi":"10.1016/j.fsidi.2025.301924","DOIUrl":"10.1016/j.fsidi.2025.301924","url":null,"abstract":"<div><div>Artificial Intelligence (AI) has found multi-faceted applications in critical sectors including Digital Forensics (DF) which also require eXplainability (XAI) as a non-negotiable for its applicability, such as admissibility of expert evidence in the court of law. The state-of-the-art XAI workflows focus more on utilizing XAI tools for supervised learning. This is in contrast to the fact that unsupervised learning may be practically more relevant in DF and other sectors that largely produce complex and unlabeled data continuously, in considerable volumes. This research study explores the challenges and utility of unsupervised learning-based XAI for DF's complex datasets. A memory forensics-based case scenario is implemented to detect anomalies and cluster obfuscated malware using the Isolation Forest, Autoencoder, K-means, DBSCAN, and Gaussian Mixture Model (GMM) unsupervised algorithms on three categorical levels. The CIC MalMemAnalysis-2022 dataset's binary, and multivariate (4, 16) categories are used as a reference to perform clustering. The anomaly detection and clustering results are evaluated using accuracy, confusion matrices and Adjusted Rand Index (ARI) and explained through Shapley Additive Explanations (SHAP), using force, waterfall, scatter, summary, and bar plots' local and global explanations. 
We also explore how some SHAP explanations may be used for dimensionality reduction.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"53 ","pages":"Article 301924"},"PeriodicalIF":2.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144749087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"If at first you don't succeed, trie, trie again: Correcting TLSH scalability claims for large-dataset malware forensics","authors":"Jordi Gonzalez","doi":"10.1016/j.fsidi.2025.301922","DOIUrl":"10.1016/j.fsidi.2025.301922","url":null,"abstract":"<div><div>Malware analysts use Trend Micro Locality-Sensitive Hashing (TLSH) for malware similarity computation, nearest-neighbor search, and related tasks like clustering and family classification. Although TLSH scales better than many alternatives, technical limitations have limited its application to larger datasets. Using the Lean 4 proof assistant, I formalized bounds on the properties of TLSH most relevant to its scalability and identified flaws in prior TLSH nearest-neighbor search algorithms. I leveraged these formal results to design correct acceleration structures for TLSH nearest-neighbor queries. On typical analyst workloads, these structures performed one to two orders of magnitude faster than the prior state-of-the-art, allowing analysts to use datasets at least an order of magnitude larger than what was previously feasible with the same computational resources. I make all code and data publicly available.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"53 ","pages":"Article 301922"},"PeriodicalIF":2.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144749083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing DFIR in orchestration Environments: Real-time forensic framework with eBPF for windows","authors":"Philgeun Jin , Namjun Kim , Doowon Jeong","doi":"10.1016/j.fsidi.2025.301923","DOIUrl":"10.1016/j.fsidi.2025.301923","url":null,"abstract":"<div><div>Digital forensic investigations in Windows orchestration environments face critical challenges, including the ephemeral nature of containers, dynamic scaling, and limited visibility into low-level system events. Traditional event log-based approaches often fail to capture essential kernel-level artifacts such as process creation, file I/O, and registry modifications. To overcome these limitations, this paper introduces a novel DFIR framework that leverages eBPF to enable real-time kernel-level monitoring in containerized environments. Building on Microsoft's Windows eBPF project, we developed custom eBPF extensions tailored for DFIR. Aligned with NIST SP 800-61 guidelines, the proposed framework integrates unified workflows for preparation, detection, containment, and recovery through a centralized management console. 
Through case studies of cryptocurrency mining, ransomware, and blue screen of death attacks, we demonstrate our framework's ability to identify malicious processes that traditional event log-based methods might miss, while confirming minimal system overhead and high compatibility with existing orchestration platforms.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"53 ","pages":"Article 301923"},"PeriodicalIF":2.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144749086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Your forensic AI-assistant, SERENA: Systematic extraction and reconstruction for enhanced A2P message forensics","authors":"Jieon Kim, Byeongchan Jeong, Seungeun Park, Sangjin Lee, Jungheum Park","doi":"10.1016/j.fsidi.2025.301931","DOIUrl":"10.1016/j.fsidi.2025.301931","url":null,"abstract":"<div><div>The integration of physical and online activities in today's hyper-connected world has blurred previously distinct boundaries. Online actions such as reservations, payments, and logins generate application-to-person (A2P) messages, which serve as valuable datasets for tracking user behavior. Although A2P messages from different service providers may vary in structure, the information within each message can be systematically normalized based on user behavior and service characteristics. However, traditional forensic tools have been unable to effectively identify and extract such forensically valuable information from these A2P messages. In this study, we leverage large language models (LLMs) combined with prompt engineering to analyze A2P messages from multiple service providers, addressing the limitations of existing forensic tools in extracting meaningful insights from unstructured or semi-structured text stored in messages and emails. 
The proposed methodology employs A2P messages to elaborately reconstruct user activity, enabling digital forensic investigations to identify case-relevant information with enhanced efficiency and accuracy.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"53 ","pages":"Article 301931"},"PeriodicalIF":2.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144748994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Bitcoin simulation model and address heuristic method","authors":"Yanan Gong, Kam Pui Chow, Siu Ming Yiu","doi":"10.1016/j.fsidi.2025.301935","DOIUrl":"10.1016/j.fsidi.2025.301935","url":null,"abstract":"<div><div>Cryptocurrency-related crimes are on the rise and have a wide-ranging impact across various areas. To effectively combat and prevent such crimes, cryptocurrency forensics, which relies on blockchain analysis, is essential. Despite advancements in Bitcoin de-anonymization techniques, several challenges persist. The absence of authentic data labels introduces uncertainty in de-anonymization results, especially in the context of address clustering. This issue is further compounded by the development of privacy-enhancing technologies that obscure address linkages, thus undermining the reliability of outcomes as forensic evidence. To address these limitations, this study focuses on Bitcoin blockchain analysis and the improvement of address clustering. Specifically, the work presents an enhanced simulation model designed to accurately simulate real Bitcoin transactions, offering a stable platform for evaluating address clustering algorithms that utilize transaction details, thereby facilitating the assessment of the admissibility of clustering results. Meanwhile, we introduce a new heuristic algorithm aimed at identifying one-time change addresses, with experimental results demonstrating that it achieves more precise clustering outcomes than existing heuristic methods. 
Furthermore, our blockchain analysis reveals overarching patterns and recent changes in the Bitcoin blockchain, particularly following the introduction of the BRC-20 token.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"53 ","pages":"Article 301935"},"PeriodicalIF":2.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144748998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ANOC: Automated NoSQL database carver","authors":"Mahfuzul I. Nissan , James Wagner , Alexander Rasin","doi":"10.1016/j.fsidi.2025.301929","DOIUrl":"10.1016/j.fsidi.2025.301929","url":null,"abstract":"<div><div>The increased use of NoSQL databases to store and manage data has led to a demand to include them in forensic investigations. Most NoSQL databases use diverse storage formats compared to file carving and relational database forensics. For example, some NoSQL databases manage key-value pairs using B-Trees, while others maintain hash tables or even binary protocols for serialization. Current research on NoSQL carving focuses on single-database solutions, making it impractical to develop individual carvers for every NoSQL system. This necessitates a generalized approach to forensic recovery, enabling the creation of a unified carver that can operate effectively across various NoSQL platforms.</div><div>In this research, we introduce Automated NoSQL Carver, <span>ANOC</span>, a novel tool designed to reconstruct database contents from raw database images without relying on the database API or logs. <span>ANOC</span> adapts to the unique storage characteristics of various NoSQL systems, utilizing byte-level reverse engineering to identify and parse data structures. By analyzing storage layouts algorithmically, <span>ANOC</span> identifies and reconstructs key-value pairs, hierarchical storage structures, and associated metadata across multiple NoSQL platforms.</div><div>Through extensive experimentation, we demonstrate <span>ANOC</span>'s ability to recover data from four representative key-value store NoSQL databases: Berkeley DB, ZODB, etcd, and LMDB. We explore <span>ANOC</span>'s limitations in environments where data is corrupted and RAM snapshots. 
Our findings establish the feasibility of a generalized carver capable of addressing the challenges posed by the diverse and evolving NoSQL ecosystem.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"53 ","pages":"Article 301929"},"PeriodicalIF":2.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144749092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging memory forensics to investigate and detect illegal 3D printing activities","authors":"Hala Ali , Andrew Case , Irfan Ahmed","doi":"10.1016/j.fsidi.2025.301925","DOIUrl":"10.1016/j.fsidi.2025.301925","url":null,"abstract":"<div><div>As 3D printing is widely adopted across critical sectors, malicious users exploit this technology to produce illegal tools for criminal activities. The increasing availability of affordable 3D printers and the limitations of current regulations highlight the urgent need for robust forensic capabilities. While existing research focuses on the physical forensics of printed objects, the digital aspects of 3D printing forensics remain underexplored, resulting in a significant investigative gap. This paper introduces <em>SliceSnap</em>, a novel memory forensics framework that analyzes the volatile memory of slicing software, which is essential for converting 3D models into printer-executable G-code instructions. Our investigation focuses on Ultimaker Cura, the most popular Python-based slicing tool. By leveraging the Python garbage collector and conducting structural analysis of its objects, <em>SliceSnap</em> can extract the mesh data of 3D models, G-code instructions, slicing settings, detailed 3D printer metadata, and logging information. Given the potential for slicing software compromises, our framework extends beyond artifact extraction to include the complementary analysis tool, <em>G-parser</em>. This tool detects malicious G-code manipulations by finding the discrepancies between the original settings and those extracted from the G-code. Evaluation results demonstrated the effectiveness of <em>SliceSnap</em> in recovering design files and G-code of various criminal tools, such as firearms and TSA master keys, with 100% accuracy, in addition to providing detailed information about the slicing software and 3D printer. 
The evaluation also analyzed the temporal persistence of memory artifacts across critical stages of Cura's lifecycle. Moreover, through <em>G-parser</em>, the framework successfully detected the G-code manipulations conducted by our novel attack vector that targets G-code during inter-process communication within the software. Implemented as Volatility 3 plugins, <em>SliceSnap</em> provides investigators with automated capabilities to detect 3D printing-related criminal activities.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"53 ","pages":"Article 301925"},"PeriodicalIF":2.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144749088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bytewise approximate matching: Evaluating common scenarios for executable files","authors":"Carlo Jakobs, Axel Mahr, Martin Lambertz, Mariia Rybalka, Daniel Plohmann","doi":"10.1016/j.fsidi.2025.301927","DOIUrl":"10.1016/j.fsidi.2025.301927","url":null,"abstract":"<div><div>This research explores the application of bytewise approximate matching algorithms on executable files, evaluating the effectiveness of ssdeep, sdhash, TLSH, and MRSHv2 across various scenarios, where approximate matching seems to be a natural tool to employ. Previous works already underlined that approximate matching is often used for tasks where the algorithms have not been thoroughly and systematically evaluated. Pagani et al. (2018), in particular, highlighted the shortcomings of previous research and tried to improve current knowledge about the applicability of approximate matching in the context of executable files by evaluating typical use cases. We extend their work by taking a closer look at further common scenarios that are not covered in their article. Specifically, we examine use cases such as different versions of the same software and comparisons between on-disk and in-memory representations of the same program, both for malicious and benign software.</div><div>Our findings reveal that the considered algorithms’ performance across all evaluated scenarios was generally unsatisfactory. Notably, they struggle with size-related and localized modifications introduced during the loading stage. Furthermore, executables with no functional similarity may be mismatched due to shared byte-level similarity caused by embedded resources or inherent to certain programming languages or runtime environments. 
Consequently, these algorithms should be used cautiously and regarded as assisting tools rather than reliable methods for indicating similarity between executable files, as both false positives and false negatives can occur, and users should be aware of them.</div><div>Moreover, while some of the unfavored results stem from design decisions, we observed unexpected behavior in some experiments that we could trace back to issues in the reference implementations of the algorithms. After fixing the implementations, the strange effects in our results indeed disappeared. It is still an open question if and to what extent previous experiments and evaluations were affected by these issues.</div></div>","PeriodicalId":48481,"journal":{"name":"Forensic Science International-Digital Investigation","volume":"53 ","pages":"Article 301927"},"PeriodicalIF":2.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144749090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}