Hypervisor-based data synthesis: On its potential to tackle the curse of client-side agent remnants in forensic image generation

IF 2; Zone 4 (Medicine); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS

Forensic Science International-Digital Investigation Pub Date : 2024-03-01 DOI:10.1016/j.fsidi.2023.301690

Dennis Wolf, Thomas Göbel, Harald Baier

Volume 48, Article 301690

Citations: 0

Abstract



Hypervisor-based data synthesis: On its potential to tackle the curse of client-side agent remnants in forensic image generation

In the field of digital forensics, the number and heterogeneity of devices typically involved in an investigation are increasing. In order to train digital forensics practitioners and make faster progress in the development and validation of forensic tools, the demand for up-to-date data sets is high. However, manually creating data sets is a complex, tedious, and time-consuming task, increasing the need for automated solutions. Existing data generation frameworks typically use components that run directly on the simulated client (e.g., a client-side agent controlled via SSH). On the one hand, this facilitates simulation by providing direct feedback from the client and the ability to use client-side libraries to access software. On the other hand, however, this approach creates unintended traces in the generated data sets that quickly reveal their synthetic origin and affect their realism and thus their relevance. To avoid such traces, this paper presents a hypervisor-based solution to eliminate such a client-side software component in a recent digital forensic data set generator, while compensating for its absence only through host-side means. To demonstrate the practicability of the proposed approach as well as the indistinguishability of the generated traces, a multi-participant scenario is performed as a proof of concept to replicate a realistic attack scenario on a Linux system from a Kali attacker machine. During the evaluation, the generated data set is compared in terms of unintended traces and realism to a data set generated by the same framework using an agent component. In this way, we demonstrate the benefits and overall usefulness of an agent-less data synthesis approach.


Source journal

Forensic Science International-Digital Investigation Multiple-

CiteScore

5.90

Self-citation rate

15.00%

Articles published

Review time

76 days