PaSSw0rdVib3s!: AI-assisted password recognition for digital forensic investigations

IF 2.2 Zone 4 (Medicine) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS

Forensic Science International-Digital Investigation Pub Date : 2025-03-01 DOI:10.1016/j.fsidi.2025.301870

Romke van Dijk, Judith van de Wetering, Ranieri Argentini, Leonie Gorka, Anne Fleur van Luenen, Sieds Minnema, Edwin Rijgersberg, Mattijs Ugen, Zoltán Ádám Mann, Zeno Geradts

Volume 52, Article 301870. Full text: https://www.sciencedirect.com/science/article/pii/S2666281725000095

Citations: 0

Abstract

In digital forensic investigations, the ability to identify cleartext passwords within digital evidence is often essential for acquiring data from encrypted devices. Passwords may be stored in cleartext, knowingly or accidentally, in various locations within a device, e.g., in text messages, notes, or system log files. Finding those passwords is challenging, as devices typically contain a large amount and a wide variety of textual data. This paper explores the performance of several types of machine learning models trained to distinguish passwords from non-passwords and to rank candidate strings according to their likelihood of being a human-generated password. Three deep learning models (PassGPT, CodeBERT and DistilBERT) were fine-tuned, and two traditional machine learning models (a feature-based XGBoost and a TF-IDF-based XGBoost) were trained. These were compared to the existing state-of-the-art technology, a password recognition model based on probabilistic context-free grammars. Our research shows that the fine-tuned PassGPT model outperforms the other models. We show that a combination of multiple different types of training datasets, carefully chosen based on the context, is needed to achieve good results. In particular, it is important to train not only on dictionary words and leaked credentials, but also on data scraped from chats and websites. Our approach was evaluated on realistic hardware that could fit inside an investigator's workstation. The evaluation was conducted not only on the publicly available RockYou and MyHeritage leaks, but also on a dataset derived from real casework, showing that these innovations can indeed be used in a real forensic context.
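To make the idea of a feature-based ranker more concrete, the following is a minimal sketch of how candidate strings might be turned into simple features and sorted by a score. It is not the authors' implementation: the feature set, the function names (`char_class`, `extract_features`, `toy_score`, `rank_candidates`) and the scoring rule are all illustrative assumptions. In the setup the abstract describes, a trained model such as XGBoost or a fine-tuned PassGPT would take the place of the hand-written `toy_score`.

```python
import math
from collections import Counter


def char_class(ch: str) -> str:
    """Map a character to a coarse class: lower, upper, digit, or symbol."""
    if ch.islower():
        return "lower"
    if ch.isupper():
        return "upper"
    if ch.isdigit():
        return "digit"
    return "symbol"


def extract_features(s: str) -> dict:
    """Compute a few hand-picked features (an assumed, illustrative set)."""
    classes = [char_class(ch) for ch in s]
    counts = Counter(classes)
    n = len(s)
    # Shannon entropy of the character distribution, in bits per character.
    entropy = (
        -sum((c / n) * math.log2(c / n) for c in Counter(s).values()) if n else 0.0
    )
    # Number of adjacent character pairs whose classes differ.
    transitions = sum(1 for a, b in zip(classes, classes[1:]) if a != b)
    return {
        "length": n,
        "lower": counts["lower"],
        "upper": counts["upper"],
        "digit": counts["digit"],
        "symbol": counts["symbol"],
        "entropy": entropy,
        "transitions": transitions,
    }


def toy_score(s: str) -> float:
    """Hypothetical stand-in for a trained model: count distinct classes used."""
    f = extract_features(s)
    return float(sum(1 for k in ("lower", "upper", "digit", "symbol") if f[k] > 0))


def rank_candidates(candidates, score_fn=toy_score):
    """Return candidates sorted from most to least password-like."""
    return sorted(candidates, key=score_fn, reverse=True)


if __name__ == "__main__":
    for candidate in rank_candidates(["hello", "PaSSw0rdVib3s!", "the"]):
        print(candidate, toy_score(candidate))
```

A hand-written score like this is only a placeholder. The point of the sketch is the pipeline shape (string, then features, then score, then ranked list), which is where a learned model would plug in.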




Source journal